A new paper by Tessa Charlesworth, an assistant professor of management and organizations at the Kellogg School, explores that complex question by computationally analyzing language usage over 115 years, from 1900 to 2015. In particular, Charlesworth and her colleagues—Nishanth Sanjeev of New York University and Mark L. Hatzenbuehler and Mahzarin R. Banaji of Harvard University—looked at how the traits associated with various social groups have changed over time.

They investigated both the overt and implied meanings of these stereotypes. “I often give the analogy of an iceberg,” Charlesworth says. “There’s the tip of the iceberg that we can see above the water line. These are the actual words we use to describe different groups”—what the researchers call “manifest” content. “But then hiding under the surface of the water are the hidden meanings, like how positive or negative, or competent or incompetent, those words are”—the “latent” content.

Overall, the researchers discovered, group stereotypes have changed significantly in their manifest content, but their latent content has remained much more stable. For example, “you can think of some archetypal examples of how our stereotypes of Black Americans have changed over time, from lazy in the 1900s to helpless in the 1990s,” Charlesworth explains. “It’s a different word, but it’s got the same meaning of incompetence and negativity. We can think of similar examples with women—they used to be called hysterical; now they’re emotional.”

To Charlesworth, this pattern suggests that surface-level descriptors may change in meaningful ways, but deep-rooted feelings and beliefs are more stubborn.

“It’s a really interesting social phenomenon,” she says. “Society can reinvent itself and, on the surface, pretend to be changing and making progress—despite the fact that there are hidden messages that continue to persist.”

How word embeddings capture social biases

To understand how stereotypes both have and haven’t changed over time, the researchers used word embeddings. This type of computational text analysis represents each word as a point in a high-dimensional space, positioned according to the words it tends to appear alongside in a given body of text. Word embeddings allow scholars to measure the relatedness of two words based on how close together or far apart they sit. The technique allows a computer system that has no concept of what words mean to determine that dog is closer to cat than it is to refrigerator; related techniques also underpin systems like ChatGPT that generate human-sounding text.
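To make the idea of "closeness" concrete, here is a minimal sketch using cosine similarity, a common way to compare word vectors. The three-dimensional vectors are invented for illustration (real embeddings are learned from text and typically have hundreds of dimensions), and nothing here is taken from the researchers' own code.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, made up for this example; they are not learned from any corpus.
toy_embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "refrigerator": [0.1, 0.2, 0.9],
}

dog_cat = cosine_similarity(toy_embeddings["dog"], toy_embeddings["cat"])
dog_fridge = cosine_similarity(toy_embeddings["dog"], toy_embeddings["refrigerator"])

# A higher value means the two words sit closer together in the space.
print(f"dog vs. cat: {dog_cat:.2f}")
print(f"dog vs. refrigerator: {dog_fridge:.2f}")
```

With these toy numbers, dog scores as more similar to cat than to refrigerator, which is the kind of relationship a trained embedding would be expected to recover from real text.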

Previous research has shown that word embeddings also correlate with experimentally documented biases. For example, the Implicit Association Test demonstrates that people more quickly associate terms related to youth with pleasantness than unpleasantness (and vice versa for terms related to older age). Word embeddings show the very same patterns: terms related to youth sit closer to words like pleasant than words like unpleasant, while the opposite is true for terms related to older age.
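One simple way to express that pattern as a number is to take a word's average similarity to a set of pleasant words and subtract its average similarity to a set of unpleasant words. The sketch below does this with hand-built two-dimensional vectors that were constructed to mirror the pattern described above; they are not data. The function name `association` and the word lists are hypothetical, and the statistic is a simplified illustration rather than the exact measure used in the research.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(target, pleasant_words, unpleasant_words, embeddings):
    """Mean similarity to pleasant words minus mean similarity to unpleasant words.

    Positive values mean the target sits closer to the pleasant set.
    """
    def mean_similarity(words):
        sims = [cosine_similarity(embeddings[target], embeddings[w]) for w in words]
        return sum(sims) / len(sims)

    return mean_similarity(pleasant_words) - mean_similarity(unpleasant_words)

# Toy vectors, invented so that the example reproduces the described pattern.
toy_embeddings = {
    "young": [0.9, 0.1],
    "elderly": [0.1, 0.9],
    "pleasant": [1.0, 0.0],
    "wonderful": [0.8, 0.2],
    "unpleasant": [0.0, 1.0],
    "awful": [0.2, 0.8],
}

pleasant = ["pleasant", "wonderful"]
unpleasant = ["unpleasant", "awful"]

young_score = association("young", pleasant, unpleasant, toy_embeddings)
elderly_score = association("elderly", pleasant, unpleasant, toy_embeddings)
print(f"young: {young_score:+.2f}, elderly: {elderly_score:+.2f}")
```

A positive score indicates a lean toward the pleasant words and a negative score a lean toward the unpleasant ones.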

And because word embeddings can be quantified numerically, changes over time can be quantified numerically too—which is exactly what Charlesworth and her colleagues did.

By looking across time at the traits most closely associated with different social groups, as well as the latent meaning of those traits, the researchers could understand how stereotypes have changed—and how they haven’t.

A century of language change

To begin their study, Charlesworth and her colleagues amassed a huge trove of text that spanned 115 years and included both fiction and nonfiction works. Their data set included Google Books, The New York Times archive, and the Common Crawl, a massive repository scraped from the internet.

Then, they devised a list of social groups they wanted to study. The 72 groups they chose fell into four categories: sociodemographic (e.g., black, white, young, old, gay, straight), body-related (e.g., fat, thin, disabled, abled), mental-health-related (e.g., depressed, happy, bipolar), and occupational (e.g., employed, unemployed, educated, uneducated).

The researchers also created a list of synonyms for each group, making sure to include a range of historically specific terms. For example, the term schizophrenic was not popularized until the early 20th century, so they also included psychosis, which was commonly used to describe the same set of symptoms before that time.

Next, they compared how those 72 social groups (and their synonymous terms) were related to some 600 trait adjectives that have been widely used in other psychology research.

For each decade between 1900 and 2015, they determined which 10 trait words sat closest to each of the 72 groups. “Are they words like lazy and helpless, or are they words like warm and kind?” Charlesworth says.
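The ranking step can be sketched as a nearest-neighbor lookup repeated for each decade. The example below makes several assumptions that are not from the study: a group is represented by averaging the vectors of whichever of its terms appear in that decade's embedding, closeness is measured with cosine similarity, and the placeholder terms and two-dimensional vectors are invented. The helper names `mean_vector` and `top_traits` are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Average a list of equal-length vectors, dimension by dimension."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def top_traits(group_terms, trait_words, embeddings, k=10):
    """Return the k trait words closest to the averaged group vector."""
    # Assumption: skip group terms missing from this decade's vocabulary,
    # as can happen with historically specific synonyms.
    present = [embeddings[t] for t in group_terms if t in embeddings]
    if not present:
        raise ValueError("no group terms found in this embedding")
    group_vec = mean_vector(present)
    ranked = sorted(
        trait_words,
        key=lambda w: cosine_similarity(group_vec, embeddings[w]),
        reverse=True,
    )
    return ranked[:k]

# Invented per-decade embeddings with placeholder group terms; not real data.
embeddings_by_decade = {
    1900: {
        "group_term": [0.9, 0.1], "group_synonym": [0.8, 0.2],
        "lazy": [1.0, 0.0], "helpless": [0.6, 0.4],
        "warm": [0.1, 0.9], "kind": [0.2, 0.8],
    },
    1990: {
        "group_term": [0.6, 0.4],  # "group_synonym" is absent in this decade
        "lazy": [1.0, 0.0], "helpless": [0.6, 0.4],
        "warm": [0.1, 0.9], "kind": [0.2, 0.8],
    },
}

group_terms = ["group_term", "group_synonym"]
traits = ["lazy", "helpless", "warm", "kind"]

top_by_decade = {
    decade: top_traits(group_terms, traits, emb, k=2)
    for decade, emb in embeddings_by_decade.items()
}
for decade, words in top_by_decade.items():
    print(decade, words)
```

The study used the top 10 of roughly 600 traits; `k=2` is used here only because the toy trait list has four words.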

Finally, she and her colleagues assigned scores—numerical measurements of positivity, warmth, and competence—to the top 10 trait words associated with each group at each point in time. This allowed them to calculate how much the latent meaning of the stereotypes changed, in addition to the actual turnover of individual manifest words.
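The difference between manifest turnover and latent stability can be illustrated with two small calculations: how many top trait words two time points share, and how far the average score of those words moved. In this sketch the word lists and valence scores are hypothetical, and set overlap (the Jaccard index) is an assumed stand-in for whatever turnover measure the researchers actually used.

```python
def manifest_overlap(traits_a, traits_b):
    """Jaccard overlap between two trait lists: 1.0 is identical, 0.0 is disjoint."""
    set_a, set_b = set(traits_a), set(traits_b)
    return len(set_a & set_b) / len(set_a | set_b)

def mean_score(traits, scores):
    """Average latent score (for example, valence) of a list of trait words."""
    return sum(scores[t] for t in traits) / len(traits)

# Hypothetical valence ratings on a -1 (negative) to +1 (positive) scale.
valence = {"lazy": -0.7, "idle": -0.5, "helpless": -0.6, "dependent": -0.6}

# Hypothetical top traits for one group at two points in time.
early_traits = ["lazy", "idle"]
late_traits = ["helpless", "dependent"]

overlap = manifest_overlap(early_traits, late_traits)
latent_shift = mean_score(late_traits, valence) - mean_score(early_traits, valence)

print(f"shared manifest words (Jaccard): {overlap:.2f}")
print(f"change in mean valence: {abs(latent_shift):.2f}")
```

In this toy case the two word lists share nothing, yet their average valence is essentially unchanged, which is the shape of the pattern the article describes: new words, same underlying meaning.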

Stereotype continuity and change

Across all 72 groups, manifest stereotypes—that is, the actual language used—changed meaningfully, while latent content—the underlying associations—remained much more stable. However, Charlesworth notes, “it’s not the case that every single group is changing in manifest content and not changing in latent valence. There’s actually a lot of variability.”

For example, sociodemographic group stereotypes changed considerably more than body-related group stereotypes.

Charlesworth and her colleagues suspect this may have to do with how cohesive stereotypes are at any given point in time. For example, if novels, the New Yorker, and the National Review are all portraying the same group in the same way at the same time, there’s little room for stereotypes to change in the future. But if there’s more variability across sources, it suggests that the consensus view is breaking down.

And that’s essentially what the researchers found for body-related versus sociodemographic groups, Charlesworth explains: “Everyone’s saying the same kind of negativity about being fat or being disabled. And that kind of cohesiveness means that those stereotypes can remain uncontested.” Meanwhile, she adds, “there’s a little bit more variability in how we talk about sociodemographic groups, which can open the door to social change.”

How often a group was mentioned at all also emerged as a predictor of manifest (and to a lesser extent, latent) stereotype change. “Groups that we talk about more are also going to be groups that change more, because there are just more opportunities to intervene on attitudes that we talk about than on attitudes that we ignore,” Charlesworth says. “And maybe that’s our best source for intervention going forward. Groups that we can bring front of mind in activism and policy are going to be the ones we can move the needle on.”

Changing hearts (and words)

Charlesworth says the research sheds light on one of the most persistent paradoxes of modern life: despite immense progress for many marginalized social groups, profound inequities and biases remain.

“It resolves some of the ambiguity about how we can have both evidence of some change . . . and really persistent discrimination,” she says. “The words we’re using to describe these groups are changing, but under the surface, there are hierarchies that are just so persistent.”

Figuring out what can be done is an issue she hopes to address in future research. “How do you address the latent, underlying meaning of our group stereotypes? How do you disrupt the idea that we feel functional or even legitimate reasons to stigmatize groups?” she says. “That, I think, will be the key point of intervention, and the main open question.”