A study published October 8 in Nature reports that when ChatGPT generated resumes for people with female names such as Allison Baker or Maria Garcia versus male names such as Matthew Owens or Joe Alvarez, it presented the female candidates as, on average, 1.6 years younger. That age gap compounds the model’s gender bias: it also tended to rate female applicants as less qualified than their male counterparts.
This skew toward younger women and older men does not match real-world demographics: U.S. Census data show that the ages of male and female employees are roughly comparable. What’s more, the age-gender bias emerged even in sectors where women typically skew older than men, such as sales and service roles.
While discrimination against older women in the workforce is a known issue, providing quantitative evidence has proven challenging, notes Danaé Metaxa, a computer scientist at the University of Pennsylvania who was not part of the study. The finding brings attention to pervasive “gendered ageism,” which can have detrimental effects. “It’s concerning for women to see themselves represented as having a life expectancy narrative that declines in their 30s or 40s,” they remarked.
The researchers used several methods to show how biased online content can skew AI outputs: analyzing nearly 1.4 million online images and videos, conducting text analyses and running a randomized controlled experiment. Together, these approaches documented the AI’s preference for certain demographic groups.
These insights may help explain the persistent glass ceiling for women in the workplace, says Douglas Guilbeault, a computational social scientist at Stanford University and coauthor of the study. Although many organizations have pursued greater diversity over the past decade, research indicates that men still hold the highest positions. “Companies trying to diversify tend to hire young women but do not adequately promote them,” Guilbeault said.
In the study, Guilbeault and his team had over 6,000 coders estimate the ages of people in online images drawn from platforms such as Google and Wikipedia, across numerous occupations, as well as in YouTube videos. The coders consistently judged the women pictured to be younger than the men. The bias was especially pronounced in prestigious roles such as doctors and CEOs, suggesting a societal perception that older men are more authoritative than older women.
The team also analyzed online text using nine different language models, to rule out visual explanations such as image filters or makeup. The analysis revealed that less prestigious job categories, like secretary or intern, were more frequently associated with younger women, whereas more prestigious roles, like chairman of the board or director of research, were typically linked to older men.
To test how these online distortions shape perceptions, the researchers ran an experiment with more than 450 participants. Those in the experimental group searched Google Images for photos of various occupations, uploaded the images to the researchers’ database, identified the people shown as male or female and estimated their ages. The control group uploaded random images instead.
The results showed that uploading the images shifted participants’ age estimates. Those who uploaded images of women in a given occupation estimated the average age of other women in that occupation to be two years younger than control participants did. Conversely, those who uploaded images of male employees guessed men’s average age to be more than half a year older.
The study also showed that AI models trained on vast online datasets inherit and amplify these age and gender biases. When the researchers prompted ChatGPT to create resumes for 54 occupations using 16 female and 16 male names, nearly 17,300 resumes in total, the model consistently produced younger, less experienced resumes for women, which also received lower overall scores than those for men.
These biases not only disadvantage women but also affect men; Guilbeault noted that resumes for young men were scored lower than those for their young female counterparts.
In a perspective piece accompanying the main article, sociologist Ana Macanovic of the European University Institute in Fiesole, Italy, warns that as AI usage grows, such biases are likely to intensify.
Companies like Google and OpenAI, the maker of ChatGPT, often aim to address one bias at a time, such as racism or sexism. Guilbeault argues, though, that this strategy overlooks how biases intersect, such as gender with age or race with class. For example, initiatives to improve the representation of Black individuals online must also consider the biases that intersect with race. If those are neglected, the online landscape may become saturated with depictions of affluent white people and impoverished Black people. “True discrimination arises from the convergence of various inequalities,” he concluded.