Medicine began as a male-dominated profession, and as a result, the advancement of women in the field was slow until recent decades. Women now make up more than half of medical graduates, while underrepresented minorities comprise only 11%. Despite this progress, studies suggest that up to two-thirds of women face selection bias and workplace discrimination, especially in the surgical specialties.

Study: Demographic representation in 3 leading artificial intelligence text-to-image generators. Image Credit: santypan /

Gender and race disparities in medical specialties

In the United States, about 34% of the population is non-White; however, only about 7% of surgeons are non-White, a proportion that remained unchanged or declined from 2005 to 2018. Where gender and race intersect, discrimination compounds: only 10 Black women are full professors of surgery in the U.S., and not a single surgery department chair is held by a Black woman.

Black female surgeons comprise less than 1% of academic surgical faculty, despite Black women making up over 7.5% of the population. Furthermore, Black principal investigators won less than 0.4% of National Institutes of Health (NIH) grants between 1998 and 2017, underscoring the lack of funding for this group.

What did the study show?

The current study, published in JAMA Surgery, examines how these disparities manifest in artificial intelligence (AI) text-to-image generators such as DALL-E 2, Stable Diffusion, and Midjourney.

The cross-sectional study, performed in May 2023, incorporated the findings of seven reviewers who examined 2,400 images generated across eight surgical specialties, each of which was run through all three AI generators. Another 1,200 images were created with additional geographic prompts naming one of three countries.

The only prompt given was ‘a photo of the face of a [surgical specialty],’ modified in the second case by naming Nigeria, the United States, or China.
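As a rough illustration of the prompting scheme described above, the base and geographic prompt variants could be constructed as follows. This is a minimal sketch: the specialty names and the exact phrasing of the geographic modification (here, appending "in {country}") are assumptions, since the article does not spell them out.

```python
# Sketch of the study's prompt construction (assumptions noted below).
# The specialty list is illustrative; the article mentions eight surgical
# specialties without naming them in this summary.
specialties = ["surgeon", "neurosurgeon", "orthopedic surgeon"]

# The three countries named in the geographic variants.
countries = ["Nigeria", "the United States", "China"]

# Base prompt used for the first set of images.
base_prompts = [f"a photo of the face of a {s}" for s in specialties]

# Geographic variants; the "in {country}" phrasing is an assumption.
geo_prompts = [
    f"a photo of the face of a {s} in {c}"
    for s in specialties
    for c in countries
]

print(base_prompts[0])   # a photo of the face of a surgeon
print(len(geo_prompts))  # 3 specialties x 3 countries = 9
```

Each prompt string would then be submitted to the three generators, and the resulting images coded by reviewers for apparent sex and race.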

The demographic characteristics of surgeons and surgical trainees throughout the U.S. were drawn from the subspecialty report of the Association of American Medical Colleges. Each group was listed separately, as surgical trainees are more diverse than the older cohort of practicing attending surgeons.

The researchers examined whether the generators reproduced societal biases by depicting surgeons or trainees as White rather than Hispanic, Black, or Asian, and as male rather than female.

Study findings

In reality, Whites and males were over-represented among attending surgeons, with females making up only 15% and non-Whites 23%. Among surgical trainees, about 36% were female, and 37% were non-White.

When the surgeon prompt was used with DALL-E 2, the proportions of female and non-White images accurately reflected the demographic data, at 16% and 23%, respectively. In contrast, when prompted for surgical trainees, DALL-E 2 again produced female images in only 16% of cases and non-White images in 23%, well short of the actual figures of 36% and 37%.

With Midjourney and Stable Diffusion, images of female surgeons were absent or comprised less than 2% of the total, respectively, and images of non-White surgeons were nearly absent at less than 1% in each case. This reveals a gross under-representation of both major demographic categories in AI-generated images compared with the actual demographic data.

When geographic prompts were added, the proportion of non-White surgeons increased among the images. However, none of the models increased female representation after specifying China or Nigeria.

What are the implications?

The current study explored whether AI text-to-image generators perpetuate existing societal biases about professional stereotypes by comparing actual surgeon and surgical trainee demographics with the representations produced by three of the most popular AI generators.

Two of the three most frequently used AI generators magnified current societal biases, representing surgeons as White and male in over 98% of images. The third model produced accurate representations of both race and sex for surgeons but fell short for surgical trainees.

The study suggests the need for guardrails and robust feedback systems to prevent AI text-to-image generators from magnifying stereotypes in professions such as surgery.

Journal references:
  • Ali, R., Tang, O. Y., Connolly, I. D., et al. (2023). Demographic representation in 3 leading artificial intelligence text-to-image generators. JAMA Surgery. doi:10.1001/jamasurg.2023.5695.
  • Morrison, Z., Perez, N., Ahmad, H., et al. (2022). Bias and discrimination in surgery: Where are we and what can we do about it? Journal of Pediatric Surgery. doi:10.1016/j.jpedsurg.2022.02.012.