Skin of Color Underrepresented in Datasets Used by AI to Identify Skin Cancer

Jeff Craven

November 09, 2021

An analysis of open-access skin image datasets available to train machine learning algorithms to identify skin cancer has revealed that darker skin types are markedly underrepresented in the databases, researchers in the United Kingdom report.

Out of 106,950 skin lesions documented in 21 open-access databases and 17 open-access atlases identified by David Wen, BMBCh, from the University of Oxford, United Kingdom, and colleagues, 2436 images contained information on Fitzpatrick skin type. Of these, "only ten images were from individuals with Fitzpatrick skin type V, and only a single image was from an individual with Fitzpatrick skin type VI," the researchers said. "The ethnicity of these individuals was either Brazilian or unknown."

In two datasets containing 1585 images with ethnicity data, "no images were from individuals with an African, Afro-Caribbean, or South Asian background," Wen and colleagues noted. "Coupled with the geographical origins of datasets, there was massive under-representation of skin lesion images from darker skinned populations."

The results of their systematic review were presented at the National Cancer Research Institute (NCRI) Festival and published on November 9 in The Lancet Digital Health. To the best of their knowledge, they write, this is "the first systematic review of publicly available skin lesion images comprising predominantly dermoscopic and macroscopic images available through open access datasets and atlases."

Among the 14 datasets with information on country of origin, 11 (79%) were from North America, Europe, or Oceania, the researchers said. Dermoscopic images or macroscopic photographs were the only image types available in 19 of 21 datasets (91%). The clinical information available varied: 81,662 images (76.4%) had information on age, 82,848 (77.5%) on gender, and 79,561 (74.4%) on body site.
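For readers who want to check the arithmetic, the metadata coverage figures can be recomputed from the image counts reported in this article. The short Python sketch below is illustrative only: the variable and function names are assumptions made for this example and are not taken from the study, and the Fitzpatrick count is the 2436 images cited earlier in this article.

```python
# Counts as reported in this article; all names here are illustrative.
TOTAL_IMAGES = 106_950

images_with_field = {
    "age": 81_662,
    "gender": 82_848,
    "body site": 79_561,
    "Fitzpatrick skin type": 2_436,
}


def coverage_percent(count, total=TOTAL_IMAGES):
    """Share of all images carrying a given metadata field, as a percentage."""
    return round(100 * count / total, 1)


coverage = {field: coverage_percent(n) for field, n in images_with_field.items()}

for field, pct in coverage.items():
    print(f"{field}: {pct}% of images")
```

Run as written, this reproduces the reported 76.4%, 77.5%, and 74.4% for age, gender, and body site, and shows that roughly 2.3% of all images carried Fitzpatrick skin type information.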

The researchers explained that these datasets might be of limited use in real-world settings where the images are not representative of the patient population. Artificial intelligence (AI) programs trained on images of patients with one skin type, for example, could misdiagnose patients with another skin type, they said.

"AI programs hold a lot of potential for diagnosing skin cancer because it can look at pictures and quickly and cost-effectively evaluate any worrying spots on the skin," Wen said in a press release from the NCRI Festival. "However, it's important to know about the images and patients used to develop programs, as these influence which groups of people the programs will be most effective for in real-life settings. Research has shown that programs trained on images taken from people with lighter skin types only might not be as accurate for people with darker skin, and vice versa."

There was also "limited information on who, how and why the images were taken," Wen said in the release. "This has implications for the programs developed from these images, due to uncertainty around how they may perform in different groups of people, especially in those who aren't well represented in datasets, such as those with darker skin. This can potentially lead to the exclusion or even harm of these groups from AI technologies."

While there are no current guidelines for developing skin image datasets, quality standards are needed, according to the researchers.

"Ensuring equitable digital health includes building unbiased, representative datasets to ensure that the algorithms that are created benefit people of all backgrounds and skin types," they conclude in the study.

Neil Steven, MBBS, MA, PhD, FRCP, an NCRI Skin Group member who was not involved with the research, stated in the press release that the results from the study by Wen and colleagues "raise concerns about the ability of AI to assist in skin cancer diagnosis, especially in a global context."

"I hope this work will continue and help ensure that the progress we make in using AI in medicine will benefit all patients, recognising that human skin colour is highly diverse," said Steven, Honorary Consultant in Medical Oncology at University Hospitals Birmingham NHS Foundation Trust, United Kingdom.

"We Need More Images of Everybody"

Dermatologist Adewole Adamson, MD, MPP, assistant professor in the department of internal medicine (division of dermatology) at Dell Medical School at the University of Texas at Austin, said in an interview that a "major potential downside" of algorithms not trained on diverse datasets is the risk of incorrect diagnoses.

"The harms of algorithms used for diagnostic purposes in the skin can be particularly significant because of the scalability of this technology. A lot of thought needs to be put into how these algorithms are developed and tested," said Adamson, who reviewed the manuscript of The Lancet Digital Health study but was not involved with the research.

He referred to the results of a recently published study in JAMA Dermatology, which found that only 10% of studies used to develop or test deep learning algorithms contained metadata on skin tone. "Furthermore, most datasets are from countries where darker skin types are not represented. [These] algorithms therefore likely underperform on people of darker skin types and thus, users should be wary," Adamson said.

A consensus guideline should be developed for public AI algorithms, he said, which should have metadata containing information on sex, race/ethnicity, geographic location, skin type, and part of the body. "This distribution should also be reported in any publication of an algorithm so that users can see if the distribution of the population in the training data mirrors that of the population in which it is intended to be used," he said.
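The kind of check Adamson describes, comparing the makeup of the training data with that of the intended population, can be sketched in a few lines of Python. This is a minimal illustration, not a method from the study or from any guideline. The training counts are derived from the figures in this article (2436 images with Fitzpatrick data, of which 10 were type V and 1 was type VI, on the assumption that the rest were types I to IV). The target population shares are entirely hypothetical and are included only to show the calculation, and the function names are likewise invented for this example.

```python
def proportions(counts):
    """Convert raw category counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labelled images")
    return {category: n / total for category, n in counts.items()}


def representation_gap(train_counts, target_shares):
    """Training share minus target share for each category.

    A negative value means the category is underrepresented in the
    training data relative to the intended population.
    """
    train = proportions(train_counts)
    categories = sorted(set(train) | set(target_shares))
    return {c: train.get(c, 0.0) - target_shares.get(c, 0.0) for c in categories}


# Derived from the counts in this article (assumes the remainder is types I-IV).
train_counts = {"I-IV": 2425, "V": 10, "VI": 1}

# HYPOTHETICAL target population shares, for illustration only.
target_shares = {"I-IV": 0.70, "V": 0.20, "VI": 0.10}

gaps = representation_gap(train_counts, target_shares)
for category, gap in gaps.items():
    print(f"Fitzpatrick {category}: {gap:+.3f}")
```

Reporting a table like this alongside an algorithm would let users see at a glance which groups the training data covers poorly. The same comparison could be repeated for sex, geographic location, or body site, provided that metadata is recorded.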

Adam Friedman, MD, professor and chair of dermatology at George Washington University School of Medicine and Health Sciences, Washington, DC, who was not involved with the research, said that while the issue of underrepresentation has been known in dermatology for some time, the strength of the Lancet study is its size, with a message of "we need more images of everybody."

"This is probably the broadest study looking at every possible accessible resource and taking an organized approach," Friedman said in an interview. "But I think it also raises some important points about how we think about skin tones and how we refer to them as well with respect to misusing classification schemes that we currently have."

While using ethnicity data and certain Fitzpatrick skin types as a proxy for darker skin is a limitation of the metadata the study authors had available, it also highlights "a broader problem with respect to lexicon regarding skin tone," he explained.

"Skin does not have a race, it doesn't have an ethnicity," Friedman said.

A dataset that captures not only different skin tones but also how dermatologic conditions look across those tones is important, he noted. "If you just look at one photo of one skin tone, you missed the fact that clinical presentations can be so polymorphic, especially because of different skin tones," he said.

"We need to keep pushing this message to ensure that images keep getting collected. We [need to] ensure that there's quality control with these images and that we're disseminating them in a way that everyone has access, both from self-learning, but also to teach others," said Friedman, coeditor of a recently introduced dermatology atlas showing skin conditions in different skin tones.

Lancet Digit Health. Published online November 9, 2021.

Adamson reports no relevant financial relationships. Friedman is a coeditor of a dermatology atlas supported by Allergan Aesthetics and SkinBetter Science.

This study was funded by NHSX and the Health Foundation. Three authors reported being paid employees of Databiology at the time of the study. The other authors reported no relevant financial relationships.

Jeff Craven is an independent journalist living in Wilmington, Delaware.
