Thursday, June 18, 2026

Study shows skewed dermatology data sets make models less accurate


It’s no secret that people with darker skin have long been underrepresented in dermatology. Most textbooks used to train future doctors are dominated by images of white patients, with few examples of melanoma and other diseases in patients with black and brown skin. As more companies build artificial intelligence tools to assess common skin conditions, a recent study found that such skewed data makes them less accurate on dark skin.

Researchers at the MIT Media Lab and Scale AI analyzed two widely used dermatology atlases: DermaAmin and Atlas Dermatologico. They labeled more than 16,500 images by Fitzpatrick skin type, a method for classifying skin pigmentation. Although the scale is not perfect, it is still useful for evaluating algorithmic fairness.

Most publicly available dermatology data sets do not contain any information about skin type, race, or ethnicity, which makes it difficult to quantify how skewed the data is. The researchers found that the two lightest skin types had 3.6 times more images than the two darkest skin types.
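To make that ratio concrete, here is a minimal sketch of how such a figure could be computed once each image carries a Fitzpatrick label from 1 (lightest) to 6 (darkest). The function name `light_to_dark_ratio` and the toy labels are assumptions made for illustration; they are not the researchers' code or data.

```python
from collections import Counter


def light_to_dark_ratio(fitzpatrick_labels):
    """Return images in types 1-2 divided by images in types 5-6."""
    counts = Counter(fitzpatrick_labels)
    lightest = counts[1] + counts[2]
    darkest = counts[5] + counts[6]
    if darkest == 0:
        raise ValueError("no images labeled with Fitzpatrick types 5 or 6")
    return lightest / darkest


# Made-up labels for illustration only; this is not the study's data.
toy_labels = [1, 1, 2, 2, 2, 2, 3, 4, 5, 6]
ratio = light_to_dark_ratio(toy_labels)
print(f"lightest-to-darkest image ratio: {ratio:.1f}")
```

A ratio well above 1 would indicate the kind of skew the researchers describe.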

When looking at specific skin conditions, the difference is even more pronounced. Although all 114 skin conditions were represented in the three lightest skin types, only 89 were represented in the darkest skin types.

Alexandr Wang, CEO of Scale AI, said the results show that images of dark skin are underrepresented in online dermatology atlases. If neural networks are trained only on this data, their results may end up being inaccurate.

“Popular dermatology atlases and datasets contain more images of people with lighter skin than darker skin, so models trained with these atlases are most accurate on lighter skin types,” he wrote in an email, adding that a model’s accuracy falls the further a skin type is from those present in its training data.

The researchers demonstrated this by training a model on the data set to classify different skin conditions. They found significant differences in the model’s ability to correctly diagnose skin conditions on darker skin. They shared their results through the Computer Vision Foundation.
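A comparison of this kind usually comes down to scoring the same model separately for each skin type. The sketch below shows one simple way to do that. The helper `accuracy_by_skin_type` and the toy predictions are hypothetical, and the study's own evaluation may have differed.

```python
from collections import defaultdict


def accuracy_by_skin_type(y_true, y_pred, skin_types):
    """Return the fraction of correct predictions for each skin type."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, skin in zip(y_true, y_pred, skin_types):
        total[skin] += 1
        correct[skin] += int(truth == pred)
    return {skin: correct[skin] / total[skin] for skin in sorted(total)}


# Made-up example for illustration only; this is not the study's data.
y_true = ["eczema", "psoriasis", "eczema", "acne", "acne", "psoriasis"]
y_pred = ["eczema", "psoriasis", "eczema", "acne", "eczema", "psoriasis"]
skin_types = [1, 1, 2, 2, 6, 6]

per_type = accuracy_by_skin_type(y_true, y_pred, skin_types)
for skin, acc in per_type.items():
    print(f"Fitzpatrick type {skin}: {acc:.0%} correct")
```

Reporting accuracy per group, instead of a single overall number, is what makes a gap on darker skin visible.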

The results could have a significant impact on companies developing tools to help patients or primary care physicians identify different skin conditions. For example, Google recently announced plans for a consumer-facing symptom checker, where people can upload images of their skin and view the three most likely results. But in published research on its AI model, the darkest skin type was not represented in the data set at all.

Photo credit: Andrii Shyp, Getty Images


