AI Outperforms Radiologists for Detecting Breast Cancer on Mammography

By Will Boggs MD

February 22, 2020

NEW YORK (Reuters Health) - An artificial intelligence (AI) algorithm outperforms radiologists for detecting breast cancer on mammography, according to a retrospective study.

If shown to work "in real practice, AI-CAD (computer-aided detection) can replace the role of second readers in double-reading settings or reduce radiologist workload to triage a portion of mammograms as cancer-free," Dr. Eun-Kyung Kim of Severance Hospital, Yonsei University College of Medicine, in Seoul, told Reuters Health by email.

As many as 30% of breast cancers are missed on mammography, owing to dense parenchyma obscuring lesions, poor positioning, perception error, interpretation error and other causes. Efforts to reduce these false negatives can lead to excessive recalls and biopsies.

Dr. Kim and colleagues from five institutions in South Korea, the U.S. and the UK developed and validated an AI algorithm to detect breast cancer on mammograms and explored whether it could improve the performance of radiologists.

On its own, AI had an overall diagnostic performance of 0.959 (by area under the receiver operating characteristic curve, or AUROC), with individual performances of 0.970 in the South Korea dataset, 0.953 in the U.S. dataset and 0.938 in the U.K. dataset.

In contrast, in a reader study the overall diagnostic performance of radiologists was 0.810, significantly lower than that of the AI algorithm (0.940), the researchers report in The Lancet Digital Health.

When general radiologists were aided by AI, their diagnostic performance improved significantly from 0.772 to 0.869. Improvements were even more noticeable in dense breasts.

Radiologists also showed improvements in both sensitivity and specificity for detecting pathologic features of breast cancer when helped by AI.

Of 160 breast cancers, 89% were detected by AI with an abnormality score of at least 0.1, compared with 76% detected by more than half of the radiologists.

AI detected more T1 cancers (91%) and more node-negative cancers (87%) than did more than half of the radiologist readers (74% and 74%, respectively).

"AI-CAD is ready for clinical use," Dr. Kim said. "However, studies within screening scenarios should be performed to validate previous findings and to evaluate the real effect of AI support in screening."

Dr. Nehmat Houssami of the University of Sydney, Australia, who has studied AI for breast-cancer screening, told Reuters Health by email, "The evidence does not yet justify using AI to read a mammogram in routine breast screening; while the accuracy of the AI reported by Kim et al outperformed experienced readers (which is very promising), you will note that where the AI is compared to humans, 50% of the mammograms showed a breast cancer. This does not reflect routine mammography screening (where breast cancer is present in <1%, so the challenge is most with normal screens)."

"So the critical evidence gap is whether the AI's performance will generalize (transfer) into real-world mammography interpretation (it is possible that when the AI is exposed to mammograms that are mostly normal, its performance will differ; it may deteriorate)," she said.
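Dr. Houssami's point about prevalence can be made concrete with a small calculation: for a fixed sensitivity and specificity, the positive predictive value of a flagged mammogram collapses as disease prevalence falls from an enriched 50% case-set to the roughly 1% seen in routine screening. The sensitivity and specificity figures below are illustrative assumptions, not values reported in the study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule:
    P(cancer | positive read) = TP / (TP + FP)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Illustrative (assumed) reader performance: 89% sensitivity, 90% specificity.
enriched = ppv(0.89, 0.90, prevalence=0.50)   # enriched case-set, as in the reader study
screening = ppv(0.89, 0.90, prevalence=0.01)  # routine screening population

print(f"PPV at 50% prevalence: {enriched:.1%}")   # ~89.9%
print(f"PPV at 1% prevalence:  {screening:.1%}")  # ~8.2%
```

With identical accuracy, roughly nine in ten positive calls are correct on the enriched set, but fewer than one in ten would be correct in a screening population, which is why performance on enriched case-sets may not transfer to routine screening.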

Dr. Ritse Mann of Radboud University Medical Centre, in Nijmegen, the Netherlands, recently reviewed the use of AI for mammography and digital breast tomosynthesis. He told Reuters Health, "In order to increase the benefit of AI for mammography reading, we must move beyond this and accept that the AI system alone is capable of making decent decisions. This is obviously by recalling women with lesions that are highly suspicious of malignancy according to the AI system, but more importantly, by accepting that a mammogram is negative when the AI system does not find a lesion. This will decrease the workload tremendously and hence will have a major impact on healthcare."

"The current study is not good enough to evaluate whether that is indeed feasible," he said in an email. "The effect should be evaluated on real screening populations with longitudinal follow up, rather than highly enriched case-sets; however, the findings are very promising."

Dr. Janine Katzen of Weill Cornell Medicine and New York-Presbyterian, in New York City, who recently reviewed CAD in mammography, said, "In the future, once validated, these AI algorithms will likely be employed in a way similar to CAD, though with greatly improved accuracy, which has the potential to improve both efficiency for radiologists and outcomes for our patients."

"Advances in AI and machine learning are exciting and have the potential for improvement in patient care," she told Reuters Health by email. "However, all of these tests need to undergo significant validation prior to clinical implementation."

"This study was performed utilizing digital mammography," Dr. Katzen added. "It does not appear that tomosynthesis was included. Given the increased utilization of tomosynthesis and associated synthesized two-dimensional imaging for screening mammography in this country, it would be important to see how AI and deep-learning algorithms perform with this modality."

The study was funded by Lunit, a company selling medical AI software. Three of Dr. Kim's coauthors are Lunit employees.

SOURCE: The Lancet Digital Health, online February 6, 2020.