Mammography-Based Deep Learning Model Improves Breast Cancer Risk Prediction

By Will Boggs MD

May 21, 2019

NEW YORK (Reuters Health) - A mammography-based deep learning model is more accurate than established clinical models for predicting breast cancer risk, according to a retrospective study.

"We were surprised to find that a mammogram alone can determine breast cancer risk significantly more accurately than previous approaches that incorporate age, family history, and hormonal information," Adam Yala from Massachusetts Institute of Technology, Cambridge, Massachusetts, told Reuters Health by email. "This is an exciting finding, and it points that perhaps other routinely collected imaging can drive new insights into which diseases we may be at risk for in ways that prior approaches have not."

Mammographic breast density improves the accuracy of breast cancer risk models, but its use is limited by subjective assessment, variation across radiologists, and restricted data.

Yala and colleagues developed a deep learning (DL) breast cancer risk model based on full-field mammograms. In 80,243 mammographic examinations, they compared their new model to three other models: a logistic regression model based only on risk factors (RF-LR), the Tyrer-Cuzick (TC) model, and a hybrid that combined the DL model and the TC model.

In the full data set, the models' accuracy in distinguishing women who developed cancer within five years from those who did not, measured by the area under the receiver operating characteristic curve (AUC), was 0.62 for TC, 0.67 for RF-LR, 0.68 for image-only DL, and 0.70 for hybrid DL, according to the May 7 online report in Radiology.
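For readers unfamiliar with the metric, an AUC can be read as the probability that a model assigns a higher risk score to a randomly chosen woman who went on to develop cancer than to a randomly chosen woman who did not; 0.5 is chance and 1.0 is perfect separation. The following is a minimal sketch of that calculation, not the study's code. The function name `auc` and the six risk scores and outcomes are made up purely for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (case, non-case) pairs in which the case
    received the higher score, counting ties as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical risk scores and five-year outcomes (1 = developed cancer).
# These values are invented for the example and are not from the study.
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
labels = [0, 0, 1, 1, 0, 1]

print(round(auc(scores, labels), 3))
```

In this toy data, eight of the nine case/non-case pairs are ranked correctly, so the sketch reports an AUC of about 0.889. The pairwise loop is quadratic in the number of examinations, so a data set of the size described here would normally call for a rank-based implementation instead.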

Hybrid DL had a significantly higher AUC than TC and RF-LR, and image-only DL outperformed TC but not RF-LR.

Hybrid DL also performed best in placing women in the top and bottom deciles of breast cancer risk.

The hybrid DL model performed similarly well in white and African American women, in premenopausal and postmenopausal women, and in women with and without a family history of breast or ovarian cancer.

The DL models showed similar performance when mammograms from women in whom cancer was diagnosed in less than three years were excluded, suggesting that these models were able to learn features associated with long-term risk as well as early detection.

In confusion matrix analyses, a woman's risk as assessed by hybrid DL was more informative than her breast density category, and hybrid DL was more accurate and more informative than TC.

"We hope that our deep learning based risk models will support accurate personalized screening for earlier detection," Yala said. "We imagine that after running a mammogram through our model, doctors will be much more empowered to make accurate and informed decisions on when a patient should come back for her next mammogram, and if she should get an MRI. We also hope to reduce the anxiety for the majority of women who are told they have dense breasts, but are truly at an average or below average risk of breast cancer."

Dr. Arkadiusz Sitek from IBM Watson Health, Cambridge, Massachusetts, who co-authored an editorial related to this report, told Reuters Health by email, "It is actually very surprising that just based on mammography image, authors can create a better breast cancer risk prediction model than a model which uses traditional risk factors. This finding suggests that we may not fully exploit information contained in mammography images."

"In terms of applications of DL in breast cancer risk assessment, the risk obtained by DL is no different than risk obtained from breast density scores," he said. "It is just more accurate according to the current study. However, as we emphasize in the editorial, the lack of explainability is problematic, and physicians may be reluctant to use a risk assessment tool which they do not understand. The same goes for patients."

Dr. Sitek added, "One must wonder if data other than mammography can also provide more useful and actionable information about patients' health than we currently utilize in clinical practice."

Dr. James Mainprize, a medical physicist from University of Toronto, Ontario, Canada, who has researched the effect of masking in mammography and whether image-only models can predict which density patterns carry an increased risk of masking, told Reuters Health by email, "I think the most intriguing result is that the image-only model was nearly as good as the hybrid model (combining the classical risk factors such as age, family history, etc.). This may be particularly useful because the data isn't always available for the other risk factors (e.g., family history, genetic risk may be unknown)."

"Deep learning can be a powerful tool that can help in developing prognostic information from medical images that may be too subtle for the human eye," he said. "However, DL comes with the caveat that it is often difficult to understand 'why' DL is finding differences in the images -- what is the biological or physiological mechanism that relates to these subtle differences in image appearance?"

"The performance in the subgroup analysis is interesting (e.g., by race)," Dr. Mainprize said, "but as the authors point out, this is data from a single specific population (the catchment for a tertiary academic institution) and local differences in societal, environmental factors may play a strong role in the subgroups. Further analysis in other populations is warranted before definite conclusions can be made."

In order to validate the model across institutions and vendors, the researchers have made their trained model and code available for research at


Radiology 2019.