Performance of Breast Cancer Risk-Assessment Models in a Large Mammography Cohort

Anne Marie McCarthy; Zoe Guan; Michaela Welch; Molly E. Griffin; Dorothy A. Sippo; Zhengyi Deng; Suzanne B. Coopey; Ahmet Acar; Alan Semine; Giovanni Parmigiani; Danielle Braun; Kevin S. Hughes


J Natl Cancer Inst. 2020;112(5):489-497. 


Abstract and Introduction


Background: Several breast cancer risk-assessment models exist. Few studies have evaluated predictive accuracy of multiple models in large screening populations.

Methods: We evaluated the performance of the BRCAPRO, Gail, Claus, Breast Cancer Surveillance Consortium (BCSC), and Tyrer-Cuzick models in predicting risk of breast cancer over 6 years among 35 921 women aged 40–84 years who underwent mammography screening at Newton-Wellesley Hospital from 2007 to 2009. We assessed model discrimination using the area under the receiver operating characteristic curve (AUC) and assessed calibration using the ratio of observed to expected (O/E) cases. We also calculated the square root of the Brier score and the positive and negative predictive values of each model.

Results: Our results confirmed the good calibration and comparable moderate discrimination of the BRCAPRO, Gail, Tyrer-Cuzick, and BCSC models. The Gail model had a slightly better O/E ratio and AUC (O/E = 0.98, 95% confidence interval [CI] = 0.91 to 1.06, AUC = 0.64, 95% CI = 0.61 to 0.65) compared with BRCAPRO (O/E = 0.94, 95% CI = 0.88 to 1.02, AUC = 0.61, 95% CI = 0.59 to 0.63) and Tyrer-Cuzick (version 8, O/E = 0.84, 95% CI = 0.79 to 0.91, AUC = 0.62, 95% CI = 0.60 to 0.64) in the full study population, and the BCSC model had the highest AUC among women with available breast density information (O/E = 0.97, 95% CI = 0.89 to 1.05, AUC = 0.64, 95% CI = 0.62 to 0.66). All models had poorer predictive accuracy for human epidermal growth factor receptor 2-positive and triple-negative breast cancers than for hormone receptor-positive, human epidermal growth factor receptor 2-negative breast cancers.

Conclusions: In a large cohort of patients undergoing mammography screening, existing risk prediction models had similar, moderate predictive accuracy and good calibration overall. Models that incorporate additional genetic and nongenetic risk factors and estimate risk of tumor subtypes may further improve breast cancer risk prediction.
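
The performance measures named in the Methods (O/E ratio, AUC, square root of the Brier score, and positive and negative predictive values) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the function names, the eight-woman toy data set, and the 0.20 risk threshold are hypothetical and are not taken from the study, and the sketch ignores censoring and competing risks, which a real 6-year risk evaluation would need to handle.

```python
from math import sqrt


def observed_expected_ratio(outcomes, risks):
    """Calibration: observed cases divided by the sum of predicted risks."""
    return sum(outcomes) / sum(risks)


def auc(outcomes, risks):
    """Discrimination: probability that a randomly chosen case has a higher
    predicted risk than a randomly chosen non-case (ties count as 0.5)."""
    cases = [r for y, r in zip(outcomes, risks) if y == 1]
    noncases = [r for y, r in zip(outcomes, risks) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in noncases)
    return wins / (len(cases) * len(noncases))


def root_brier_score(outcomes, risks):
    """Square root of the mean squared difference between risk and outcome."""
    mean_sq = sum((r - y) ** 2 for y, r in zip(outcomes, risks)) / len(outcomes)
    return sqrt(mean_sq)


def ppv_npv(outcomes, risks, threshold):
    """Predictive values when 'high risk' means risk >= threshold."""
    tp = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 0)
    tn = sum(1 for y, r in zip(outcomes, risks) if r < threshold and y == 0)
    fn = sum(1 for y, r in zip(outcomes, risks) if r < threshold and y == 1)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical toy data: 1 = developed breast cancer, 0 = did not.
outcomes = [1, 0, 1, 0, 0, 0, 0, 0]
# Hypothetical predicted risks (deliberately large so the arithmetic is easy).
risks = [0.30, 0.10, 0.10, 0.20, 0.05, 0.05, 0.10, 0.10]

oe = observed_expected_ratio(outcomes, risks)
auc_value = auc(outcomes, risks)
rbs = root_brier_score(outcomes, risks)
ppv, npv = ppv_npv(outcomes, risks, threshold=0.20)

print(f"O/E = {oe:.2f}, AUC = {auc_value:.2f}, sqrt(Brier) = {rbs:.2f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

An O/E ratio near 1 indicates that the model predicts about as many cases as actually occur, whereas the AUC depends only on how the predicted risks rank cases relative to non-cases. This is why a model can be well calibrated and still discriminate only moderately.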


Approximately 40 000 US women die from breast cancer annually.[1] Given the disease burden, identifying high-risk women before breast cancer develops remains a pressing goal, so they can consider more frequent screening with mammography and breast magnetic resonance imaging, genetic testing, and chemoprevention. Many breast cancer risk-assessment models have been developed.[2] Despite their abundance, risk models have not been widely implemented to guide screening decisions in routine clinical settings. This is partly due to lack of clarity on which risk model to use, the limited accuracy of risk models, and the time needed to perform risk assessment and interpret results.

Few studies have evaluated multiple breast cancer risk models simultaneously to compare their performance. Additionally, breast cancer subtypes defined by estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) have unique risk profiles,[3] but few studies have evaluated model performance for tumor subtypes. In a large population of women undergoing mammography, we evaluated five models that have been well validated and are most commonly used in clinical practice: the Gail model,[4–7] the BRCAPRO model,[8] the Breast Cancer Surveillance Consortium (BCSC) model,[9] the Claus model,[10–12] and the Tyrer-Cuzick (TC) model.[13] These models differ in the populations in which they were developed, the risk factors they include, and their treatment of family history (Table 1). The goal of the study was to determine which risk models are most appropriate for use at the time of screening mammography to guide personalized screening decisions.