Effects of Radiotherapy in Early-Stage, Low-Recurrence Risk, Hormone-Sensitive Breast Cancer

Jinani Jayasekera; Clyde B. Schechter; Joseph A. Sparano; Reshma Jagsi; Julia White; Judith-Anne W. Chapman; Timothy Whelan; Stewart J. Anderson; Anthony W. Fyles; Willi Sauerbrei; Richard C. Zellars; Yisheng Li; Juhee Song; Xuelin Huang; Thomas B. Julian; George Luta; Donald A. Berry; Eric J. Feuer; Jeanne Mandelblatt; for the CISNET-BOLD Collaborative Group


J Natl Cancer Inst. 2018;110(12):1370-1379. 



Clinico-pathological Estimation of Oncotype DX® Scores

The Oncotype DX® Recurrence Score is to date the most well-established and well-studied risk assessment tool available. It is widely and routinely used in the United States to make clinical determinations regarding the use of adjuvant systemic chemotherapy in hormone receptor-positive, node-negative patients with early stage disease.

Breast cancer clinicopathological features can be used to guide treatment based on prognosis. Gene-expression profile tests like Oncotype DX® provide information that is complementary to these clinical features,(1) and have been validated to predict loco-regional and distant breast cancer recurrences.(2, 3) Combining patient, tumor, and gene-expression profile test results gives better prognostic ability than any single source.(1, 4) Consequently, many modern clinical trials include Oncotype DX® and other clinicopathological factors in eligibility criteria and/or analyses. Unfortunately, since many older clinical trials do not have Oncotype DX® data, it is difficult to conduct meta-analyses or modeling of potential trials examining treatment effects and interactions of Oncotype DX® scores with treatment results.

We collaborated with the NCI Breast Oncology and Local Disease (BOLD) Committee to simulate a proposed trial evaluating the impact of omitting radiotherapy on recurrence and survival endpoints among biologically low recurrence-risk patients. The research included two specific aims:

  1. Conduct a pooled analysis of past trials considering Oncotype DX®, and

  2. Simulate the proposed trial.

To complete these aims, we used individual, de-identified data from seven clinical trials to examine the effects of radiotherapy on recurrence events among patients with stage I, hormone receptor-positive, HER2-negative tumors with Oncotype DX® scores ≤ 18. Since Oncotype DX® data were missing from six of the seven trials, we developed and evaluated a method to impute missing Oncotype DX® scores and determine the effect of radiation among low-risk breast cancer patients. The algorithm and methods used to impute Oncotype DX® scores are intended for use in analyses of effects and endpoints at a cohort-level, and are not applicable for predicting outcomes of individual patients. We assumed that the Oncotype DX® scores in the six trials were missing at random (MAR), where the probability that data are missing does not depend on unobserved data but may depend on observed characteristics.(5)

The Oncotype DX® score imputation was based on the distribution of Oncotype DX® scores in the population-based linked Surveillance, Epidemiology, and End Results-Genomic Health Inc. (SEER-GHI) registry, and a deterministic regression-based multiple imputation approach.(6-9) We also obtained a separate proprietary clinical trial dataset from Genomic Health Inc. with Oncotype DX® scores (NSABP-GHI data).(3) This proprietary dataset was not included in the pooled clinical trial dataset. It was only used for external validation of the Oncotype DX® score imputation model.

The statistical technique we employed to impute Oncotype DX® scores was based on the associations between Oncotype DX® scores, age, radiation and clinicopathological features of breast cancer patients.(1, 4) First, we combined the SEER-GHI data set (with non-missing Oncotype DX® scores) with the pooled clinical trial dataset and compared the distribution of observed characteristics between the SEER-GHI and pooled clinical trial data (Supplementary Table 2). The distributions of characteristics were similar. In both datasets, the majority of patients had ER+ and PR+ breast cancer and good/intermediate grade tumors with an average tumor size of around 1 cm. Women recruited for clinical trials were somewhat younger than the women in the population-based SEER-GHI dataset.

Next, we fitted a negative binomial generalized linear model for Oncotype DX® scores conditional on age, tumor size, tumor grade, ER/PR status, radiation and HER2 status (negative vs. missing/unknown) among women diagnosed with stage I, hormone receptor-positive (ER+ and/or PR+), HER2-negative (or unknown), node-negative breast cancer (pathologically determined) who had undergone breast-conserving surgery. We examined the goodness-of-fit of this imputation model by calculating deviance residuals as recommended by McCullagh and Nelder (1989).(10) Deviance residuals should be approximately normally distributed if the model is correct. As shown in Supplementary Figure 1, the deviance residuals were approximately normally distributed, which is consistent with an adequate model fit.
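As an illustration of the goodness-of-fit check described above, the following minimal sketch computes deviance residuals for a negative binomial model with a fixed dispersion parameter (the NB2 parameterization, variance = mu + alpha * mu^2). The function name, the dispersion value, and the observed and fitted values are hypothetical and are not taken from the study; the authors' actual software and parameterization may differ.

```python
import math

def nb_deviance_residual(y, mu, alpha):
    """Deviance residual for one observation under an NB2 model.

    y: observed count, mu: fitted mean (> 0), alpha: dispersion (> 0).
    """
    inv_alpha = 1.0 / alpha
    # The y * ln(y / mu) term is taken as 0 when y == 0 (its limiting value).
    first = y * math.log(y / mu) if y > 0 else 0.0
    second = (y + inv_alpha) * math.log((y + inv_alpha) / (mu + inv_alpha))
    unit_deviance = 2.0 * (first - second)
    # Guard against tiny negative values caused by floating-point rounding.
    unit_deviance = max(unit_deviance, 0.0)
    # The residual carries the sign of (observed - fitted).
    return math.copysign(math.sqrt(unit_deviance), y - mu)

# Hypothetical observed scores and fitted means, for illustration only.
observed = [5, 12, 18, 25, 9]
fitted = [8.0, 11.0, 15.0, 20.0, 9.0]
residuals = [nb_deviance_residual(y, m, alpha=0.25)
             for y, m in zip(observed, fitted)]
```

In practice the residuals from the fitted model would then be inspected with a histogram or a normal quantile plot, as in Supplementary Figure 1.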

We employed a simulation-based procedure(7) to impute Oncotype DX® scores multiple (50) times for each woman based on the imputation model. This dataset was analyzed using the STATA version 14 'mi estimate' command, which adjusts coefficients and standard errors for variability between imputations according to the combination rules outlined by Rubin (1987).(8) We randomly selected a sample and compared the distribution of the imputed Oncotype DX® scores with actual scores in SEER-GHI data. The distributions of the imputed scores in the clinical dataset and actual scores in the SEER-GHI dataset were similar among patients with stage I, ER+ and/or PR+, HER2-negative breast cancer who had undergone breast-conserving surgery (Supplementary Table 3). Hence, we infer that the Oncotype DX® score imputation provides good cohort-level predictive validity.
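The combination rules referred to above can be sketched as follows. This is a minimal illustration of Rubin's rules for a single coefficient, not a reproduction of the STATA routine; the function name and the five example estimates are hypothetical.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the m estimates.
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

# Hypothetical coefficients and standard errors from 5 imputed datasets.
example_estimates = [0.50, 0.60, 0.55, 0.45, 0.65]
example_std_errors = [0.30, 0.30, 0.30, 0.30, 0.30]
pooled_estimate, pooled_se = pool_rubin(example_estimates, example_std_errors)
```

The pooled standard error is larger than any single-imputation standard error because the between-imputation term carries the extra uncertainty due to the missing scores.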

Since the proposed trial eligibility only included Oncotype DX® scores of 18 or less, we used an estimation method that allowed the estimation sample to vary across imputations. We examined the variance information, including within-imputation and between-imputation variances, increase in relative variance due to missing data, fraction of missing information (FMI) (i.e. the ratio of information lost due to the missing data to the total information that would be present if there were no missing data) and relative efficiencies for using 50 imputations versus the theoretically optimal infinite number of imputations. Within-imputation and between-imputation variances for radiation were small (0.08 and 0.03 respectively), the relative variance increase due to missing data was 0.4, FMI was 0.3, and relative efficiency was one. FMI for Oncotype DX® score (45%) was less than the percentage of overall missing data in the pooled data set (66%) indicating that the imputation could reduce bias due to missing data, and handle missing information to provide reasonable statistical inference.
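The reported diagnostics can be checked against the standard multiple-imputation formulas. The sketch below plugs in the (rounded) within- and between-imputation variances reported for radiation with 50 imputations; the function name is hypothetical, and the large-sample degrees-of-freedom expression is an assumption about which variant the software used.

```python
def mi_variance_diagnostics(within, between, m):
    """Relative variance increase, FMI, and relative efficiency for m imputations."""
    # Relative increase in variance due to missing data.
    riv = (1 + 1 / m) * between / within
    # Large-sample degrees of freedom used in the FMI adjustment (assumed variant).
    df = (m - 1) * (1 + 1 / riv) ** 2
    # Fraction of missing information.
    fmi = (riv + 2 / (df + 3)) / (1 + riv)
    # Efficiency relative to an infinite number of imputations.
    rel_eff = 1 / (1 + fmi / m)
    return riv, fmi, rel_eff

riv, fmi, rel_eff = mi_variance_diagnostics(within=0.08, between=0.03, m=50)
```

With these inputs the relative variance increase is about 0.38, the FMI about 0.28, and the relative efficiency about 0.99, which agree with the reported 0.4, 0.3, and one after rounding.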

Imputation Validation

The imputation model developed for Oncotype DX® scores was evaluated using both a split-sample and an external validation. For the split-sample validation, the SEER-GHI linked dataset with non-missing recurrence scores was randomly split into a derivation sample (N=13,569) and a validation sample (N=13,464). The derivation sample was used to fit a generalized linear model for Oncotype DX® scores conditional upon age, ER/PR status, HER2 status, radiation, tumor size, and tumor grade as described above. Oncotype DX® scores in the validation sample were estimated using the coefficients derived from the derivation sample. Summary statistics and histograms were used to compare the distributions of the clinicopathologically estimated and actual Oncotype DX® scores in the SEER-GHI validation sample (Supplementary Table 4). The summary statistics and histograms illustrate the similarities in the estimated and actual Oncotype DX® scores at the population-level.
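The split-sample procedure can be sketched as below. For brevity, this illustration swaps the multivariable generalized linear model for a one-covariate least-squares line and uses synthetic data; the variable names, the seed, and the data-generating values are all hypothetical.

```python
import random
from statistics import mean

def split_sample(records, seed=0):
    """Randomly split records into derivation and validation halves."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def fit_line(pairs):
    """Least-squares intercept and slope for (x, y) pairs."""
    x_bar = mean([x for x, _ in pairs])
    y_bar = mean([y for _, y in pairs])
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Synthetic (tumor size, score) pairs standing in for registry records.
data_rng = random.Random(42)
sizes = [data_rng.uniform(0.2, 2.0) for _ in range(1000)]
records = [(s, 10 + 4 * s + data_rng.gauss(0, 3)) for s in sizes]

derivation, validation = split_sample(records)
intercept, slope = fit_line(derivation)

# Apply the derivation-sample coefficients to the validation sample.
predicted = [intercept + slope * x for x, _ in validation]
actual = [y for _, y in validation]
mean_gap = mean(predicted) - mean(actual)
```

The comparison is made between distributions (here only the means) in the validation half, matching the cohort-level intent of the imputation rather than patient-level prediction.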

For external validation of the algorithm used to impute Oncotype DX® scores in clinical trial data, we applied the model developed in SEER-GHI data (a population-based registry) to the proprietary NSABP-GHI clinical trial dataset and evaluated the concordance between the predicted and actual Oncotype DX® scores in NSABP-GHI clinical trial data using Pearson's correlation coefficient and the Kappa statistic. The Pearson correlation coefficient between the estimated and actual scores in the NSABP-GHI data was high (r=.70, p<.001). The concordance between the two scores for score categories was also very good (kappa=.71) (Supplementary Table 5). These statistics indicate good overall concordance. As shown in Supplementary Table 5, among patients with actual Oncotype DX® scores in the 0–18 category, 71% of women had a predicted score in this same score category; prediction was slightly lower in the 19+ category.
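The two concordance measures named above can be computed as in this minimal sketch. The example scores are hypothetical; only the 0–18 and 19+ categories come from the text, and Cohen's unweighted kappa is assumed because the text does not state which kappa variant was used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def cohen_kappa(labels_a, labels_b):
    """Cohen's unweighted kappa for two lists of category labels."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal proportions.
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

def categorize(score):
    """Map a score to the two categories used in the text."""
    return "0-18" if score <= 18 else "19+"

# Hypothetical actual and predicted scores, for illustration only.
actual_scores = [5, 12, 18, 25, 30, 9, 22, 15]
predicted_scores = [8, 10, 20, 27, 24, 11, 17, 14]

r = pearson_r(actual_scores, predicted_scores)
kappa = cohen_kappa([categorize(s) for s in actual_scores],
                    [categorize(s) for s in predicted_scores])
```

Correlation is computed on the continuous scores, whereas kappa is computed only after the scores are collapsed into categories, so the two statistics answer different questions about agreement.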

Finally, while the purpose of the Oncotype DX® score estimation was to match the overall distribution of Oncotype DX® values, rather than match actual individual scores to predicted scores based on individual patient characteristics, we plotted the predicted vs. actual individual values for the NSABP-GHI dataset (Supplementary Figure 2). The results show that there is a moderately strong positive linear relationship between predicted and actual Oncotype DX® scores in the NSABP-GHI clinical trial data.


The Oncotype DX® score imputation assumes that differences in the distribution of missing Oncotype DX® scores in the pooled clinical trial data given the Oncotype DX® information in SEER-GHI and the observed covariates in both datasets were ignorable. The similarities in the distribution of observed characteristics support the choice of imputation model and the 'missing at random' assumption implicit in the model. However, we acknowledge that there may be systematic differences between women receiving vs. not receiving the Oncotype DX® test in SEER-GHI compared to those included in clinical trials. Therefore, while the imputation was robust, and there were similarities in the distribution of observed characteristics, it may not have removed all bias due to missing data. Women receiving Oncotype DX® testing in SEER-GHI data would include mostly women with low and intermediate risk scores.(6, 9) Since women in the NSABP trials were selected to participate in the trials without knowledge of Oncotype DX®, they may have had higher or lower scores than the source population from which they were drawn. Hence, it is not possible to infer the impact of the Oncotype DX® score imputation on misclassification and bias towards or away from the null. The imputed scores are useful to determine robust trial-wide endpoints (and their uncertainty).


Clinicopathologically estimated Oncotype DX® scores reproduce population-level distributions, and can be considered for use to evaluate trial endpoints at the cohort-level that are conditional on Oncotype DX® scores.

Propensity Score Analyses (11)

Model equation for the logistic regression (for the propensity score estimation)

$$
\begin{aligned}
\ln(o) = \ln\!\left(\frac{p}{1-p}\right) ={}& \beta_0 + \beta\,(\text{age categories}) + \lambda\,(\text{grade}) + \rho\,(\text{radiation randomized}) \\
&+ \pi\,(\text{ER/PR status}) + \psi\,(\text{tumor size}) + \alpha\,(\text{predicted Oncotype DX® score}) \\
&+ \Omega_1\,(\text{tumor size} \times \text{radiation randomized}) + \Omega_2\,(\text{age} \times \text{radiation randomized}) \\
&+ \Omega_3\,(\text{grade} \times \text{radiation randomized}) + \Omega_4\,(\text{ER+/PR+ status} \times \text{radiation randomized}) + \varepsilon
\end{aligned}
$$

$o$ = odds of receiving radiation,

$p$ = proportion of women receiving radiation in the sample,

$\beta_0$ = the intercept,

$\beta, \lambda, \rho, \pi, \psi, \alpha$, and $\Omega_1$ to $\Omega_4$ = the coefficients (slopes) associated with the variables belonging to each category shown in the table below,

$\varepsilon$ = random error
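To make the link between the model's linear predictor and the propensity score concrete, the sketch below inverts the log-odds to obtain a probability and converts a reported odds ratio into its coefficient on the log-odds scale. The function names are hypothetical; the only value taken from the source is the tumor size odds ratio of 2.10.

```python
import math

def propensity_score(linear_predictor):
    """Inverse logit: probability of receiving radiation given ln(odds)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

def log_odds(p):
    """Logit: ln(p / (1 - p)) for a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient corresponding to an odds ratio."""
    return math.log(odds_ratio)

# Coefficient implied by the reported tumor size odds ratio of 2.10.
tumor_size_coefficient = coefficient_from_odds_ratio(2.10)
```

A linear predictor of zero corresponds to even odds, so `propensity_score(0.0)` is 0.5, and an odds ratio above one gives a positive coefficient.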

Multivariable logistic regression model used for the propensity score estimation

| Variable | Point estimate (odds ratio) | SE | p-value |
|---|---|---|---|
| Age at randomization | 0.99 | 0.02 | 0.38 |
| Clinical characteristics (baseline) | | | |
| Grade | | | |
| Good | 0.08 | 0.06 | 0.00 |
| Intermediate/Poor | 0.10 | 0.07 | 0.00 |
| Unknown | Reference | | |
| ER/PR status | | | |
| ER+/PR+ | 0.18 | 0.18 | 0.09 |
| Other | Reference | | |
| Tumor size | 2.10 | 0.77 | 0.04 |
| Radiation randomized | 0.07 | 0.08 | 0.02 |
| Tumor size*radiation randomized | 0.43 | 0.17 | 0.03 |
| Age*radiation randomized | 1.01 | 0.02 | 0.51 |
| Good grade*radiation randomized | 11.12 | 8.59 | 0.00 |
| Intermediate/Poor grade*radiation randomized | 9.17 | 6.86 | 0.00 |
| ER+/PR+*radiation randomized | 6.11 | 6.27 | 0.08 |
| c-statistic | 0.83 | | |
| Hosmer and Lemeshow test | | | |
| Hosmer-Lemeshow chi-square | 5.26 | | |
| Degrees of freedom | 8 | | |
| p-value | 0.73 | | |
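The Hosmer and Lemeshow p-value can be checked from the reported chi-square statistic and degrees of freedom. The sketch below uses the closed-form chi-square upper-tail probability that holds when the degrees of freedom are even; the function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    # exp(-x/2) times the truncated series of (x/2)^k / k!, k = 0 .. df/2 - 1.
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series

hosmer_lemeshow_p = chi2_sf_even_df(5.26, 8)
```

The result is approximately 0.73, matching the reported p-value; a large p-value here indicates no evidence of poor calibration of the propensity model.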

Balance Diagnostics: Comparison of individual characteristics and standardized differences in the weighted sample

| Characteristic | Radiation therapy (N(b)=1,817): mean or %(a) | No radiation therapy (N(b)=573): mean or %(a) | p-value(c) | Std. diff. (%) |
|---|---|---|---|---|
| Age at randomization (years) | 58 | 55 | - | - |
| Grade | | | | |
| Good | 31% | 37% | 0.43 | -12.6 |
| Intermediate/Poor | 67% | 59% | 0.32 | 15.4 |
| Unknown | 2% | 4% | 0.59 | -8.9 |
| ER/PR status | | | | |
| ER+/PR+ | 95% | 97% | 0.29 | -14.6 |
| HER2 negative | 100% | 100% | 1.00 | 0.0 |
| Tumor size | 1.4 | 1.3 | 0.27 | 13.4 |
| Oncotype DX® score | 7.0 | 7.3 | 0.60 | 7.7 |

  a. % is calculated as the ratio of the frequency count for a single cell to the total count for the column that contains the cell in the weighted sample. The ratio is presented as a percentage. Age, tumor size and predicted Oncotype DX® scores are presented as weighted means.

  b. N represents the number of total observations in each group.

  c. F-test (or chi-square test, for categorical variables) for the significance of the difference between treated and untreated subjects.
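For readers unfamiliar with standardized differences, the sketch below shows the commonly used definitions for binary and continuous covariates, expressed as percentages. These formulas are an assumption about how the balance diagnostics were computed; the source does not state the exact formula, and the function names are hypothetical.

```python
import math

def std_diff_binary(p_treated, p_control):
    """Standardized difference (%) for a binary covariate."""
    pooled_sd = math.sqrt((p_treated * (1 - p_treated)
                           + p_control * (1 - p_control)) / 2)
    if pooled_sd == 0:
        # Identical, degenerate proportions (for example 100% vs 100%).
        return 0.0
    return 100 * (p_treated - p_control) / pooled_sd

def std_diff_continuous(mean_treated, mean_control, var_treated, var_control):
    """Standardized difference (%) for a continuous covariate."""
    pooled_sd = math.sqrt((var_treated + var_control) / 2)
    return 100 * (mean_treated - mean_control) / pooled_sd

# Rounded proportions of good-grade tumors from the balance table.
good_grade_diff = std_diff_binary(0.31, 0.37)
```

Using the rounded proportions 31% and 37% gives about -12.7, close to the reported -12.6. Exact agreement is not expected for every row, because the published values were computed in the weighted sample from unrounded quantities.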


References

  1. U.S. Bureau of Labor Statistics. May 2016 National Occupational Employment and Wage Estimates United States. Washington, DC: U.S. Bureau of Labor Statistics. Occupational Employment Statistics. March 2017. https://www.bls.gov/oes/2016/may/oes_nat.htm Accessed December 18, 2017.

  2. Mamounas EP, Tang G, Fisher B, et al. Association Between the 21-Gene Recurrence Score Assay and Risk of Locoregional Recurrence in Node-Negative, Estrogen Receptor–Positive Breast Cancer: Results From NSABP B-14 and NSABP B-20. J Clin Oncol 2010;28(10):1677–1683.

  3. Paik S, Shak S, Tang G, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004;351(27):2817–26.

  4. Red Book Online. Greenwood Village, CO: Truven Health Analytics Inc. 2015.

  5. Little RJ, Rubin DB. Statistical analysis with missing data: John Wiley & Sons; 2014.

  6. Petkov VI, Miller DP, Howlader N, et al. Breast-cancer-specific mortality in patients treated based on the 21-gene assay: a SEER population-based study. NPJ Breast Cancer 2016;2:16017.

  7. Maciosek MV, Xu X, Butani AL, et al. Smoking-attributable medical expenditures by age, sex, and smoking status estimated using a relative risk approach. Prev Med 2015;77:162–7.

  8. Rubin DB. Multiple imputation for nonresponse in surveys: John Wiley & Sons; 2004.

  9. Mahadevia PJ, Fleisher LA, Frick KD, et al. Lung cancer screening with helical computed tomography in older adult smokers: a decision and cost-effectiveness analysis. JAMA 2003;289(3):313–22.

  10. McCullagh P, Nelder JA. Generalized Linear Models. 2nd ed. London: Chapman & Hall/CRC; 1989.

  11. Yao XI, Wang X, Speicher PJ, et al. Reporting and Guidelines in Propensity Score Analysis: A Systematic Review of Cancer and Cancer Surgical Studies. J Natl Cancer Inst 2017;109(8).

  12. Hughes KS, Schnaper LA, Berry D, et al. Lumpectomy plus tamoxifen with or without irradiation in women 70 years of age or older with early breast cancer. N Engl J Med 2004;351(10):971–7.

  13. Hughes KS, Schnaper LA, Bellon JR, et al. Lumpectomy plus tamoxifen with or without irradiation in women age 70 years or older with early breast cancer: long-term follow-up of CALGB 9343. J Clin Oncol 2013;31(19):2382–7.

  14. Fisher B, Bryant J, Dignam JJ, et al. Tamoxifen, radiation therapy, or both for prevention of ipsilateral breast tumor recurrence after lumpectomy in women with invasive breast cancers of one centimeter or less. J Clin Oncol 2002;20(20):4141–9.

  15. Fyles AW, McCready DR, Manchul LA, et al. Tamoxifen with or without breast irradiation in women 50 years of age or older with early breast cancer. N Engl J Med 2004;351(10):963–70.

  16. Winzer KJ, Sauerbrei W, Braun M, et al. Radiation therapy and tamoxifen after breast-conserving surgery: updated results of a 2 x 2 randomised clinical trial in patients with low risk of recurrence. Eur J Cancer 2010;46(1):95–101.

  17. Sparano JA, Gray RJ, Makower DF, et al. Prospective Validation of a 21-Gene Expression Assay in Breast Cancer. N Engl J Med 2015;373(21):2005–2014.

  18. Fisher B, Costantino J, Redmond C, et al. A Randomized Clinical Trial Evaluating Tamoxifen in the Treatment of Patients with Node-Negative Breast Cancer Who Have Estrogen-Receptor–Positive Tumors. N Engl J Med 1989;320(8):479–484.

  19. Fisher B, Dignam J, Wolmark N, et al. Tamoxifen and chemotherapy for lymph node-negative, estrogen receptor-positive breast cancer. J Natl Cancer Inst 1997;89(22):1673–82.

  20. Fisher B, Carbone P, Economou SG, et al. L-Phenylalanine Mustard (L-PAM) in the Management of Primary Breast Cancer. N Engl J Med 1975;292(3):117–122.