Racial Disparities in Triple Negative Breast Cancer

Toward a Causal Architecture Approach

Scott D. Siegel; Madeline M. Brooks; Shannon M. Lynch; Jennifer Sims-Mourtada; Zachary T. Schug; Frank C. Curriero


Breast Cancer Res. 2022;24(37) 




Patient records came from the cancer registry of the Helen F. Graham Cancer Center and Research Institute (HFGCCRI), which is part of the Christiana Care Health System and based in New Castle County, Delaware. The HFGCCRI treats an average of more than 600 breast cancer cases annually. As detailed elsewhere, the HFGCCRI breast cancer population accounts for 85% of all cases from the surrounding county and is representative of the county population of cases in terms of age, race, receptor status, and stage.[45]

Study Population

This study population consisted of 3316 adult female New Castle County residents who were diagnosed with invasive breast cancer between 2012 and 2020. To better understand Black–White disparities, the population was limited to women who self-reported as either Black (n = 776) or White (n = 2540), regardless of ethnicity. The time frame was selected to maximize the number of breast cancer cases for which the subtype markers necessary for classifying patients with TNBC were routinely documented in the cancer registry. Patient residential address, demographic, insurance payer, and clinical data were abstracted from the registry. Patient addresses were manually cleaned and geocoded using ArcGIS 10.8,[46] yielding a match rate of 95% (3316/3484). Of the 168 unmatched records, 114 geocoded to another county, two geocoded out of state, 47 had PO box addresses, three had missing address information, and two could not be located. Unmatched patients did not significantly differ from matched patients by age, race, ethnicity, stage, subtype, or insurance payer.

Patient Measures

Demographic measures included age at diagnosis, race, and insurance payer status, which were all directly abstracted from the HFGCCRI cancer registry. Insurance payer status (private/commercial, Medicaid, Medicare, none, or unknown) was used as a proxy for access to health care and socioeconomic status.[47] Clinical measures included breast cancer stage and receptor status. Cases were classified into 'TNBC' when the receptors for estrogen (ER), progesterone (PR), and human epidermal growth factor 2 (HER2) were all known negative; all other invasive cases were classified as 'Non-TNBC.'
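The TNBC classification rule can be written out as a short sketch. The function name and the encoding of receptor status (True for positive, False for negative, None for unknown) are illustrative assumptions, not the registry's actual coding.

```python
from typing import Optional


def classify_subtype(er: Optional[bool], pr: Optional[bool], her2: Optional[bool]) -> str:
    """Classify an invasive case as 'TNBC' or 'Non-TNBC'.

    Hypothetical encoding: True = positive, False = negative, None = unknown.
    A case is TNBC only when all three receptors are known negative;
    every other invasive case falls into 'Non-TNBC'.
    """
    if er is False and pr is False and her2 is False:
        return "TNBC"
    return "Non-TNBC"


# Example with made-up receptor results
print(classify_subtype(False, False, False))
print(classify_subtype(False, False, None))
```

Under this rule, a case with any unknown receptor result is not counted as TNBC, which follows the "all known negative" wording above.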

Census Tract Measures

New Castle County is subdivided into 130 census tracts, which provide stable geographic units for reporting population statistics.[48] All census tract sociodemographic data were obtained from the US Census Bureau's American Community Survey 2014–2018 5-year estimates.[49] ICE-Income, -Race, and -Race/Income metrics were calculated for all New Castle County census tracts according to the following general formula:[24,25]

$$\mathrm{ICE}_i = \frac{A_i - D_i}{T_i}$$

where $A_i$ is the number of advantaged persons in census tract $i$, $D_i$ is the number of disadvantaged persons in census tract $i$, and $T_i$ is the total population in census tract $i$. For ICE-Income, advantaged and disadvantaged were defined as households with income ≥ $125,000 or < $20,000. For ICE-Race, advantaged and disadvantaged were defined as non-Hispanic White and non-Hispanic Black. For ICE-Race/Income, advantaged and disadvantaged were defined as non-Hispanic White households with income ≥ $125,000 and non-Hispanic Black households with income < $20,000. ICE values for geographic units range from −1, indicating that 100% of the population can be classified into the most disadvantaged group, to +1, indicating that 100% of the population can be classified into the most advantaged group. All ICE measures were classified into quintiles based on their distribution within New Castle County, setting Q5 (most advantaged) as the reference group.
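As a rough illustration of the ICE calculation and the quintile step described above, the sketch below computes (A − D)/T for a tract and assigns rank-based quintiles. The function names, the tie-handling rule, and the example values are assumptions made for illustration; the study's own quintile cut points may have been derived differently.

```python
def ice(advantaged: float, disadvantaged: float, total: float) -> float:
    """Index of Concentration at the Extremes for one tract: (A - D) / T."""
    if total <= 0:
        raise ValueError("total population must be positive")
    return (advantaged - disadvantaged) / total


def quintiles(values):
    """Assign rank-based quintiles 1..5 (1 = lowest values, 5 = highest).

    Simplification: ties are broken by input order rather than shared.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n + 1
    return labels


# Hypothetical tract counts: (advantaged, disadvantaged, total)
tracts = [(900, 50, 1000), (100, 900, 1000), (400, 300, 1000),
          (700, 200, 1000), (300, 500, 1000)]
ice_values = [ice(a, d, t) for a, d, t in tracts]
print(quintiles(ice_values))
```

With this coding, Q5 holds the highest ICE values, matching the use of Q5 (most advantaged) as the reference group.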

Area-level measures were used to estimate the potential impact of environmental or neighborhood factors on rates of obesity and unhealthy alcohol use, similar to the conceptualization of 'obesogenic' environments.[50] Census tract prevalence measures of obesity and disordered alcohol use were generated from Christiana Care Health System electronic health record (EHR) data for 20,310 unique adult New Castle County residents who were admitted to an inpatient unit between July 1, 2018 and June 30, 2019, regardless of admitting diagnosis or demographics. Previous work has shown that such measures generated from inpatient data are generally representative of risk factor prevalence among New Castle County census tracts.[51] International Classification of Diseases (ICD) diagnosis codes abstracted from the EHR for obesity and alcohol use disorder (AUD) were used to categorize patients into 'obese' or 'not obese' and 'AUD' or 'no AUD' categories. Consistent with clinical guidelines,[52] obesity was defined as a BMI of ≥ 30. AUD diagnoses were made by treating physicians and based on the Diagnostic and Statistical Manual of Mental Disorders (5th edition)[53] criteria, which assess clinically significant, unhealthy patterns of use (e.g., large quantities, cravings, tolerance, withdrawal). Patient addresses were manually cleaned and geocoded using ArcGIS 10.6, yielding a match rate of 98% (20,310/20,706). Patient-level data on obesity and alcohol use were not available for the breast cancer study population.
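The tract-level prevalence measures amount to a grouped proportion over geocoded patients. A minimal sketch follows, assuming one record per unique patient with a tract identifier and a yes/no flag already derived from the ICD codes; the record layout and the function name are hypothetical.

```python
from collections import defaultdict


def tract_prevalence(records):
    """Return {tract_id: proportion of patients flagged with the condition}.

    `records` is an iterable of (tract_id, has_condition) pairs,
    one pair per unique geocoded patient (hypothetical layout).
    """
    counts = defaultdict(lambda: [0, 0])  # tract -> [flagged, total]
    for tract_id, has_condition in records:
        counts[tract_id][0] += 1 if has_condition else 0
        counts[tract_id][1] += 1
    return {tract: flagged / total for tract, (flagged, total) in counts.items()}


# Made-up example: obesity flags for patients in two tracts
example = [("tract_A", True), ("tract_A", False), ("tract_B", False)]
print(tract_prevalence(example))
```

The same function would serve for either flag (obese versus not obese, AUD versus no AUD), since only the input flag changes.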

Census tract measures of fast-food restaurants and alcohol retailers in New Castle County were produced from commercial data and publicly available records. Fast-food retailer data were obtained from SICCODE.com, utilizing the North American Industry Classification System (NAICS) code 722513,[54] consistent with established approaches.[55] Alcohol retailer data were drawn from a public state business license database that was current as of April 17, 2019.[56] Because studies have more reliably observed a relationship between disordered alcohol use and residential exposure to off-premise alcohol retailers (e.g., liquor stores) than to on-premise alcohol retailers (e.g., bars),[57] we included only off-premise retailers. All retail locations were geocoded using ArcGIS 10.8[46] with a match rate of 100% (fast-food retailer N = 221, alcohol retailer N = 160).

Statistical Analyses

Spatial data management and statistical analyses were performed in the R Statistical Computing Environment using various packages.[58–63] Descriptive and bivariate statistics, and post hoc tests with Bonferroni-adjusted p-values, were used to compare TNBC versus Non-TNBC patient groups by the sociodemographic, clinical, and ICE variables derived from patient and census tract measures.
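For readers unfamiliar with the Bonferroni adjustment, the standard definition multiplies each p-value by the number of comparisons and caps the result at 1. The sketch below shows that definition in Python for illustration only; the study's analyses were run in R, and the example p-values are made up.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: min(1, p * m) for m comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up p-values from three post hoc comparisons
print(bonferroni([0.01, 0.04, 0.5]))
```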

Multilevel logistic regression models were used to examine the odds of TNBC (vs. Non-TNBC) before and after adjusting for patient (level-1) and census tract (level-2) variables. The multilevel logistic regression model included a census tract-level random effect to account for the clustering of patients within tracts. Patient-level variables included age at diagnosis, race (Black, White), and insurance (commercial, Medicaid/none). Tract-level variables included the ICE-Race, -Income, and -Race/Income quintiles.
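One way to write the model described above is as a random-intercept logistic regression for patient $i$ in census tract $j$. The notation below is an illustrative assumption rather than the authors' own; in particular, the normally distributed tract effect is the conventional choice and is not stated in the text.

```latex
\operatorname{logit}\,\Pr(\mathrm{TNBC}_{ij} = 1)
  = \beta_0
  + \beta_1\,\mathrm{Age}_{ij}
  + \beta_2\,\mathrm{Black}_{ij}
  + \beta_3\,\mathrm{Medicaid/None}_{ij}
  + \sum_{q=1}^{4} \gamma_q\,\mathbb{1}\!\left[\mathrm{ICE}_j \in Q_q\right]
  + u_j,
\qquad u_j \sim \mathcal{N}\!\left(0, \sigma_u^2\right)
```

Here $u_j$ is the tract-level random effect that accounts for clustering, and the sum runs over Q1 to Q4 because Q5 (most advantaged) is the reference group.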

Three univariate and multivariate models tested each ICE measure separately, with all models adjusting for patient-level age at diagnosis and race. Additional multivariate models tested cross-level interactions between patient-level race and tract-level ICE quintiles. Based on the results of these models, details of which are provided in the results, multivariate logistic regression models were stratified by Black and White race to examine differential effects of tract-level ICE-Race on odds of TNBC after adjustment for age at diagnosis. Odds ratios and 95% confidence intervals were reported; p-values less than 0.05 were considered statistically significant.
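Odds ratios and their confidence intervals are obtained by exponentiating logistic regression coefficients. The sketch below assumes a Wald-type interval (coefficient ± 1.96 standard errors on the log-odds scale); the text does not state which interval method was used, and the input values are made up.

```python
import math


def odds_ratio_ci(beta: float, se: float, z: float = 1.959964):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper).

    Assumes a Wald interval; z defaults to the 97.5th normal percentile.
    """
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Made-up coefficient and standard error
or_, lower, upper = odds_ratio_ci(0.69, 0.20)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 corresponds to a Wald p-value below 0.05 for that coefficient.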

The spatial covariation of TNBC and ICE measures was visualized using bivariate choropleth maps. First, breast cancer patients were aggregated to their census tract of residence to create tract-level measures of the percentage of patients with TNBC. The % TNBC and ICE values were separated into quintiles based on their respective tract-level distributions within New Castle County. For the ICE measures, quintiles were coded such that lower ICE values correspond to higher quintiles, so that higher quintiles represent greater relative disadvantage. The quintiles of % TNBC and ICE were combined to create 5 × 5 classification systems that denote whether census tracts are relatively low, moderate, or high in each value. The resulting 25 classification values were symbolized using color and saturation to simultaneously show variation in both measures. For ease of visualization, only the highest/lowest quintile extremes of the classification system (low/low, low/high, high/low, and high/high) were colored in the maps. Geocoding and final map preparations were conducted in ArcMap 10.8.[46]
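The bivariate classification can be sketched as follows. This assumes the "extremes" are the first and fifth quintiles of each measure, and that labels list % TNBC first and ICE second; both are readings of the description above, and the function names are hypothetical.

```python
from typing import Optional


def reverse_quintile(q: int) -> int:
    """Flip a 1..5 quintile so that 5 marks the lowest ICE values (greatest disadvantage)."""
    if q not in range(1, 6):
        raise ValueError("quintile must be an integer from 1 to 5")
    return 6 - q


def extreme_label(tnbc_q: int, ice_disadv_q: int) -> Optional[str]:
    """Label the four corner cells of the 5 x 5 grid; return None for the other 21 cells."""
    names = {1: "low", 5: "high"}
    if tnbc_q in names and ice_disadv_q in names:
        return f"{names[tnbc_q]}/{names[ice_disadv_q]}"
    return None


# A tract in the top % TNBC quintile and the lowest ICE quintile
print(extreme_label(5, reverse_quintile(1)))
```

Tracts for which `extreme_label` returns None would be left uncolored under this reading.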

To begin to characterize place-based systems of exposure related to metabolic risk factors for TNBC, descriptive tables were created in which census tracts were grouped by the same TNBC and ICE-Race quintile classes (low/low, low/high, high/low, and high/high) visualized in the bivariate choropleth map. Population data from the American Community Survey were used to describe race, poverty, and education levels for the census tract groups.[49] Tract-level data on systems of exposure included alcohol and fast-food retailers, as well as prevalence of AUD and obesity. Supplemental bar charts show the variation of alcohol retailers, fast-food retailers, AUD prevalence, and obesity prevalence by census tract ICE quintiles.