The Role of Breastfeeding in Racial and Ethnic Disparities in Sudden Unexpected Infant Death

A Population-Based Study of 13 Million Infants in the United States

Melissa Bartick; Alexis Woods Barr; Lori Feldman-Winter; Mònica Guxens; Henning Tiemeier


Am J Epidemiol. 2022;191(7):1190-1201. 

Study Population

We performed a cohort study using data from the public data set of de-identified birth certificates and linked infant death certificates available online from the National Center for Health Statistics.[29] This registry includes data on US live births and infant deaths from all 50 states and the District of Columbia.

All 15,611,133 live-birth certificates and 14,155 linked SUID death certificates from 2015 through 2018 were downloaded, imported, and merged into 4 linked birth-cohort years, which were then combined into a single data set. We did not classify as SUID any infant who died at less than 7 days of age with a SUID-defining International Classification of Diseases, Tenth Revision, code, because alternative diagnoses could have been possible[30,31] (n = 428; 3.0% of SUID occurrences). We excluded infants with missing breastfeeding data (n = 2,067,459; 13.2%), infants with missing or unknown race/ethnicity data (n = 146,333; 1.1%), and infants who were not in 1 of the 5 racial/ethnic groups studied (n = 304,610; 2.2%), for example, infants whose mothers identified as Pacific Islanders. Infants who died of congenital malformations (n = 14,662; 0.1%) or malignancy (n = 189; <0.01%) were also excluded. The final data set consisted of 13,077,880 live births and 11,942 linked SUID deaths (Figure 1). The study was deemed exempt by the Institutional Review Board of the Harvard University School of Public Health.

Figure 1.

Sample study flow, sudden unexpected infant death, United States, 2015–2018. SUID, sudden unexpected infant death.
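The live-birth counts reported in the Study Population text can be checked with simple arithmetic. The following Python sketch assumes the exclusions were applied sequentially and without overlap, in the order they are listed in the text; the variable and function names are illustrative and are not taken from the study's own code.

```python
# Counts are copied from the Study Population text; names are illustrative.
TOTAL_LIVE_BIRTHS = 15_611_133

# (reason, number of infants excluded), in the order given in the text.
# Assumption: exclusions are sequential and mutually exclusive.
EXCLUSIONS = [
    ("missing breastfeeding data", 2_067_459),
    ("missing or unknown race/ethnicity", 146_333),
    ("not in 1 of the 5 racial/ethnic groups studied", 304_610),
    ("died of congenital malformations", 14_662),
    ("died of malignancy", 189),
]


def apply_exclusions(total, exclusions):
    """Return (reason, n_excluded, n_remaining) after each sequential step."""
    remaining = total
    flow = []
    for reason, n_excluded in exclusions:
        remaining -= n_excluded
        flow.append((reason, n_excluded, remaining))
    return flow


flow = apply_exclusions(TOTAL_LIVE_BIRTHS, EXCLUSIONS)
final_n = flow[-1][2]

for reason, n_excluded, remaining in flow:
    print(f"{reason}: -{n_excluded:,} -> {remaining:,}")
```

Under these assumptions, `final_n` equals the 13,077,880 live births reported as the final data set.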

Exposure and Outcome Definitions

Breastfeeding initiation status was extracted from the birth certificate data. Beginning in 2015, all 50 states had implemented the US Standard Certificate of Live Birth,[32] which included the question, "Is the infant breastfed at discharge?"[33] Despite this, births from Michigan and California lacked breastfeeding data from 2015 through 2018 and were omitted.[33–37] In 2016–2018, standard breastfeeding data were available for approximately 85% of births, representing 48 states.[34–37]

The National Center for Health Statistics collects data on people by race and Hispanic ethnicity. Demographic groups studied for this report are defined as follows: non-Hispanic Black (NHB) infants and mothers, non-Hispanic White (NHW) infants and mothers, non-Hispanic American Indian/Alaska Native (AI/AN) infants and mothers, non-Hispanic Asian (Asian) infants and mothers, and Hispanic infants and mothers; this information is self-reported by parents on the birth certificates. The groups are listed in the conventional order used in National Center for Health Statistics and Centers for Disease Control and Prevention documents. It should be understood that these classifications are social constructs, not biological ones.[38] Also, the term "Hispanic" encompasses people from all races and a wide variety of countries and cultures with a wide variety of health outcomes. For purposes of this study, we refer to infants by their mothers' race/ethnicity. SUID was defined as any death with the following codes from the International Classification of Diseases, Tenth Revision: R95 (SIDS), R99 ("ill-defined"), or W75 (accidental suffocation or strangulation in bed).[39]
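The outcome definition can be expressed as a small classification rule. The Python sketch below combines the three SUID-defining codes with the rule, stated under Study Population, that deaths at less than 7 days of age were not classified as SUID. The function name, the input format, and the matching on the first 3 characters of the code are assumptions made for illustration; they do not describe how the linked files actually encode cause of death.

```python
# SUID-defining ICD-10 codes as listed in the text.
SUID_CODES = {
    "R95": "SIDS",
    "R99": "ill-defined",
    "W75": "accidental suffocation or strangulation in bed",
}

# Deaths before this age were not classified as SUID (see Study Population).
MIN_AGE_DAYS = 7


def is_suid(icd10_code, age_at_death_days):
    """Return True if a death meets the SUID definition used in this study.

    Assumption: codes are matched on their first 3 characters, so that a
    hypothetical subcode such as "R95.0" is treated as R95.
    """
    code = icd10_code.strip().upper().replace(".", "")
    return code[:3] in SUID_CODES and age_at_death_days >= MIN_AGE_DAYS


print(is_suid("R95", 30))  # SUID code, death after the first week
print(is_suid("R95", 3))   # SUID code, but death at less than 7 days
```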


The covariates were selected using a directed acyclic graph of the breastfeeding–SUID association (Web Figure 1) and were extracted from the data set. We used 11 dichotomous variables: maternal age (<20 years vs. ≥20 years), antenatal smoking, marital status (unmarried vs. married), maternal nativity (US born vs. born abroad), maternal obesity (body mass index ≥30 vs. <30), high multiparity (≥4 children including index child vs. <4 children), primiparity, low birthweight (<2,500 g vs. ≥2,500 g), preterm birth (<37 weeks' gestation vs. ≥37 weeks'), late prenatal care (first visit at ≥4 months vs. <4 months), and insurance status (Medicaid/uninsured vs. all other insurance). In addition, we used a variable with 4 educational categories (Table 1). The insurance variable served as a proxy for socioeconomic status, combining those mothers who were uninsured (4.3% of mothers) with those who were receiving Medicaid, contrasting them to those with any other forms of insurance. For high multiparity, the cutoff of 4 children was chosen after examination of the data showed this number was neither exceedingly common nor rare. Nativity was not included in analyses for AI/AN infants, because there were no nonbreastfed infants who died of SUID who were the children of foreign-born AI/AN mothers.
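The cutoffs for the dichotomous covariates can be written as a recoding step. The Python sketch below covers only the covariates for which the text gives an explicit numeric or categorical cutoff. The record fields, the insurance labels, and the reading of primiparity as "index child is the first child" are hypothetical; the text does not describe how records with no prenatal care were coded, so that case is not handled here.

```python
from dataclasses import dataclass


@dataclass
class BirthRecord:
    """Hypothetical subset of birth-certificate fields (names are illustrative)."""
    maternal_age_years: int
    maternal_bmi: float
    children_including_index: int
    birthweight_g: int
    gestation_weeks: int
    prenatal_care_start_month: int  # month of pregnancy at first visit
    insurance: str                  # e.g. "medicaid", "uninsured", "private"


def dichotomize(record):
    """Apply the cutoffs stated in the text to one record."""
    return {
        "maternal_age_lt_20": record.maternal_age_years < 20,
        "maternal_obesity": record.maternal_bmi >= 30,
        "high_multiparity": record.children_including_index >= 4,
        # Assumption: primiparity means the index child is the first child.
        "primiparity": record.children_including_index == 1,
        "low_birthweight": record.birthweight_g < 2500,
        "preterm_birth": record.gestation_weeks < 37,
        "late_prenatal_care": record.prenatal_care_start_month >= 4,
        "medicaid_or_uninsured": record.insurance in {"medicaid", "uninsured"},
    }


example = BirthRecord(19, 31.0, 1, 2400, 36, 5, "medicaid")
flags = dichotomize(example)
print(flags)
```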

Statistical Analysis

We used "not-breastfeeding" (i.e., never breastfed) as the exposure in most analyses and race/ethnicity as the exposure in other analyses. To test whether not-breastfeeding was associated with SUID, we performed multivariable logistic regressions adjusting for the covariates just described, stratified by racial/ethnic group. In addition to calculating odds ratios (ORs), we calculated adjusted risk differences (AdjRDs). Confidence intervals (CIs) for AdjRDs were calculated using the delta method.
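To make the two effect measures concrete, the Python sketch below computes an unadjusted OR and an unadjusted risk difference from a 2×2 table, with large-sample standard errors of the kind the delta method yields in this simple case. This is a deliberate simplification and not the study's procedure: the analyses described above were covariate-adjusted and run in Stata. The function name and the counts in the usage line are hypothetical and are not study data.

```python
import math


def or_and_rd(exposed_cases, exposed_total, unexposed_cases, unexposed_total, z=1.96):
    """Unadjusted odds ratio and risk difference with Wald-type 95% CIs."""
    a = exposed_cases
    b = exposed_total - exposed_cases
    c = unexposed_cases
    d = unexposed_total - unexposed_cases

    # Odds ratio; the standard error is computed on the log scale.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )

    # Risk difference between exposed and unexposed groups.
    p1 = a / exposed_total
    p0 = c / unexposed_total
    rd = p1 - p0
    se_rd = math.sqrt(p1 * (1 - p1) / exposed_total + p0 * (1 - p0) / unexposed_total)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)

    return {"or": odds_ratio, "or_ci": or_ci, "rd": rd, "rd_ci": rd_ci}


# Hypothetical counts for illustration only.
result = or_and_rd(30, 10_000, 20, 20_000)
print(result)
```

Because SUID is rare, the risk difference is most readable when rescaled, for example per 100,000 live births.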

To formally test if breastfeeding mediated the effect of race/ethnicity on SUID, we performed a mediation analysis, with race/ethnicity as the exposure (NHW infants being the reference group because it is the largest) and with not-breastfeeding as the mediator for the outcome of SUID, stratified by racial/ethnic group of infants. Models were adjusted for the covariates described in the preceding section. The proportion mediated was calculated using the following formula: (NDE) × (NIE − 1)/(NDE × NIE − 1),[40] where NDE is the natural direct effect and NIE is the natural indirect effect.[40] In addition to the formal mediation analysis, we performed separate multivariable logistic regressions for each side of the mediation triangle for each racial/ethnic group. The population used for each mediation analysis and related regressions was the group of infants from the respective racial/ethnic group combined with the NHW infants, rather than the entire population of infants.
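The proportion-mediated formula can be written as a short function. The Python sketch below assumes that NDE and NIE are both expressed on the odds ratio scale, so that their product is the total effect; the function name and the values in the usage line are illustrative and are not results from this study.

```python
def proportion_mediated(nde, nie):
    """Proportion mediated = NDE * (NIE - 1) / (NDE * NIE - 1).

    Assumption: nde and nie are odds ratios, so nde * nie is the total effect.
    """
    denominator = nde * nie - 1
    if denominator == 0:
        # A total effect of exactly 1 (no effect) leaves the proportion undefined.
        raise ValueError("total effect equals 1; proportion mediated is undefined")
    return nde * (nie - 1) / denominator


# Illustrative values only: NDE = 2.0 and NIE = 1.5 give 1.0 / 2.0.
print(proportion_mediated(2.0, 1.5))  # 0.5
```

When NIE equals 1 (no indirect effect), the numerator is 0 and the proportion mediated is 0, as expected.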

Statistical analysis was performed using Stata, version 16 (StataCorp, College Station, Texas). Mediation analysis was conducted with the "paramed" module, which performs causal mediation analysis using parametric regression models and builds on the Baron and Kenny procedure.[41] Missing covariate data were imputed using simple single imputation: values for subjects with missing data were randomly generated by computer so that their distribution matched the distribution of the covariate among the nonmissing subjects. For education, single imputation was performed using the dummy variable method.
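The two approaches to missing covariate data can be sketched as follows. This Python illustration is not the Stata code used in the study. It assumes that missing values are represented as `None`, that random imputation draws each missing value with probability proportional to the observed frequencies, and that the "dummy variable method" means assigning missing values to their own category; the function names and example data are hypothetical.

```python
import random
from collections import Counter


def impute_random_proportional(values, rng):
    """Replace each None with a random draw from the observed distribution."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    counts = Counter(observed)
    categories = list(counts)
    weights = [counts[c] for c in categories]
    return [
        v if v is not None else rng.choices(categories, weights=weights, k=1)[0]
        for v in values
    ]


def add_missing_category(values, label="missing"):
    """Dummy variable approach (assumed): keep missing as its own category."""
    return [label if v is None else v for v in values]


rng = random.Random(2022)  # fixed seed so the sketch is reproducible
smoking = [0, 0, 1, None, 0, None, 1, 0]
education = ["high school", None, "college", "high school"]

smoking_imputed = impute_random_proportional(smoking, rng)
education_coded = add_missing_category(education)
print(smoking_imputed)
print(education_coded)
```

Random proportional imputation preserves the marginal distribution of a covariate in expectation, whereas the separate-category approach keeps all subjects in the model without assigning them an education level.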