Using Census Data to Understand County-Level Differences in Overall Drug Mortality and Opioid-Related Mortality by Opioid Type

Shannon M. Monnat, PhD; David J. Peters, PhD; Mark T. Berg, PhD; Andrew Hochstetler, PhD


Am J Public Health. 2019;109(8):1084-1091. 



Mortality data came from the restricted-use death certificate files from the National Vital Statistics System, 2002 to 2004 and 2014 to 2016. These data identify causes of death and county of residence from all death certificates filed in the United States. We categorized drug deaths on the basis of International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10; Geneva, Switzerland: World Health Organization; 2011) codes, as any death that included an underlying cause of accidental poisoning, intentional poisoning, poisoning of undetermined intent by exposure to drugs, assault by drugs, drug-induced diseases, finding of drugs in the blood, or mental or behavioral disorders attributable to drugs. See Appendix A (available as a supplement to the online version of this article) for ICD-10 codes. We identified opioid deaths as those with an underlying cause reflecting drug poisoning along with any multiple cause of death opioid-specific code (T40.0–T40.4, T40.6) or any mental and behavioral disorder attributable to opioids (F11.0–F11.9).
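The opioid classification rule above can be sketched in code. This is an illustrative reading of the rule, not the authors' program (the full underlying-cause code list is in Appendix A; the subset of X/Y codes below is a hypothetical partial list used only for the example).

```python
# Hypothetical sketch of the opioid-death flag described in the text.
# Record layout (underlying cause string, list of multiple-cause codes)
# is assumed; the underlying-cause list here is an illustrative subset.

OPIOID_T_CODES = {"T40.0", "T40.1", "T40.2", "T40.3", "T40.4", "T40.6"}
OPIOID_F_CODES = {f"F11.{i}" for i in range(10)}  # F11.0-F11.9

# Drug-poisoning underlying causes (accidental, intentional, undetermined);
# partial list for illustration only -- see Appendix A for the full set.
DRUG_POISONING_UCOD = {"X40", "X41", "X42", "X43", "X44",
                       "X60", "X61", "X62", "X63", "X64",
                       "X85", "Y10", "Y11", "Y12", "Y13", "Y14"}

def is_opioid_death(underlying_cause, multiple_causes):
    """Flag a death as opioid-related: a drug-poisoning underlying cause
    plus an opioid-specific T code, or any F11.x mental/behavioral code."""
    causes = set(multiple_causes)
    poisoning = underlying_cause in DRUG_POISONING_UCOD
    return (poisoning and bool(causes & OPIOID_T_CODES)) or bool(causes & OPIOID_F_CODES)
```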

We calculated separate rates for heroin, prescription opioids, synthetic opioids, and multiple causes (deaths that included 2 or more opioids). Because opioid deaths are known to be underreported on death certificates, with substantial geographic variation in underreporting,[7] we calculated a fifth measure of all drug overdoses minus those that involved an opioid. To smooth potentially large fluctuations from small changes in deaths in small-population counties, we pooled deaths across 3-year periods.[22] Consistent with Centers for Disease Control and Prevention (CDC) methods, we then calculated age-adjusted rates with the direct method using the 2000 US standard population.
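Direct age standardization amounts to weighting each county's age-specific death rates by the standard population's age shares. A minimal sketch, assuming simplified age groups rather than the full CDC standard-population groupings:

```python
import numpy as np

def age_adjusted_rate(deaths, population, std_pop):
    """Direct-method age adjustment.

    deaths, population: counts per age group for one county;
    std_pop: 2000 US standard population counts for the same groups
    (illustrative -- the real standard uses a fixed set of age groups).
    Returns the age-adjusted rate per 100,000.
    """
    deaths = np.asarray(deaths, float)
    population = np.asarray(population, float)
    std_pop = np.asarray(std_pop, float)
    # age-specific rates, guarding against empty age groups
    age_specific = np.divide(deaths, population,
                             out=np.zeros_like(deaths), where=population > 0)
    weights = std_pop / std_pop.sum()
    return float((age_specific * weights).sum() * 100_000)
```

With equal age-specific rates, the adjusted rate equals the crude rate, which is a quick sanity check on the weighting.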

County-level demographic, socioeconomic, and labor market measures came from the 2008 to 2012 American Community Survey (ACS) and 2000 US Census. All census variables in the analysis are listed in Table 1. Using 2000 and 2008 to 2012 measures reduces risk of reverse causality bias and allows a lagged relationship between county-level conditions and mortality.

Metropolitan status came from the US Department of Agriculture Economic Research Service's Rural-Urban Continuum Codes, 2013. We collapsed the codes to create a binary indicator in which categories 1 to 3 indicated a metropolitan county and 4 to 9 indicated a nonmetropolitan county.
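The collapse of the Rural-Urban Continuum Codes is a one-line rule; a sketch:

```python
def is_metro(rucc_code: int) -> bool:
    """RUCC 2013: codes 1-3 indicate a metropolitan county,
    codes 4-9 a nonmetropolitan county."""
    return 1 <= rucc_code <= 3
```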

To account for opioid supply, we included county-level retail opioid prescribing rates from the CDC.[23] These data report retail opioid prescriptions dispensed per 100 persons. A limitation is that they exclude prescriptions from high-volume prescribing pain clinics (i.e., "pill mills"). Prescribing information was missing for between 6% and 13% of counties, depending on the year. We imputed missing prescribing values using a Markov chain Monte Carlo model with 500 imputations.[24] We used the average prescribing rate for 2009 to 2011 to capture the years of peak prescribing, but we tested alternate models with prescribing rates for 2006 to 2008 and 2012 to 2014. Findings were relatively unaffected by these substitutions (Appendix G, available as a supplement to the online version of this article). County-level prescribing data before 2006 are not available.
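The multiple-imputation idea can be illustrated with a simplified stand-in: draw each missing prescribing rate repeatedly from a distribution fit to observed counties and average the draws. This is not the paper's MCMC procedure (which would condition on covariates); it only conveys the averaging-over-imputations logic.

```python
import numpy as np

def impute_mean_of_draws(rates, n_imputations=500, seed=0):
    """Toy multiple imputation: fill each missing value with the mean of
    n_imputations draws from a normal fit to the observed rates.
    The paper's analysis used an MCMC multiple-imputation model instead."""
    rng = np.random.default_rng(seed)
    rates = np.asarray(rates, float)
    observed = rates[~np.isnan(rates)]
    mu, sigma = observed.mean(), observed.std(ddof=1)
    filled = rates.copy()
    missing = np.isnan(rates)
    draws = rng.normal(mu, sigma, size=(n_imputations, missing.sum()))
    filled[missing] = draws.mean(axis=0)
    return filled
```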

Our analyses included 3079 US counties. We restricted analyses to the 48 conterminous states and Washington, DC. We merged Broomfield County, Colorado (created in 2003), and 29 independent cities in Virginia with populations of less than 65 000 people into their respective counties. This reduced the potential for ACS measurement error.

We used multilevel negative binomial regression (random intercepts to account for nesting of counties in states) to model overall drug mortality counts for 2014 to 2016, offset by the log of the county population, as a function of county-level census characteristics. We standardized all variables before including them in regression models, enabling comparison of the relative strength of associations across different factors. We adjusted each model for county racial and age composition and population density. We did not attempt to determine the mechanisms through which these factors are associated with drug mortality. We simply aimed to show the relative importance of several census variables for understanding county-level differences in drug mortality.

Then, we identified opioid-specific mortality classes using latent profile analysis (LPA). LPA is appropriate for this study for several reasons.[25] First, we wanted counties to be represented in higher dimensional space (e.g., classes), and multidimensional scaling is usually limited to 2 dimensions or 2 classes. Second, we did not know the number of clusters a priori. LPA permitted us to identify the correct number of clusters on the basis of statistical tests instead of subjective criteria, as with hierarchical cluster analysis. Third, LPA permits posterior estimates of the probability of correct classification, allowing us to drop poorly fitted cases from classes.

We used LPA to create county classifications on the basis of age-adjusted opioid mortality rates for 2014 to 2016 and the change in rates between 2002 to 2004 and 2014 to 2016, using the opioid-specific mortality rates described above (i.e., prescription opioids, heroin, synthetic opioids, and multiple opioids), including rates for drug overdose deaths that did not specify an opioid on the death certificate.
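LPA with continuous indicators is closely related to a Gaussian mixture model, so the classification step can be approximated with scikit-learn on synthetic two-indicator data. This is an analogue, not the Mplus procedure the authors used; the fit-index comparison below (BIC across candidate class counts) mirrors their statistical selection of the number of classes, and posterior probabilities correspond to the classification probabilities they used to drop poorly fitted cases.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Two synthetic "mortality profile" classes in 2 dimensions.
rng = np.random.default_rng(0)
low = rng.normal(loc=[0, 0], scale=0.5, size=(100, 2))   # low-mortality class
high = rng.normal(loc=[3, 3], scale=0.5, size=(100, 2))  # high-mortality class
X = np.vstack([low, high])

# Choose the number of classes by minimizing BIC, as in LPA model selection.
bics = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
        for k in (1, 2, 3)}
best_k = min(bics, key=bics.get)

gm = GaussianMixture(n_components=best_k, random_state=0).fit(X)
posterior = gm.predict_proba(X)  # posterior class-membership probabilities
```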

We chose 2002 to 2004 as the base period because it captures the first wave of the opioid overdose crisis.[20] Moreover, using 2002 to 2004 instead of 1999 to 2001 (the first years for which data are available with ICD-10 codes) reduces endogeneity with our 2000 Census covariates. We chose 2014 to 2016 as the ending point, as this was the most current restricted mortality data available from the CDC at the time of analysis.

We normalized mortality rates using z-scores to remove scale differences and allow comparisons across classes. To minimize extreme scores that can result in classes with few cases, we Winsorized scores at the 0.5th and 99.5th percentiles, roughly corresponding to ±2.6 SD. We found evidence for 6 classes on the basis of fit indices and examination of latent class means (Appendix B, available as a supplement to the online version of this article).[26,27]
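The normalization step is straightforward to sketch: z-score a mortality measure, then clip at the 0.5th and 99.5th percentiles so a handful of extreme counties cannot dominate class formation.

```python
import numpy as np

def winsorized_z(x):
    """Z-score a vector, then Winsorize at the 0.5th/99.5th percentiles,
    mirroring the normalization described in the text."""
    x = np.asarray(x, float)
    z = (x - x.mean()) / x.std(ddof=0)
    lo, hi = np.percentile(z, [0.5, 99.5])
    return np.clip(z, lo, hi)
```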

We compared the means of county-level census measures across the different opioid classes. We then modeled the probability of membership in each of the opioid classes as a function of the county-level covariates using multinomial logistic regression. Because there are strong correlations among several census variables, they are ideal for generating indices that capture latent constructs. We used exploratory factor analysis to identify which of the 34 census variables grouped together. Eigenvalues indicated that the first 4 factors, representing 27 of the census variables, collectively explained 70% of the variance. We created 4 factor-weighted indices that combined the variables loading highly (≥ 0.40) onto their respective factors. We standardized all factors (mean = 0; SD = 1). The variables included in each factor are presented in Appendix D (available as a supplement to the online version of this article). The first index, urban professional (α = 0.86), reflects high-growth urban counties with a large white-collar presence. The second index measures economic disadvantage (α = 0.90). The third index, blue-collar presence (α = 0.84), reflects a large presence of working-class and manual laborers. Service economy (α = 0.65) represents the final index. Maps showing the geographic distribution of these indices are presented in Appendix D.
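The index-building step can be illustrated with a small factor-analysis sketch on synthetic data: fit a factor model to correlated measures, keep variables with loadings of at least 0.40 in absolute value, combine them with factor weights, and standardize the result. The data, the single-factor setup, and the loading values are all assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Four observed variables driven by one latent construct, plus noise.
rng = np.random.default_rng(7)
latent = rng.standard_normal((500, 1))
X = latent @ np.array([[0.9, 0.8, 0.7, 0.6]]) + 0.3 * rng.standard_normal((500, 4))

# Exploratory factor model; components_ holds the estimated loadings.
fa = FactorAnalysis(n_components=1, random_state=0).fit(X)
loadings = fa.components_[0]

# Retain high-loading variables (|loading| >= 0.40) and build a
# factor-weighted index, standardized to mean 0 and SD 1.
keep = np.abs(loadings) >= 0.40
index = (X[:, keep] * loadings[keep]).sum(axis=1)
index = (index - index.mean()) / index.std()
```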

Multinomial logistic regression models simultaneously included our 4 indices and controlled for racial/ethnic and age composition, opioid prescribing, and metropolitan status. Models adjusted SEs for the clustering of counties in states and were weighted by the probability of LPA class membership.
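A simplified analogue of the final model, on synthetic data: multinomial logistic regression of class membership on the four indices, with sample weights standing in for the LPA membership probabilities. The state-clustered standard errors from the paper (a Stata feature) are omitted here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic counties: 4 standardized indices and a 3-class outcome
# generated from a known linear score (argmax rule).
rng = np.random.default_rng(3)
n = 300
indices = rng.standard_normal((n, 4))
logits = indices @ rng.normal(size=(4, 3))
classes = logits.argmax(axis=1)

# Stand-in for LPA posterior probabilities of class membership,
# entering the fit as observation weights.
weights = rng.uniform(0.7, 1.0, size=n)

clf = LogisticRegression(max_iter=1000)
clf.fit(indices, classes, sample_weight=weights)
probs = clf.predict_proba(indices)  # predicted membership probabilities
```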

We used SAS version 9.4 (SAS Institute, Cary, NC) for descriptive statistics; Mplus version 7 (Muthén & Muthén, Los Angeles, CA) and SAS version 9.4 for latent class analysis; and StataMP version 15 (StataCorp LP, College Station, TX) for regression analysis.