Association of Atmospheric Particulate Matter and Ozone With Gestational Diabetes Mellitus

Hui Hu; Sandie Ha; Barron H. Henderson; Tamara D. Warner; Jeffrey Roth; Haidong Kan; Xiaohui Xu


Environ Health Perspect. 2015;123(9):853-859. 


Materials and Methods

Study Population

We obtained birth record data from the Bureau of Vital Statistics and Office of Health Statistics and Assessment, Florida Department of Health (Jacksonville, FL). The data included all registered live births in Florida between 1 January 2004 and 31 December 2005 (n = 445,028). Births with maternal residential addresses outside Florida (n = 4,672) were excluded. We used ArcGIS V10.1 software (ESRI, Redlands, CA, USA) to geocode the mother's residential address at birth, and 439,370 cases (99.8%) were successfully geocoded. Cases whose maternal residential address could not be geocoded were excluded (n = 986). We further excluded 937 cases because of missing values related to gestational age. In addition, we excluded women who had non-singleton deliveries (n = 13,367), previous preterm births (n = 5,591), or prepregnancy diabetes mellitus (n = 2,821). Births with congenital abnormalities (n = 5,450), with weight < 400 g (n = 240), or with a gestational age < 24 or > 42 weeks (n = 697) were also excluded. After these exclusions, a total of 410,267 women remained in the study population. The research protocol for this study was approved by the Institutional Review Board at the University of Florida and the Florida Department of Health. The study was exempt from informed consent requirements because it involved no more than minimal risk to the privacy of individuals and the research could not practicably be conducted without this exemption.
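The exclusion counts above can be checked with a short tally. The following Python sketch is an illustration only, not the authors' code; it assumes the exclusions were applied one after another in the order reported and that each count refers to births still in the data set at that step. The labels and function name are shorthand introduced here.

```python
# Sequential exclusions reported in the Study Population paragraph.
# Counts come from the text; labels are shorthand for this sketch.
EXCLUSIONS = [
    ("maternal address outside Florida", 4_672),
    ("address could not be geocoded", 986),
    ("missing gestational age", 937),
    ("non-singleton delivery", 13_367),
    ("previous preterm birth", 5_591),
    ("prepregnancy diabetes mellitus", 2_821),
    ("congenital abnormality", 5_450),
    ("birth weight < 400 g", 240),
    ("gestational age < 24 or > 42 weeks", 697),
]


def cohort_flow(start, exclusions):
    """Return (label, remaining) pairs after each exclusion is applied in order."""
    remaining = start
    flow = []
    for label, n_excluded in exclusions:
        remaining -= n_excluded
        flow.append((label, remaining))
    return flow


flow = cohort_flow(445_028, EXCLUSIONS)
for label, remaining in flow:
    print(f"{remaining:>7,}  after excluding: {label}")

final_n = flow[-1][1]  # size of the study population under these assumptions
```

Under these assumptions the running total after the geocoding step matches the 439,370 geocoded cases, and the final count matches the reported study population.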

Outcome Assessment

All pregnant women in Florida are asked to undergo screening for gestational diabetes mellitus (GDM) with an oral glucose challenge test (OGCT) between the 24th and 28th weeks of pregnancy. For this test, the woman drinks about 5 oz of a syrupy glucose solution containing 50 g of sugar and has her blood drawn 1 hr later. A blood glucose level > 140 mg/dL 1 hr after the OGCT indicates possible GDM, and the woman is then referred for a 3-hr fasting 100-g oral glucose tolerance test (OGTT). The OGTT measures the fasting blood glucose level and blood glucose levels at 1, 2, and 3 hr after drinking the solution. The following values are considered abnormal during the OGTT: fasting blood glucose ≥ 95 mg/dL, 1-hr blood glucose ≥ 180 mg/dL, 2-hr blood glucose ≥ 155 mg/dL, and 3-hr blood glucose ≥ 140 mg/dL. Pregnant women are classified as having GDM if two abnormal values are recorded during the OGTT (American Diabetes Association 2003).
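The two-step screening rule described above can be written as a small decision function. This Python sketch is illustrative and makes two assumptions that the text does not state: "two abnormal values" is read as two or more, and a woman with a negative screen or no OGTT result is treated as not having GDM. All names are hypothetical.

```python
# Thresholds (mg/dL) taken from the Outcome Assessment paragraph.
OGCT_REFERRAL_MG_DL = 140  # 1-hr OGCT value above this triggers the OGTT
OGTT_THRESHOLDS_MG_DL = {"fasting": 95, "1hr": 180, "2hr": 155, "3hr": 140}


def needs_ogtt(ogct_1hr):
    """True when the 50-g challenge test result exceeds the referral cutoff."""
    return ogct_1hr > OGCT_REFERRAL_MG_DL


def count_abnormal(ogtt):
    """Count OGTT measurements at or above their abnormal thresholds."""
    return sum(ogtt[key] >= cutoff for key, cutoff in OGTT_THRESHOLDS_MG_DL.items())


def has_gdm(ogct_1hr, ogtt=None, min_abnormal=2):
    """Classify GDM; assumes 'two abnormal values' means two or more."""
    if not needs_ogtt(ogct_1hr) or ogtt is None:
        return False  # assumption: no referral or no OGTT result means non-GDM
    return count_abnormal(ogtt) >= min_abnormal


example = {"fasting": 96, "1hr": 185, "2hr": 150, "3hr": 130}
print(has_gdm(150, example))  # two abnormal values after a positive screen
```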

Air Pollution Exposure Assessment

Air pollution exposure data were obtained from the U.S. EPA and CDC's National Environmental Public Health Tracking Network (2003–2005) (U.S. EPA 2014). The U.S. EPA provided hierarchical Bayesian model (HBM) output from 2001 to 2008 for two air pollutants, PM2.5 and O3, at spatial resolutions of 12 km × 12 km and 36 km × 36 km across the continental United States. Daily air pollution concentrations for each grid cell were also included. Compared with the widely used air monitoring data from the U.S. EPA's Air Quality System (AQS), the HBM data provide pollutant values at unobserved locations across the entire spatial field of interest. The U.S. EPA used two advanced methods, the Community Multiscale Air Quality (CMAQ) model and the HBM (McMillan et al. 2010), to produce the interpolated concentrations of air pollutants in space and time. The HBM approach combines the AQS monitoring data with CMAQ modeled data, which include emission, meteorology, and chemical modeling components, to predict air quality data for a specific time and spatial scale (McMillan et al. 2010). Given the limited and sparsely located air monitors in Florida, we used the 12-km grid output from the HBM data, which can account for the poor spatial coverage of air monitoring data.

Each mother's geocoded residential address at the time of her child's birth was spatially linked to the corresponding grid of the HBM data. Exposures were calculated as daily concentrations averaged over each of the first two trimesters (trimester 1: 1–13 weeks; trimester 2: 14–26 weeks) and the full gestational period determined by gestational age and delivery date of each woman. Gestational age was determined mainly by ultrasound. When ultrasound data were not available, clinical examination or last menstrual period was used to estimate gestational age.
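The window-averaging step can be sketched as follows. This Python example is not the authors' procedure; it assumes the start of pregnancy is the delivery date minus gestational age in completed weeks, that windows are inclusive of both end dates, and that days with no concentration value are skipped. The function names and the `daily` mapping (date to concentration for the mother's grid cell) are hypothetical.

```python
from datetime import date, timedelta


def exposure_windows(delivery, gestational_weeks):
    """Return inclusive (start, end) date windows for each exposure period.

    Assumes pregnancy begins `gestational_weeks` weeks before delivery.
    """
    start = delivery - timedelta(weeks=gestational_weeks)

    def span(first_week, last_week):
        lo = start + timedelta(weeks=first_week - 1)
        hi = start + timedelta(weeks=last_week) - timedelta(days=1)
        return lo, min(hi, delivery)  # never run past the delivery date

    return {
        "trimester1": span(1, 13),   # weeks 1-13
        "trimester2": span(14, 26),  # weeks 14-26
        "full": (start, delivery),
    }


def mean_exposure(daily, window):
    """Average daily concentrations over an inclusive window, skipping missing days."""
    lo, hi = window
    values = []
    day = lo
    while day <= hi:
        if day in daily:
            values.append(daily[day])
        day += timedelta(days=1)
    return sum(values) / len(values) if values else None


windows = exposure_windows(date(2005, 1, 1), 40)
```

Each trimester window in this sketch covers 91 days, and the second window begins the day after the first one ends.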


Information on maternal characteristics such as age, race/ethnicity, marital status, pregnancy smoking status, season and year of conception, and prenatal care status was obtained directly from the birth records. Maternal age at delivery was categorized into six groups, with 5-year increments for women 20–40 years old, as well as two additional groups for < 20 and ≥ 40 years old. Race/ethnicity was categorized as non-Hispanic white, non-Hispanic black, Mexican American, Puerto Rican, Cuban American, Haitian American, and others. In addition, a dichotomous variable was used to indicate marital status. Maternal education was divided into three categories: < high school, high school or equivalent, and > high school. Pregnancy smoking status was categorized into three levels based on self-reported number of cigarettes smoked per day during pregnancy: nonsmokers, smokers with < 10 cigarettes/day, and smokers with ≥ 10 cigarettes/day. Season [warm (June–November) or cool (December–May)] and year (2003, 2004, or 2005) of conception were also treated as categorical variables. Prenatal care status was categorized into five groups: no care, began in first trimester, second trimester, or third trimester, as well as an additional group for subjects with missing values. Furthermore, we extracted census block group–level median household income from the 2000 Census and linked it to each woman. Household income was categorized into quartiles (< US$29,663, US$29,663–US$38,056, US$38,056–US$49,375, and ≥ US$49,375). We also obtained a cartographic boundary file for urban areas from the 2000 Census to determine the urbanization status (urban or rural) where each woman lived. No information was available on other risk factors for GDM such as maternal prepregnancy BMI, family history of type 2 diabetes, and low physical activity.
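As an illustration of the categorical coding described above, the following Python sketch bins maternal age and smoking status. It is an assumed reading of the text rather than the authors' coding: the four middle age groups are taken to be 20–24, 25–29, 30–34, and 35–39 years, and the category labels are hypothetical.

```python
def age_group(age_years):
    """Six groups: <20, four 5-year bands from 20 to 39, and >=40."""
    age = int(age_years)
    if age < 20:
        return "<20"
    if age >= 40:
        return ">=40"
    lower = 20 + 5 * ((age - 20) // 5)
    return f"{lower}-{lower + 4}"


def smoking_level(cigarettes_per_day):
    """Three levels based on self-reported cigarettes per day during pregnancy."""
    if cigarettes_per_day == 0:
        return "nonsmoker"
    return "<10/day" if cigarettes_per_day < 10 else ">=10/day"


def conception_season(month):
    """Warm season is June-November; cool season is December-May."""
    return "warm" if 6 <= month <= 11 else "cool"
```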

Statistical Analysis

We examined the distribution of categorical covariates and continuous exposures between women with GDM and those without GDM. Logistic regression models were used to investigate the association between exposure to air pollution during different trimesters of pregnancy and risks of GDM. Subjects with missing values of maternal age (n = 45), race/ethnicity (n = 6), education (n = 3,821), or marital status (n = 83) were excluded, leaving 13,943 women with GDM out of a total of 406,334 women with complete covariate data. PM2.5 and O3 were analyzed as continuous variables. Both an unadjusted model and an adjusted model controlling for maternal age, race/ethnicity, education, marital status, prenatal care, season and year of conception, urbanization, and median household income at census block group level were used. Odds ratios (ORs) and 95% confidence intervals (CIs) (per 5-μg/m3 increase in PM2.5 or per 5-ppb increase in O3) were reported for each pollutant during specific pregnancy periods. Co-pollutant logistic models were also implemented to evaluate potential confounding by co-pollutants.
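Reporting an OR per 5-unit increase amounts to rescaling the logistic regression coefficient. The Python sketch below assumes the model yields a coefficient and standard error per 1-unit increase in the pollutant and that a Wald interval is used; neither detail is stated in the text, and the function name and example inputs are hypothetical.

```python
import math


def odds_ratio_per_increment(beta, se, increment=5.0, z=1.96):
    """Convert a per-unit log-odds coefficient to an OR and 95% CI per `increment`.

    Assumes a Wald interval: exp(increment * (beta +/- z * se)).
    """
    point = math.exp(increment * beta)
    lower = math.exp(increment * (beta - z * se))
    upper = math.exp(increment * (beta + z * se))
    return point, lower, upper


# Hypothetical coefficient and standard error, for illustration only.
or_5, lo_5, hi_5 = odds_ratio_per_increment(beta=0.02, se=0.005)
print(f"OR per 5-unit increase: {or_5:.2f} (95% CI: {lo_5:.2f}, {hi_5:.2f})")
```

Equivalently, the exposure could be divided by 5 before fitting, in which case the fitted coefficient is already on the reported scale.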

Sensitivity Analyses. We conducted several sensitivity analyses to test the robustness of our results. First, to account for the potential bias created by using an indicator for missing data on prenatal care, we conducted multiple imputation for all missing data using chained equations (White et al. 2011). All covariates as well as exposure and outcome variables were included in the imputation process, and 50 imputed data sets were generated. Second, to account for the potential underdiagnosis of GDM, we assumed underreporting rates of 0.5% and 1.0% among women without GDM, and simulated data sets were generated by randomly reassigning 0.5% and 1.0% of subjects without GDM as GDM cases with 500 repeats using the Monte Carlo method. We then compared the results from the simulated data with our original results to check whether underdiagnosed cases might have influenced the observed effects. Third, to account for the potential misclassification of exposure, we performed two sets of sensitivity analyses. In the first set of capture-area analyses, only women living within 5 mi of any AQS monitor were included, and two separate analyses were conducted: one for all eligible women and one restricted to eligible women with nonmissing data for at least 75% of days. In the second set of analyses, we used interpolated 1-km × 1-km data for the exposure assessment. To create the 1-km × 1-km exposure field, we applied a bicubic spline to the 12-km × 12-km gridded HBM product and output on a 1-km × 1-km grid that included the original 12-km vertices. This approach provides finer resolution, but cannot reproduce sub–12-km concentration peaks or troughs. Fourth, we performed the analyses without adjusting for season of conception to account for the possibility that conception season may adjust away all seasonal influences on the variation in the pollutants such that only spatial differences were left, which might be much more easily confounded by socioeconomic status (SES)–related factors. We also performed the analyses after additionally adjusting for smoking during pregnancy. Finally, to account for potential overadjustment for urbanization due to its correlation with air pollutants, we performed analyses stratified by urban versus rural area. All statistical analyses were conducted using SAS V9.3 (SAS Institute Inc., Cary, NC, USA).
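The underdiagnosis simulation in the second sensitivity analysis can be sketched in Python. The authors worked in SAS, so this is an illustration of the idea rather than their implementation. It assumes outcomes are coded 0/1, that the number reassigned is the rounded share of non-cases, and that the same logistic model would then be refit on each simulated data set (not shown). All names are hypothetical.

```python
import random


def reassign_underdiagnosed(outcomes, rate, rng):
    """Copy a 0/1 outcome list and recode a random `rate` share of the 0s as 1."""
    simulated = list(outcomes)
    non_cases = [i for i, y in enumerate(simulated) if y == 0]
    n_flip = round(rate * len(non_cases))
    for i in rng.sample(non_cases, n_flip):
        simulated[i] = 1
    return simulated


def simulate(outcomes, rate, repeats=500, seed=0):
    """Generate `repeats` simulated outcome vectors with a seeded generator."""
    rng = random.Random(seed)
    return [reassign_underdiagnosed(outcomes, rate, rng) for _ in range(repeats)]


# Toy data: 100 cases and 1,000 non-cases; 1.0% of non-cases are reassigned.
toy_outcomes = [1] * 100 + [0] * 1000
toy_sims = simulate(toy_outcomes, rate=0.01, repeats=5)
```

Comparing the spread of estimates across the simulated data sets with the original estimate indicates how sensitive the association is to the assumed rate of missed diagnoses.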