Vitamin D, DNA Methylation, and Breast Cancer

Katie M. O'Brien; Dale P. Sandler; Zongli Xu; H. Karimi Kinyamu; Jack A. Taylor; Clarice R. Weinberg


Breast Cancer Res. 2018;20(70) 


Study Sample

The Sister Study is a prospective cohort study of 50,884 US women enrolled between 2003 and 2009.[40] At baseline, participants were 35–74 years old, had a sister who had been diagnosed with breast cancer, and had never had breast cancer themselves. Each completed a computer-assisted telephone interview, with in-home collection of anthropometric measurements and blood samples. Participants remain under active surveillance, with more than 90% responding to their most recent follow-up request through March 2015 (data release 4.1). When possible, we collected medical records for self-reported breast cancer cases (82%). Among those with medical records available, 99% of self-reported diagnoses were confirmed.

Participants for a DNA methylation substudy were previously sampled using a case-cohort design.[41,42] To minimize genetic variation due to racial heterogeneity, this sample was limited to non-Hispanic white women, including all such women who had available blood samples and a self-reported diagnosis of invasive breast cancer or ductal carcinoma in situ. The initial methylation sample included 1542 women who developed incident breast cancer between enrollment and March 2015, and a random sample of 1336 women drawn from the full cohort, 74 of whom developed breast cancer by March 2015.

The participants for our previous analysis of serum 25(OH)D and breast cancer[3] were selected to overlap with the case-cohort sample who had DNA methylation data. However, for the joint analysis of methylation and 25(OH)D, we excluded 429 participants who did not have 25(OH)D measured and 102 participants whose DNA methylation data raised quality control concerns (described below). The final sample comprised 1070 cases and 1277 subcohort members (46 of whom were also cases) with both DNA methylation and serum 25(OH)D data available. All women provided written informed consent, and the study was approved by the institutional review boards of the National Institute of Environmental Health Sciences and the Copernicus Group.

Serum 25(OH)D Assessment

Baseline serum was stored at −80 °C before being analyzed using liquid chromatography-mass spectrometry (LC/MS) at Heartland Assays, Inc. (Ames, IA). The three 25(OH)D metabolites—25(OH)D3, 25(OH)D2, and 3-epi-25(OH)D3—were assessed individually, but we summed their concentrations to estimate total 25(OH)D. We adjusted total 25(OH)D values for batch effects using a random effects model and for season of blood draw using LOESS regression. Further details are provided elsewhere.[3]

Methylation Analysis

We assessed DNA methylation at 485,512 CpGs (450 K HumanMethylation Beadchip; Illumina, Inc.) using whole blood samples collected from case-cohort participants. Briefly, we extracted 1 μg genomic DNA from whole blood and conducted bisulfite-conversion using the EZ DNA Methylation Kit (Zymo Research, Orange County, CA). Methylation analysis was carried out at the Center for Inherited Disease Research at Johns Hopkins University (Baltimore, MD). Data processing and quality control assessments were completed using the 'ENMIX' package (R v3.2.1),[43] and included correcting fluorescent dye-bias,[44] quantile normalization,[45] and reduction of background noise. We excluded 102 participants whose sample had > 5% low-quality methylation values, low average bisulfite intensity, or implausible methylation value distributions (final n = 1277 in subcohort and 1024 additional cases, as described above, plus 123 duplicate samples). We excluded CpGs if they were Illumina-designed single nucleotide polymorphism (SNP) probes, on the Y chromosome, had > 5% low-quality data, were within 2 base pairs of a common SNP, or had multimodal distributions. This left us with 423,500 CpGs. For each site, we calculated a β value based on each individual's proportion of unmethylated (U) and methylated (M) sites at a given locus: β = M/(U + M + 100).
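
The β formula above, together with the logit transformation used later in the statistical analysis, can be sketched as follows. This is a minimal illustration only: the function names and the example values for M and U are hypothetical, and the use of the natural logarithm for the logit is an assumption.

```python
import math


def beta_value(methylated, unmethylated, offset=100.0):
    """Return beta = M / (U + M + offset), the formula given in the text."""
    return methylated / (unmethylated + methylated + offset)


def logit(beta):
    """Natural-log logit of a beta value strictly between 0 and 1 (assumed base)."""
    return math.log(beta / (1.0 - beta))


# Hypothetical M and U values for one CpG in one participant.
m_value, u_value = 3000.0, 900.0
b = beta_value(m_value, u_value)
print(b, logit(b))
```

The offset of 100 keeps β defined and stable when both M and U are small.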

As interperson variability can be low at some CpGs, we conducted additional screening to better ensure the reliability of our results. We calculated intraclass correlation coefficients (ICCs) to compare the technical variation (within-subject variability, assessed using duplicate samples) to the biologic variation (between-subject variability).[46] For approximately 66% of CpGs, the ICC was less than 0.5, suggesting that interindividual variability is small relative to technical variation and that some of the corresponding observed associations may not reflect true biologic differences. We have flagged these CpGs in our results.
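
To illustrate how an ICC from duplicate samples separates technical from biologic variation, the sketch below computes a one-way random-effects ICC for a single CpG. The choice of estimator is an assumption, since the text does not specify it beyond the citation, and the function name and data are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for duplicate measurements.

    `pairs` is a list of (first, second) methylation values, one pair per
    participant with a duplicate sample. Assumes exactly two replicates each.
    """
    n = len(pairs)
    k = 2  # replicates per participant
    subject_means = [(a + b) / 2.0 for a, b in pairs]
    grand_mean = sum(subject_means) / n
    # Between-subject mean square (biologic variation plus technical noise).
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (technical variation between duplicates).
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical duplicates: participants differ, replicates agree closely.
reliable_cpg = [(0.10, 0.11), (0.50, 0.49), (0.90, 0.91)]
print(icc_oneway(reliable_cpg) >= 0.5)
```

Under this estimator, a CpG whose duplicates disagree as much as different participants do yields an ICC near or below zero and would be flagged by a 0.5 threshold.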

Candidate Gene Selection

Candidate genes included VDR and RXRA, as well as the vitamin D binding protein gene (GC), and genes directly involved in vitamin D metabolism (DHCR7/NADSYN1, CYP24A1, CYP27B1, and CYP2R1). We selected any CpGs included on the 450 K HumanMethylation Beadchip (Illumina, Inc.) located within 2000 base pairs from the candidate gene's transcription start and end sites, as defined by University of California Santa Cruz Genome Browser (GRCh37/hg19; RefSeq notation).[47] We identified 198 eligible CpGs.
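
The positional rule for selecting candidate CpGs can be sketched as a simple interval test. The gene record, CpG names, and coordinates below are invented for illustration and do not correspond to real annotation; treating the 2000 base pair boundary as inclusive is also an assumption.

```python
def in_gene_window(cpg_pos, tx_start, tx_end, flank=2000):
    """True if a CpG lies between (tx_start - flank) and (tx_end + flank)."""
    return (tx_start - flank) <= cpg_pos <= (tx_end + flank)


# Hypothetical gene and CpG records: name -> (chromosome, position).
gene = {"chrom": "chr12", "tx_start": 50000, "tx_end": 60000}
cpgs = {
    "cpg_a": ("chr12", 48500),  # 1500 bp upstream of the start: kept
    "cpg_b": ("chr12", 47999),  # 2001 bp upstream of the start: dropped
    "cpg_c": ("chr12", 61999),  # 1999 bp past the end: kept
    "cpg_d": ("chr1", 55000),   # different chromosome: dropped
}

selected = sorted(
    name
    for name, (chrom, pos) in cpgs.items()
    if chrom == gene["chrom"]
    and in_gene_window(pos, gene["tx_start"], gene["tx_end"])
)
print(selected)
```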

Statistical Analysis

25(OH)D and methylation of vitamin D-related genes in the subcohort. We assessed the relationship between serum 25(OH)D (continuous, ng/mL) and methylation (continuous, measured as the logit of β) at each of 198 CpGs in or near vitamin D-related genes using robust linear regression with M-estimation. This analysis was limited to the 1270 individuals in the subcohort who had complete information for the following covariates: age at blood draw (continuous), BMI (continuous; kg/m2), current smoking status (dichotomous), and alcohol use (never/former drinker, current drinker < 1 drink/day, or current drinker ≥ 1 drink/day). In addition to these covariates, we also adjusted for cell type proportions (CD8 T cells, CD4 T cells, natural killer cells, B cells, monocytes, and granulocytes versus other).[48]
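
As a rough illustration of robust linear regression with M-estimation, the sketch below fits a single-predictor line by iteratively reweighted least squares with Huber weights. It is a simplification under stated assumptions: the weight function, the tuning constant 1.345, and the MAD-based scale are common defaults rather than details given in the text, the covariates are omitted, and all names and data are hypothetical.

```python
import statistics


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_line(x, y, c=1.345, n_iter=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    w = [1.0] * len(x)
    for _ in range(n_iter):
        intercept, slope = weighted_line(x, y, w)
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: median absolute deviation / 0.6745.
        med = statistics.median(resid)
        scale = statistics.median(abs(r - med) for r in resid) / 0.6745
        if scale == 0:
            break  # residuals are degenerate; keep the current fit
        # Huber weights: 1 inside the threshold, shrinking outside it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
    return intercept, slope


# Hypothetical data: an exact line with one gross outlier at the end.
x_vals = [float(i) for i in range(10)]
y_vals = [2.0 + 0.5 * xi for xi in x_vals]
y_vals[-1] = 50.0
ols_slope = weighted_line(x_vals, y_vals, [1.0] * 10)[1]
robust_slope = huber_line(x_vals, y_vals)[1]
print(ols_slope, robust_slope)
```

The outlier receives a weight below 1 at each iteration, so it pulls the robust slope less than it pulls the ordinary least-squares slope.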

25(OH)D-methylation interaction and breast cancer risk in the case-cohort. Next, we used the case-cohort sample to examine whether interactions between serum 25(OH)D and methylation of vitamin D-related genes were related to breast cancer incidence. This included an assessment of the relationship between methylation at each of the CpG sites in or near vitamin D-related genes and risk of breast cancer. For both sets of analyses, we used Cox proportional hazards models to account for the case-cohort design.[41,42] We adjusted for age at blood draw, BMI, smoking status, alcohol use, and cell type proportions, as well as education, current hormone therapy use and type, current hormonal birth control use, menopausal status, usual physical activity, history of osteoporosis, parity, and a BMI-menopausal status interaction term. For these candidate CpG locus analyses, we considered p < 0.05 to be statistically significant.

For the interaction analysis, the effect measures of interest were ratios of hazard ratios (RHRs). Here, the numerator of the RHR is the hazard ratio (HR) for the association between methylation (measured as 0.1 increments of logit(β)) and breast cancer among those with 25(OH)D levels > 38.0 ng/mL, and the denominator of the RHR is the HR for the association between methylation and breast cancer among those with 25(OH)D levels ≤ 38.0 ng/mL. Therefore, RHR values > 1.00 correspond to a higher estimated HR for the methylation-breast cancer association among those with 25(OH)D levels > 38.0 ng/mL, and values < 1.00 correspond to a higher estimated methylation-breast cancer HR among those with 25(OH)D levels ≤ 38.0 ng/mL. The 25(OH)D cut-point was selected based on previous evidence that 38.0 ng/mL is relevant for predicting breast cancer risk.[3] These models also included all of the baseline covariates listed above for the methylation-breast cancer association analysis.
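
The RHR definition can be made concrete with a small calculation. The sketch assumes one common parameterization, a Cox model with a methylation term, a 25(OH)D indicator (> 38.0 ng/mL), and their product, in which the RHR equals the exponentiated interaction coefficient. That parameterization is an assumption, and the coefficient values are invented.

```python
import math


def hazard_ratios(b_meth, b_interaction, increment=0.1):
    """Return (HR_low, HR_high, RHR) per `increment` of logit(beta).

    b_meth:        log-hazard coefficient for methylation when 25(OH)D <= 38.0
    b_interaction: coefficient for methylation x [25(OH)D > 38.0]
    """
    hr_low = math.exp(increment * b_meth)
    hr_high = math.exp(increment * (b_meth + b_interaction))
    return hr_low, hr_high, hr_high / hr_low


# Hypothetical coefficients, per unit of logit(beta).
hr_low, hr_high, rhr = hazard_ratios(b_meth=0.20, b_interaction=0.50)
print(hr_low, hr_high, rhr)
```

Because the interaction coefficient here is positive, the RHR exceeds 1.00, matching the interpretation given above for a stronger methylation-breast cancer association in the higher 25(OH)D group.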

Epigenome-wide association study of 25(OH)D in subcohort or cases. We examined the association between serum 25(OH)D and DNA methylation in the subcohort for all 423,500 CpGs from the 450 K panel that passed quality control checks. Here, we corrected for multiple comparisons by calculating false discovery rate q values[49] and considered CpGs with q < 0.05 likely to be true positives.
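
For readers unfamiliar with false discovery rate q values, the sketch below computes Benjamini-Hochberg adjusted p values, one widely used definition. Whether the cited method matches this procedure exactly is an assumption, and the function name and p values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical p values for four CpGs.
p_example = [0.01, 0.04, 0.03, 0.5]
q_example = bh_qvalues(p_example)
flagged = [i for i, q in enumerate(q_example) if q < 0.05]
print(q_example, flagged)
```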

We next assessed the relationship between 25(OH)D and DNA methylation in an independent sample of participants who developed breast cancer within 5 years of enrollment, were not part of the subcohort, and had the required covariate information ("cases"; n = 1024). Here, our goal was to identify CpGs where the 25(OH)D-methylation association differed by future breast cancer status. We compared the subcohort and case results by plotting the −log10(p) values, signed by the direction of each tested association. We then calculated critical values for a test of the combined p values based on Fisher's method.[50] CpGs that had combined p values below identified thresholds were included in additional interaction analyses using the methods described above.
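
The two quantities in this comparison, a signed −log10(p) value and a Fisher combined p value, can be sketched as below. This shows only the standard Fisher combination of two independent p values, for which the chi-square tail probability with 4 degrees of freedom has a closed form. It is not necessarily the exact procedure used to derive the critical values, and the function names and inputs are hypothetical.

```python
import math


def signed_neglog10(p, estimate):
    """-log10(p) carrying the sign of the association estimate."""
    return math.copysign(-math.log10(p), estimate)


def fisher_combined_p(p1, p2):
    """Fisher's method for two independent p values.

    X = -2 * (ln p1 + ln p2) follows a chi-square distribution with 4 degrees
    of freedom under the null; its upper tail is exp(-X/2) * (1 + X/2).
    """
    x = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


# Hypothetical results for one CpG: subcohort and case analyses.
print(signed_neglog10(0.01, -0.3), signed_neglog10(0.02, 0.4))
print(fisher_combined_p(0.05, 0.05))
```

Plotting the signed values for the subcohort against those for the cases places CpGs with opposite-direction associations in the off-diagonal quadrants.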