Fruit and Vegetable Consumption and Risk of Endometriosis

H. R. Harris; A. C. Eke; J. E. Chavarro; S. A. Missmer


Hum Reprod. 2018;33(4):715-727. 

Materials and Methods

Study Population

The Nurses' Health Study II (NHS II) is an ongoing prospective cohort that was established in 1989 when 116 429 female registered nurses, aged 25–42 years, completed a baseline questionnaire that collected information on demographic and lifestyle factors, anthropometric variables and disease history. Follow-up questionnaires are sent biennially to participants with questions updating the information on incident disease risk factors. Further details on the study have been provided elsewhere (Solomon et al., 1997).

Follow-up for the current analyses began in 1991, when 97 527 NHS II participants returned the dietary assessment, and concluded in 2013. We excluded participants who had an implausible total energy intake (<800 or >4200 kcal/day) or left more than 70 food items blank on the 1991 food frequency questionnaire (FFQ) (n = 2356). Participants were also excluded if they reported a diagnosis of endometriosis (n = 5442), history of infertility (n = 10 975), or a cancer diagnosis (other than non-melanoma skin cancer) (n = 1221) prior to June 1991. The analytical cohort was limited to women who were premenopausal and had intact uteri as endometriosis rarely occurs incidentally among postmenopausal women or subsequent to a hysterectomy. After these exclusions, 70 835 premenopausal women with dietary information remained.
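The exclusion rules above can be read as a single eligibility check per participant. The following is only a minimal sketch in Python (the study itself used SAS); the `Participant` record and its field names are hypothetical and chosen for illustration, while the thresholds are those stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # All field names are hypothetical; they are not NHS II variable names.
    energy_kcal: float         # total energy intake from the 1991 FFQ
    blank_items: int           # number of FFQ food items left blank
    prior_endometriosis: bool  # diagnosis reported before June 1991
    prior_infertility: bool    # history of infertility before June 1991
    prior_cancer: bool         # cancer other than non-melanoma skin cancer
    premenopausal: bool
    intact_uterus: bool


def is_eligible(p: Participant) -> bool:
    """Apply the baseline exclusions described in the text."""
    plausible_energy = 800 <= p.energy_kcal <= 4200  # excluded if <800 or >4200
    complete_ffq = p.blank_items <= 70               # excluded if more than 70 blank
    no_prior_dx = not (p.prior_endometriosis or p.prior_infertility or p.prior_cancer)
    return (plausible_energy and complete_ffq and no_prior_dx
            and p.premenopausal and p.intact_uterus)
```

Applied to every record, a filter such as `[p for p in cohort if is_eligible(p)]` would give the analytical cohort, where `cohort` is a hypothetical list of `Participant` records.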

Ethical Approval

This study was approved by the Institutional Review Boards of the Harvard School of Public Health and Brigham and Women's Hospital, Boston, MA, USA. Implied consent was assumed upon completion and return of the questionnaires.

Dietary Assessment

Diet was assessed in 1991, 1995, 1999, 2003, 2007 and 2011 using an FFQ listing over 130 food items. Participants were asked how often, on average, they had consumed each type of food or beverage during the previous year. Nine responses were possible, ranging from never or less than once per month to six or more times per day. Participants were also asked to report whether they used other nutrient supplements, and to provide the brand and dose. Intakes of the nutrients of interest were calculated by multiplying the portion size of a single serving of each food by its reported frequency of intake, then multiplying the total amount consumed by the nutrient content of the food, and summing the nutrient contributions of all food items using the US Department of Agriculture food composition data (Nutrient Data Laboratory ARS, 1999), while also taking dietary supplements into account.
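The nutrient calculation described above amounts to a sum of products over food items. A minimal sketch follows, assuming that each FFQ frequency response has already been converted to servings per day (the conversion weights are not given in the text) and that supplement use is expressed as a daily amount; the function name and the numbers in the usage note are illustrative only.

```python
def daily_nutrient_intake(foods, supplement_per_day=0.0):
    """Sum nutrient contributions across FFQ food items.

    foods: iterable of (portion_g, servings_per_day, nutrient_per_g) tuples.
    supplement_per_day: nutrient amount from supplements, same units.
    """
    total = 0.0
    for portion_g, servings_per_day, nutrient_per_g in foods:
        amount_consumed_g = portion_g * servings_per_day  # portion size x frequency
        total += amount_consumed_g * nutrient_per_g       # x nutrient content
    return total + supplement_per_day
```

For example, with the made-up items `[(100.0, 1.0, 0.5), (50.0, 0.5, 2.0)]` and a supplement contribution of 25.0, the function returns 125.0 (50 from each food plus the supplement).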

The reproducibility and validity of the FFQ have been reported elsewhere (Salvini et al., 1989; Willett et al., 1985; Yuan et al., 2017). The FFQ has been shown to provide valid estimates of fruit, vegetable, and nutrient intake, with deattenuated correlation coefficients for fruits and vegetables between the FFQ and 1-week diet records ranging from 0.16 for yellow squash to 0.80 for apples. The coefficients for most fruits and vegetables were above 0.40 (Salvini et al., 1989). Retinol activity equivalents (RAE), alpha-carotene, beta-carotene, lutein/zeaxanthin, lycopene and beta-cryptoxanthin had deattenuated correlation coefficients ranging from 0.57 to 0.72 (Yuan et al., 2017). Carotenoid intake has also been validated using plasma carotenoid levels with reported correlations between dietary intake and blood levels of 0.27 for beta-carotene, 0.48 for alpha-carotene, 0.32 for beta-cryptoxanthin, 0.21 for lycopene and 0.27 for lutein (Michaud et al., 1998). Intakes of all nutrients were adjusted for total energy intake using the residual method (Willett, 2013).
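As an illustration of the residual method of energy adjustment, the sketch below regresses nutrient intake on total energy intake and adds each participant's residual to the intake predicted at the mean energy intake. It assumes a simple linear regression on untransformed values; whether the study transformed intakes first is not stated here, and the function name is hypothetical.

```python
from statistics import mean


def energy_adjust(nutrient, energy):
    """Return energy-adjusted nutrient intakes via the residual method."""
    n_bar, e_bar = mean(nutrient), mean(energy)
    sxx = sum((e - e_bar) ** 2 for e in energy)
    sxy = sum((e - e_bar) * (n - n_bar) for e, n in zip(energy, nutrient))
    slope = sxy / sxx                    # least-squares slope of nutrient on energy
    intercept = n_bar - slope * e_bar
    expected_at_mean = intercept + slope * e_bar  # predicted intake at mean energy
    # residual (observed minus predicted) plus the constant at mean energy
    return [n - (intercept + slope * e) + expected_at_mean
            for n, e in zip(nutrient, energy)]
```

By construction the adjusted values keep the original mean nutrient intake and are uncorrelated with total energy intake, which is the property the adjustment is meant to provide.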

Ascertainment and Definition of Endometriosis

Starting in 1993, participants were asked on each biennial questionnaire if they had 'ever had physician-diagnosed endometriosis', and, if so, the date of diagnosis and whether it had been confirmed by laparoscopy. The validity of self-reported endometriosis in this cohort has been described previously (Missmer et al., 2004). Briefly, a diagnosis of endometriosis was confirmed by medical records in 96% of those who reported laparoscopic confirmation. However, a review of the medical records of those without laparoscopic confirmation indicated a clinical diagnosis of endometriosis in only 54%. In addition, a diagnosis of endometriosis at the time of hysterectomy was confirmed in 80% of the cases, but endometriosis was the primary indication for hysterectomy in only 6% of those for whom an indication was available. Therefore, in order to minimize the magnitude of misclassification and prevent confounding by indication for hysterectomy, we restricted our definition of incident diagnosis of endometriosis to women who reported laparoscopic confirmation of their diagnosis.

Due to the complex relation between endometriosis and infertility within this restricted case definition, we examined risk factors by two 'subtypes' of endometriosis: women who never reported infertility (those with no past or concurrent infertility), and women with concurrent infertility. At baseline, the prevalence of infertility (defined as attempting to become pregnant for >1 year without success) was greater among women with laparoscopic confirmation (20%) than among those who were clinically diagnosed without laparoscopic confirmation (4%). This may result in the over-sampling of those with otherwise 'asymptomatic' endometriosis and also those who may have altered their diet due to infertility prior to enrollment. While pelvic pain information is not available in the NHS II, endometriosis case women with infertility will have a greater prevalence of being asymptomatic in terms of pelvic pain compared to those who never experienced infertility, because during this time period most underwent an exploratory laparoscopy to identify the cause of their infertility during which the endometriosis was discovered. Because endometriosis with infertility may have a higher prevalence of asymptomatic disease secondary to other primary causes of infertility, the etiology, and thus risk factors, for endometriosis with infertility could differ from those for endometriosis without concurrent infertility.

Statistical Analysis

Participants contributed follow-up time from the return of the 1991 questionnaire until self-report of laparoscopically confirmed endometriosis diagnosis, diagnosis of any cancer (except non-melanoma skin cancer), death, loss to follow-up, hysterectomy, menopause or until return of the 2013 questionnaire, whichever occurred first. In addition, women were censored at time of self-report of infertility, because infertility in this population is strongly correlated with diagnosis of endometriosis via laparoscopy. Therefore, the person-time denominator for the incidence rate consists of women with neither diagnosed endometriosis nor infertility.
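The censoring rule above can be sketched as taking the earliest of the listed events, with the 2013 questionnaire return as the administrative end. This is a hypothetical illustration: the event names, the use of decimal years as the time unit, and the function name are assumptions, not the study's data structures.

```python
def follow_up_end(event_times, study_end):
    """Return (end_time, reason) for one participant.

    event_times: dict mapping an event name (e.g. "endometriosis",
    "infertility", "hysterectomy", "menopause", "cancer", "death",
    "lost") to the time it occurred, or None if it never occurred.
    """
    observed = {name: t for name, t in event_times.items()
                if t is not None and t <= study_end}
    if not observed:
        return study_end, "end_of_follow_up"
    first = min(observed, key=observed.get)  # earliest event wins
    return observed[first], first
```

A participant counts as a case only when the returned reason is `"endometriosis"`; any other reason ends her person-time without an event, so a woman who reports infertility in 1996 and endometriosis in 2001 is censored in 1996.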

We used Cox proportional hazards regression models with age and questionnaire period as the time scale to estimate incidence rate ratios (RR) and 95% CIs, using the lowest category of each food or nutrient intake as the reference. We examined the possibly non-linear relation between intake of selected fruit and vegetable groups and endometriosis with restricted cubic splines (Durrleman and Simon, 1989). In addition, as the temporal relation between these foods/nutrients and risk of endometriosis is uncertain, dietary intake was examined multiple ways: baseline intake (1991 FFQ), varying lag-time intake and cumulative average intake. The cumulative average method captures long-term dietary intake and reduces measurement error due to within-person variation over time (Hu et al., 1999). The varying lag-time intake allows us to examine dietary intake closer to endometriosis onset as there is often a lengthy delay between the emergence of clinical symptoms and definitive diagnosis. We examined lag times of 2–4 (simple update), 4–6 and 6–8 years. For example, for a lag time of 2–4 years before diagnosis we used dietary intake from the 1991 questionnaire for an endometriosis diagnosis reported from June 1991 to June 1995, intake from 1995 for diagnosis from 1995 to 1999, and so forth. For a lag time of 4–6 years before diagnosis, we used dietary intake reported on the 1991 questionnaire for diagnosis from 1995 to 1999, intake from 1995 for diagnosis from 1999 to 2003, and so forth. For a lag time of 6–8 years we used dietary intake from 1991 for a diagnosis from 1997 to 2001, and intake from 1995 for follow-up from 2001 to 2005. The cumulative average method was used in all analyses except when comparing analytic approaches.
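The lag-time and cumulative average exposure definitions can be sketched as lookups keyed on the start year of each follow-up period. The offsets of 0, 4 and 6 years below are inferred from the worked examples in the text rather than stated by the authors, and the function names are hypothetical.

```python
from bisect import bisect_right

FFQ_YEARS = [1991, 1995, 1999, 2003, 2007, 2011]
# Years subtracted from the period start before picking the latest FFQ.
# Inferred from the examples in the text; an assumption, not a stated rule.
LAG_OFFSETS = {"2-4": 0, "4-6": 4, "6-8": 6}


def lagged_ffq_year(period_start, lag):
    """Latest FFQ year at or before (period_start - offset), or None."""
    cutoff = period_start - LAG_OFFSETS[lag]
    idx = bisect_right(FFQ_YEARS, cutoff)
    return FFQ_YEARS[idx - 1] if idx else None


def cumulative_average(intakes_by_year, period_start):
    """Mean of all non-missing FFQ intakes reported up to period_start."""
    values = [v for year, v in sorted(intakes_by_year.items())
              if year <= period_start and v is not None]
    return sum(values) / len(values) if values else None
```

Under these assumptions, `lagged_ffq_year(1997, "6-8")` selects the 1991 FFQ and `lagged_ffq_year(2001, "6-8")` selects the 1995 FFQ, matching the examples in the text, while `cumulative_average({1991: 2.0, 1995: 4.0, 1999: 6.0}, 1995)` averages only the first two reports.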

Total caloric intake was included in both age-adjusted and multivariable models (Willett, 2013). Multivariable models were further adjusted for the following potential confounders: age at menarche, parity, length of menstrual cycle and BMI, as these factors have previously been associated with endometriosis risk. Covariates were updated throughout the analysis whenever new information was available from the biennial questionnaires. Missing data were handled via the missing indicator method, with categories created for missing data included in the regression model (Miettinen, 1985). Income and marital status were also evaluated as potential confounders but did not materially influence the effect estimates and so were not included in the final models. Tests for linear trend for the exposures of interest were performed by assigning the median value of each category to all participants in that group. Tests for heterogeneity comparing the effect estimates among endometriosis case groups were calculated with a Wald statistic for each food/nutrient group.
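The trend-test scoring described above replaces each participant's intake with the median intake of her category, and that score is then entered in the model as a single continuous term. A minimal sketch of the scoring step, with a hypothetical function name and made-up values in the usage note:

```python
from collections import defaultdict
from statistics import median


def trend_scores(intakes, categories):
    """Assign each participant the median intake of her intake category."""
    by_category = defaultdict(list)
    for intake, category in zip(intakes, categories):
        by_category[category].append(intake)
    medians = {c: median(values) for c, values in by_category.items()}
    return [medians[c] for c in categories]
```

For intakes `[1, 2, 3, 10, 20, 30]` split into two categories `[0, 0, 0, 1, 1, 1]`, the scores are `[2, 2, 2, 20, 20, 20]`. The model fitting itself is not shown here.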

We assessed the association for each food/nutrient group by smoking status, as previous studies have suggested that smoking may modify the effect of antioxidant intake, and smoking has been observed to modify the effect of fat intake on endometriosis risk within this cohort (Missmer et al., 2010). Participants were classified as ever or never smokers based on biennial questionnaire data. While smoking status data abstracted from medical records are often quite poor, self-reported smoking status, including among NHS participants, has been demonstrated previously to be highly reliable and valid (Al-Delaimy et al., 2002; Patrick et al., 1994). Effect modification was assessed with a likelihood ratio test that compared the model with the cross-product term between the exposure variable and smoking status with the model with main effects only. All tests of statistical significance were two-sided and all statistical analyses were performed using SAS Version 9.4 (SAS Institute Inc, Cary, NC, USA).
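The likelihood ratio test for effect modification compares the maximized log-likelihoods of the two nested models. The sketch below assumes a single cross-product term (1 degree of freedom), for which the chi-square tail probability has a closed form via the complementary error function; an interaction with a categorical exposure would add more terms and need a general chi-square routine instead. The function name and inputs are hypothetical.

```python
from math import erfc, sqrt


def lr_test_1df(loglik_main_effects, loglik_with_interaction):
    """Likelihood ratio statistic and p-value for one added parameter."""
    # Twice the gain in log-likelihood; clamped at 0 against rounding noise.
    stat = max(0.0, 2.0 * (loglik_with_interaction - loglik_main_effects))
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value
```

A statistic of about 3.84 corresponds to a p-value of 0.05 under this 1-df assumption.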