Effect of the Combined Oral Contraceptive Pill and/or Metformin in the Management of Polycystic Ovary Syndrome

A Systematic Review With Meta-Analyses

Helena Teede; Eliza C. Tassone; Terhi Piltonen; Jaideep Malhotra; Ben W. Mol; Alexia Peña; Selma F. Witchel; Anju Joham; Veryan McAllister; Daniela Romualdi; Mala Thondan; Michael Costello; Marie L. Misso


Clin Endocrinol. 2019;91(4):479-489. 


This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement checklist[15] and was prepared to inform clinical practice recommendations in the updated and expanded evidence-based guideline for the assessment and management of PCOS.[7–10] The rigorous methodology used for the development of the guideline is aligned with the National Health and Medical Research Council (NHMRC),[16] ESHRE[17] and GRADE methods[18] and has been described in detail in the resulting final guideline.[7]

Seven hundred clinical and academic opinion leaders and consumers worldwide participated in a five-round Delphi exercise to identify and prioritize key clinical questions in PCOS care.[14]

Here, we present the evidence for two clinical questions which were included in the international guideline:[7]

  1. Is the oral contraceptive pill alone or in combination effective for management of hormonal and clinical PCOS features in adolescents and adults with PCOS?

  2. Is metformin alone or in combination effective for management of hormonal and clinical PCOS features and weight in adolescents and adults with PCOS?

For these two questions, comparisons (as determined through the Delphi question development and prioritization process) and "in combination" include any of the following: lifestyle, anti-androgens or anti-obesity agents.

Selection Criteria

The Population, Intervention, Comparison, Outcome (PICO) framework was used to guide the selection criteria for each clinical question, and these were developed a priori by the multi-disciplinary guideline development group (see Appendix S1). Briefly, the population of interest was all females with PCOS (diagnosed by Rotterdam, NIH or AES) of any ethnicity, weight and age, incorporating a subgroup of adolescents (10-19 years); eligible interventions included oral contraceptive pill alone or in combination with metformin, lifestyle, anti-androgens and anti-obesity agents to address the first question, and for the second question, metformin alone or in combination with lifestyle, OCP, anti-androgens and anti-obesity agents. Eligible comparisons included placebo or any other eligible intervention (listed above) or combinations of those. Using the GRADE prescribed method and scale,[18] outcomes were prioritized as follows: critical for making a decision about the intervention (score 7-9), important (score 4-6) or of limited importance for making a decision (score 1-3; see Appendix S1). Briefly, the outcomes deemed critically important included irregular cycles, insulin resistance, weight, BMI, thromboembolic events and gastrointestinal effects.
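The GRADE importance bands described above can be written as a small lookup. This is a minimal sketch only: the function name and the example outcomes and scores are hypothetical, and the ratings actually assigned by the guideline development group are those in Appendix S1.

```python
def grade_importance(score: int) -> str:
    """Map a 1-9 GRADE importance score to the band named in the text."""
    if not 1 <= score <= 9:
        raise ValueError("GRADE importance scores run from 1 to 9")
    if score >= 7:
        return "critical"            # score 7-9
    if score >= 4:
        return "important"           # score 4-6
    return "limited importance"      # score 1-3


# Hypothetical scores for illustration only; not the group's actual ratings.
example_scores = {"outcome A": 8, "outcome B": 5, "outcome C": 2}
for outcome, score in example_scores.items():
    print(f"{outcome}: score {score} -> {grade_importance(score)}")
```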

Systematic Search for Evidence

A systematic search strategy was designed to identify the best available evidence to answer a suite of clinical questions developed by the team for the section of the guideline regarding medical treatment options for the features of PCOS (additional clinical questions for which the search applies can be found in the technical report for the guideline[19]). Consequently, the PRISMA flow diagram represents the search results across the suite of clinical questions included in the medical treatments section of the guideline, which was broader than the topics reported in this manuscript. A broad-ranging systematic search string for terms related to PCOS was developed to retrieve articles addressing women with PCOS of all cultural, geographical and socio-economic backgrounds, settings and life stages. This PCOS search string was combined with search terms relevant to the medical therapies outlined in the clinical questions and corresponding PICO.[19] The search strategy was limited to English-language systematic reviews and randomized controlled trials, and there were no limits on year of publication (see Appendix S1). Crossover RCT data were included if the study conducted a washout period of ≥8 weeks. For the first clinical question, although the search itself was not restricted by year, the studies reviewed were limited to those published in the last 20 years, based on consensus decision, given changes in doses and formulations over time.
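The eligibility limits stated in this paragraph can be expressed as a simple filter. The sketch below is illustrative and is not the reviewers' actual screening tool: the `Record` fields and design labels are hypothetical, and it assumes that "the last 20 years" is counted back from the 2017 search year, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional

SEARCH_YEAR = 2017        # databases were searched in January 2017
MIN_WASHOUT_WEEKS = 8     # crossover RCTs needed a washout of >= 8 weeks
Q1_WINDOW_YEARS = 20      # applies to the first clinical question only


@dataclass
class Record:
    design: str                          # hypothetical labels: "SR", "RCT", "crossover RCT"
    language: str
    year: int
    washout_weeks: Optional[float] = None


def is_eligible(rec: Record, question: int) -> bool:
    """Apply the language, design, washout and year limits described in the text."""
    if rec.language.lower() != "english":
        return False
    if rec.design not in {"SR", "RCT", "crossover RCT"}:
        return False
    if rec.design == "crossover RCT":
        # An unreported washout is treated as not meeting the criterion (assumption).
        if rec.washout_weeks is None or rec.washout_weeks < MIN_WASHOUT_WEEKS:
            return False
    # Assumption: the 20-year window is measured back from the search year.
    if question == 1 and rec.year < SEARCH_YEAR - Q1_WINDOW_YEARS:
        return False
    return True
```

For example, `is_eligible(Record("crossover RCT", "English", 2010, washout_weeks=4), question=2)` would return `False` because the washout is shorter than 8 weeks.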


The following electronic databases were searched on 11 January 2017:

  • Ovid MEDLINE(R) 1946 to Present with Daily Update

  • Ovid MEDLINE(R) In-Process & Other Non-Indexed Citations <January 10, 2017>

  • Ovid MEDLINE(R) Epub Ahead of Print <January 10, 2017>

  • Embase Classic + Embase <1947 to 2017 January 10>

  • EBM Reviews, incorporating:

         ∘Cochrane Database of Systematic Reviews <2005 to January 10, 2017>

         ∘ACP Journal Club <1991 to December 2016>

         ∘Database of Abstracts of Reviews of Effects <1st Quarter 2015>

         ∘Cochrane Central Register of Controlled Trials <November 2016>

         ∘Cochrane Methodology Register <3rd Quarter 2012>

         ∘Health Technology Assessment <4th Quarter 2016>

         ∘NHS Economic Evaluation Database <1st Quarter 2015>

  • PsycINFO <1806 to January Week 1 2017>


Inclusion of Evidence

To determine the literature to be assessed further, a reviewer (MM) scanned the titles, abstract sections and keywords of every record retrieved by the search strategy using the selection criteria described in the technical report of the guideline.[19] Full articles were retrieved for further assessment if the information given suggested that the study met the inclusion criteria. Studies were selected and appraised by a reviewer (MM) in consultation with a second author (HT), using the selection criteria established a priori. Where there was any doubt regarding these criteria from the information given in the title and abstract, the full article was retrieved for clarification.

Where existing systematic reviews were identified, the most current (within 5 years), comprehensive (with the most outcomes relevant to PICO) and high-quality systematic review that met the inclusion criteria was used. Additional systematic reviews that met benchmark criteria and PICO were used if they reported additional outcomes relevant to the PICO that were not addressed in the first, most comprehensive systematic review. Additional RCT(s) that met the PICO and were not included in the existing systematic review(s) were also used.

Assessment of Methodological Quality

Methodological quality, in terms of risk of bias, of each of the included studies was assessed independently by two of four reviewers (ECT, EB, AW and MM) using criteria developed a priori[20] for systematic reviews and RCTs. Individual quality items were investigated using a descriptive component approach that assessed selection bias, reporting bias, performance bias, potential confounding, attrition bias and appropriateness of the statistical analysis. Any disagreement or uncertainty was resolved by discussion among the evidence reviewers to reach a consensus. Using this approach, each study was allocated a risk of bias rating of low, moderate or high. Where there was more than one published article describing a study, all articles were used to complete one risk of bias assessment on the study.

Data Extraction

Data, according to the a priori selection criteria and outcome prioritization process, were double extracted from included studies, independently by two of four reviewers (ECT, EB, AW and MM). Information was collected on general study details (title, authors, reference/source, country, year of publication, setting), participants (age, sex, inclusion/exclusion criteria, withdrawals/losses to follow-up, subgroups), interventions, outcomes, results (point estimates and measures of variability, frequency counts for dichotomous variables, number of participants, intention-to-treat analysis) and validity results. Where data were reported across multiple articles for the same study, data were extracted to one form.

Data Synthesis

Meta-analyses were performed using Review Manager 5 by one author (MM). Because of clinical heterogeneity arising from differences in dose and timing of treatment, a random effects model was used for meta-analyses of the data. Mean differences were used to present the effect estimates for all meta-analyses except those of side effects, which were presented as event rates and summarized as odds ratios. Subgroup analysis was conducted according to body mass index (BMI), since BMI is considered to cause variation in the outcomes in response to the therapies addressed here. Heterogeneity of I² > 50% was considered high, and such results were interpreted with caution. Forest plots and funnel plots are presented here for outcomes which were rated as critical during consensus discussions for this guideline development group.[7] A complete set of forest plots and funnel plots can be found in the technical report for the international guideline.[19] Where it was not appropriate to conduct meta-analyses, study data are presented narratively.
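To make the random effects pooling and the I² statistic concrete, the following is a minimal sketch of an inverse-variance random effects meta-analysis using the DerSimonian-Laird estimator of between-study variance. It is not the authors' analysis, which was run in Review Manager 5; the function name and the input data are hypothetical. For mean differences the inputs would be each study's mean difference and its variance; for odds ratios they would be log odds ratios and their variances.

```python
import math
from typing import List, Tuple


def random_effects_pool(
    effects: List[float], variances: List[float]
) -> Tuple[float, float, float, float]:
    """Return (pooled effect, standard error, tau^2, I^2 in percent)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2, i2


# Hypothetical mean differences and variances, for illustration only.
md = [-1.2, -0.4, -2.1]
var = [0.30, 0.25, 0.50]
pooled, se, tau2, i2 = random_effects_pool(md, var)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled MD = {pooled:.2f} (95% CI {low:.2f} to {high:.2f}), I2 = {i2:.0f}%")
if i2 > 50:
    print("High heterogeneity: interpret with caution")
```

When the studies agree closely, the estimated between-study variance shrinks to zero and the result coincides with a fixed effect analysis; as heterogeneity grows, the weights become more equal across studies and the confidence interval widens.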