Protocol and Registration
The protocol of this systematic review and meta-analysis was prospectively registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42020178783), and the review is reported in accordance with the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.
Only randomized controlled trials (RCTs) defined according to the PICO framework (population, intervention, comparator and outcome) were included in this review. Eligibility criteria are presented in Table 1. Briefly, only RCTs that included women aged 18 years and over with a diagnosis of PCOS were eligible. RCTs that evaluated one pharmacological agent versus placebo, or that compared different pharmacological agents, were eligible regardless of design and methodology (open-label, double-blind, parallel or crossover).
A literature search was performed in the following medical databases: PubMed, EMBASE, MEDLINE, Scopus, Cochrane Library (CENTRAL) and Web of Science in April 2020 (L. Ö.). A search update in PubMed was conducted in March 2021 (L. Ö.). The search was not limited to specific dates. Search phrases were decided by professionals in the medical field (T. S. and M. A.) together with a medical librarian (L. Ö.). All search terms were searched in a combination of title, abstract and Medical Subject Headings (MeSH) for optimal literature retrieval (Supporting Information). A filter for English language was applied. The search strings were later used to search OpenGrey, the EU Clinical Trials Register and the registry ClinicalTrials.gov. The full search strategy is shown in the supplementary material. All records identified in the literature search were uploaded to the systematic review software Covidence for de-duplication and blinded screening, followed by data extraction. All selected references were managed using EndNote. Cabell's Predatory Report was consulted to confirm the non-predatory status of the included studies from open-access journals.
Two independent reviewers (M. A. and N. S.) screened the titles and abstracts of the retrieved studies with the support of Covidence and assessed eligibility based on the inclusion/exclusion criteria. Full-text evaluation was performed with the agreement of both reviewers, and any disagreement was resolved by consensus, by discussion or by arbitration of a third reviewer (T. S.). Studies that included nonpharmacological agents and observational studies were deemed ineligible and excluded. The study selection process, including study identification, screening and the reasons for exclusion, is shown in Figure 1.
Two independent reviewers (M. A. and N. S.) extracted information from the eligible studies. The information included the country of each RCT, year of publication, study design, type of intervention and comparator, number of participants, duration of the RCT, baseline characteristics of the participants and the reported outcomes. An overview of these characteristics is shown in Table 2. Of all the reported outcomes, total cholesterol (TC), triglycerides (TGs), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C) and C-reactive protein (CRP) were included.
Risk of Bias Assessment in the Included Studies
The Cochrane Collaboration's tool was used to assess the risk of bias (RoB), as suggested by Higgins et al. The tool has six bias domains (selection bias, performance bias, detection bias, attrition bias, reporting bias and other bias). Each RCT was assessed against these domains by two independent reviewers (M. A. and N. S.). Any disagreement was resolved by mediation of a third reviewer (T. S.). This study followed the recommendations of the Cochrane Handbook and graded RoB as 'high RoB', 'low RoB' or 'unclear RoB'. The magnitude of RoB for the included RCTs and the calculated RoB for each specific domain in the RCTs are presented in Figures 1 and 2 in the supplementary material.
The robustness of evidence for each chosen outcome (CRP, LDL-C, HDL-C, TGs and TC) was examined following the recommendations of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) approach. The GRADEpro GDT software was used to rate the quality of the evidence for each outcome and to generate the 'Summary of findings' table (Table S1). Initially, four points were given to each outcome. Points were then deducted for each outcome based on the presence of the following: overall RoB of the RCTs, inconsistency (significant heterogeneity), indirectness (significant differences in the population, comparisons and outcomes) and imprecision [the size of the cohort and the width and significance of the confidence intervals (CIs)]. Based on these factors, the overall GRADE score for the outcome of each comparison was recorded as high (at least 4 points), moderate (3 points), low (2 points) or very low (1 point or less). All the grades of evidence are presented in Table 1 in the supplementary material.
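The point-based grading described above can be sketched in Python as follows. This is an illustration only and not the GRADEpro GDT implementation: the function name `grade_outcome`, the domain keys, and the assumption that each domain removes a whole number of points (0, 1 or 2) are ours and are not stated in the text.

```python
# Hypothetical sketch of the point-based GRADE scoring described in the text.
# Assumption: each domain deducts a non-negative whole number of points.
GRADE_DOMAINS = ("risk_of_bias", "inconsistency", "indirectness", "imprecision")


def grade_outcome(downgrades):
    """Return (points, label) for one outcome.

    `downgrades` maps a domain name to the points deducted for it;
    domains that are not listed deduct nothing.
    """
    unknown = set(downgrades) - set(GRADE_DOMAINS)
    if unknown:
        raise ValueError(f"unknown GRADE domain(s): {sorted(unknown)}")
    if any(points < 0 for points in downgrades.values()):
        raise ValueError("deductions must be non-negative")

    # Every outcome starts at four points, as stated in the text.
    points = 4 - sum(downgrades.get(domain, 0) for domain in GRADE_DOMAINS)

    if points >= 4:
        label = "high"
    elif points == 3:
        label = "moderate"
    elif points == 2:
        label = "low"
    else:  # 1 point or less
        label = "very low"
    return points, label
```

For example, an outcome downgraded once for inconsistency and once for imprecision would be passed as `grade_outcome({"inconsistency": 1, "imprecision": 1})` and would be labelled low.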
Data Analysis and Evidence Synthesis
The pooled effect estimates [mean difference (MD) or standardized mean difference (SMD), with their 95% confidence intervals (95% CIs)] for the differences between the intervention and comparison groups were quantified using a random-effects model. Where at least two effect estimates were reported, a meta-analysis was conducted using the MD with the inverse-variance method and a random-effects model, presuming that the data for the continuous outcome variables were normally distributed and reported on the same measurement scale; otherwise, the SMD was used. Highly biased data and data presented as ranges were not considered for the meta-analysis, whereas means and standard deviations (SDs) of the postintervention values and of the changes from baseline were included. Where data were reported as standard errors (SEs), CIs, p values or t values, we used the RevMan calculator to transform them into means and SDs. Where units of measurement varied, values were converted to the most commonly used unit. For RCTs with more than one intervention arm, we combined data from all arms using the method recommended in the Cochrane Handbook. The meta-analysis was carried out using the Review Manager software (RevMan 5.4; The Cochrane Collaboration), and differences with two-tailed p values of ≤0.05 were considered statistically significant.
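The calculations described in this paragraph can be illustrated with a minimal Python sketch. It is not the RevMan code: the function names are hypothetical, the between-study variance is estimated with the DerSimonian-Laird method (our assumption about the random-effects estimator), and the SMD is omitted for brevity. The arm-combining formula is the one given in the Cochrane Handbook for merging two groups.

```python
import math


def sd_from_se(se, n):
    """SD of one group from the standard error of its mean: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def combine_arms(n1, m1, sd1, n2, m2, sd2):
    """Merge two arms into a single group (Cochrane Handbook formula)."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
           + (n1 * n2 / n) * (m1 - m2) ** 2) / (n - 1)
    return n, mean, math.sqrt(var)


def mean_difference(n_t, m_t, sd_t, n_c, m_c, sd_c):
    """Mean difference (intervention minus comparator) and its standard error."""
    md = m_t - m_c
    se = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return md, se


def pool_random_effects(effects, ses):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, assumed).

    Returns (pooled effect, (lower, upper) 95% CI, tau squared).
    """
    w = [1 / se ** 2 for se in ses]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance

    w_star = [1 / (se ** 2 + tau2) for se in ses]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, tau2
```

In use, each RCT would first be reduced to one MD and SE with `mean_difference` (after `sd_from_se` or `combine_arms` where needed), and the resulting lists would be passed to `pool_random_effects`.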
Assessment of Heterogeneity
Heterogeneity across the RCTs for each outcome was evaluated using the I² statistic. Heterogeneity was reported as: may not be important (I² = 0%–40%), might be moderate (I² = 30%–60%), may be substantial (I² = 50%–90%) and may be considerable (I² = 75%–100%). Where heterogeneity was statistically significant, its source was examined by omitting from the meta-analysis the RCT that showed a significant effect and re-examining I². If significant heterogeneity persisted, subgroup analysis was performed.
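As a rough illustration of this step, the Python sketch below computes I² from Cochran's Q with inverse-variance weights, maps it to the (overlapping) interpretation bands quoted above, and recomputes I² with each study omitted in turn. The function names are hypothetical, and the leave-one-out loop is a generalisation we assume for illustration; the text only describes omitting the RCT that showed a significant effect.

```python
# Interpretation bands as quoted in the text; note that they overlap.
I2_BANDS = [
    (0, 40, "may not be important"),
    (30, 60, "might be moderate"),
    (50, 90, "may be substantial"),
    (75, 100, "may be considerable"),
]


def i_squared(effects, ses):
    """I² in percent: max(0, (Q - df) / Q) * 100, with Q from inverse-variance weights."""
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def describe_i2(i2):
    """All band labels whose range contains the given I² value."""
    return [label for low, high, label in I2_BANDS if low <= i2 <= high]


def leave_one_out(effects, ses):
    """I² recomputed after omitting each study in turn (same order as the input)."""
    effects, ses = list(effects), list(ses)
    return [
        i_squared(effects[:k] + effects[k + 1:], ses[:k] + ses[k + 1:])
        for k in range(len(effects))
    ]
```

Because the bands overlap, `describe_i2` returns every matching label rather than forcing a single category; a sharp drop in one entry of `leave_one_out` points to the study that drives the heterogeneity.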
Subgroup analysis was conducted with the RCTs grouped according to the dosage (mg/μg), frequency of administration [once a day (QD), twice a day (BID) and thrice a day (TDS)] and duration (weeks or months) of the therapeutic interventions.
Clin Endocrinol. 2022;96(4):443-459. © 2022 Blackwell Publishing