Recommendations From the International Evidence-based Guideline for the Assessment and Management of Polycystic Ovary Syndrome

Helena J. Teede; Marie L. Misso; Michael F. Costello; Anuja Dokras; Joop Laven; Lisa Moran; Terhi Piltonen; Robert J. Norman; on behalf of the International PCOS


Hum Reprod. 2018;33(9):1602-1618. 

Materials and Methods

Best practice evidence-based guideline development methods were applied. They are detailed in the full guideline and the technical reports and outlined in Figure 1 (Misso and Teede, 2012). The process aligns with all elements of the AGREE-II tool for quality guideline assessment (Brouwers et al., 2010). This involved extensive evidence synthesis and use of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) framework, covering evidence quality, feasibility, acceptability, cost, implementation and ultimately recommendation strength (GRADE working group). Evidence synthesis methods are outlined in the full guideline and followed best practice (NHMRC, 2007, 2009; Brouwers et al., 2010; GRADE working group). Recommendations are categorized as evidence-based or consensus recommendations, with accompanying clinical practice points (Table I).

Figure 1.

The steps in developing an evidence-based guideline. GDG = guideline development group; PICO = P: patient, problem or population, I: intervention, C: comparison, control or comparator, O: outcome. Reprinted with permission from Misso and Teede (2012).

The terms 'should', 'could' and 'should not' are informed by the nature of the recommendation (evidence or consensus), the GRADE framework and the quality of the evidence. They are independent descriptors reflecting the judgment of the multidisciplinary GDG, including consumers. They refer to the overall interpretation and practical application of the recommendation, balancing benefits and harms. 'Should' is used where the benefits of the recommendation exceed the harms, and where the recommendation can be trusted to guide practice. 'Could' is used where the quality of evidence was limited, the available studies demonstrate little clear advantage of one approach over another, or the balance of benefits to harms was unclear. 'Should not' is used where there is either a lack of appropriate evidence, or the harms may outweigh the benefits.

The GRADE of the recommendation is determined by the GDG based on comprehensive structured consideration of all elements of the GRADE framework (GRADE working group), including desirable effects, undesirable effects, balance of effects, resource requirements and cost effectiveness, equity, acceptability and feasibility, and includes:

*conditional recommendation against the option;
**conditional recommendation for either the option or the comparison;
***conditional recommendation for the option; and
****strong recommendation for the option.

Quality of the evidence is categorized according to the number and design of studies addressing the outcome; judgments about the quality of the studies and/or synthesized evidence, such as risk of bias, inconsistency, indirectness, imprecision and any other considerations that may influence the quality of the evidence; key statistical data; and classification of the importance of the outcomes (Table II). The quality of evidence reflects the extent of confidence in an estimate of the effect to support a particular recommendation (GRADE working group) and was largely determined by the expert evidence synthesis team.

GRADE acknowledges that evidence quality is a continuum; any discrete categorization involves a degree of arbitrariness. Nevertheless, the advantages of simplicity, transparency and clarity outweigh these limitations (GRADE working group).