Current Measures for the Evaluation of Acne Severity

Jerry KL Tan


Expert Rev Dermatol. 2008;3(5):595-603. 

Methods of Grading Acne Severity

Evaluation of acne severity has been undertaken from two polar perspectives: elemental or reductionistic (in which severity is based upon the quantification of specific lesion types); and holistic (in which the gestalt of the entire presentation is considered and then categorized based on a pre-established repertoire of severity presentations). Recognition of the complexities in severity determination of acne has led the US FDA to recommend using both as coprimary end points.[8] The FDA acknowledges that lesion counts alone may be inaccurate owing to the exclusion of other factors associated with the pleiomorphic nature of acne. Furthermore, the disadvantages of lesion counting include a lower precision in actual evaluation studies and impracticality in the clinical setting. In the vernacular of the regulatory research paradigm, the Investigator's Global Assessment (IGA) is the physician's overall or global assessment of the condition. To maintain the gestalt of this measure, which accounts for the admixture of lesion types, their quality and quantity, and the extent and density of involvement, numerical ranges for lesion types were discouraged.

Acne lesion counting was first published as a measure of acne severity in 1966 in the conduct of a clinical drug trial.[9] It has since endured as a primary outcome measure of severity in clinical research studies. In this application, the specificity of counting is valuable, as acne treatments may have a greater effect on certain lesion types. The decisional process in counting specific lesions is binary and provides a continuous set of variables particularly suited to statistical testing and the research paradigm.

In the counting procedure, primary acne lesions are evaluated and accounted for independently: comedones, papules/pustules and nodules/cysts. Demarcation zones for the face extend from the anterior hairline (or approximation thereof with balding) to the temporal fringe and along the preauricular sulcus to the jawline and chin. In addition to proper lighting, patient positioning and prior facial skin preparation (removal of makeup for women, gentle shaving to minimize irritation for men), the use of a facial template to organize facial regions into sectors, such as the forehead, each cheek, nose and perioral region, may be helpful. While palpation of lesions is allowed - for example, to discriminate macular erythema from inflammatory papules - magnification is not.

Theoretical limitations in lesion counting as an index of overall acne severity include the complexity in accounting for the interplay between different lesion types, numbers, distribution and density. In particular, the clinical relevance to overall severity of varying lesion types and counts is inadequately defined. Such a determination would require the ability to study the effect of simultaneous changes in both type and number of lesions. Despite the apparent simplicity and objectivity of lesion counting, judgment and subjectivity are frequently necessary.[10] Finally, the time required to conduct lesion counts decreases practicality and the likelihood of uptake in usual clinical practice.

The reliability of lesion counting has been evaluated in two previous studies. In a study involving 12 raters (three physicians and nine nurses) and 12 acne subjects, intraclass correlation coefficients (ICCs) were used as a measure of rater reliability. ICC values approximating 1.0 indicated excellent reliability, while values less than 0.75 were considered less precise.[11] Inter-rater reliability estimates were 0.52 for comedone counts and 0.76 for inflammatory papule/pustule counts. However, intra-rater ICCs ranged from 0.74 to 0.98 for comedone counts and 0.73 to 0.98 for papule/pustule counts.[12] Thus, lesion counts were more reliable if conducted by the same rater.
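As an illustration of how an inter-rater ICC might be computed from lesion counts, the following is a minimal sketch only. The text does not state which ICC form the cited studies used; the one-way random-effects, single-rater form is assumed here. The function names and sample counts are hypothetical, and the 0.75 cutoff is the one quoted above.

```python
def icc_one_way(ratings):
    """One-way random-effects, single-rater ICC (an assumed form).

    ratings: one row per subject, each row holding one lesion count per rater.
    """
    n = len(ratings)                      # number of subjects
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 subjects and >= 2 raters, equal row lengths")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_between - ms_within) / denominator


def is_less_precise(icc, cutoff=0.75):
    """Apply the 0.75 cutoff quoted in the text."""
    return icc < cutoff


# Hypothetical comedone counts: three subjects, each counted by two raters.
example_counts = [[10, 12], [20, 18], [30, 33]]
example_icc = icc_one_way(example_counts)
```

Other ICC forms (for example, two-way models that separate rater effects) can give different values for the same data, so the choice of form should be reported alongside any estimate.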

A more recent study involving 11 dermatologists and six acne subjects corroborated these findings.[13] In this study, the raters were separated into two groups to determine the effect of a formal training session on lesion counting and acne-severity grading. One group was trained prior to the first of two subject-evaluation sessions, while the second group was trained only after the first subject-evaluation session. The group trained prior to subject evaluations demonstrated inter-rater reliability estimates of 0.68 and 0.72 for noninflammatory and inflammatory lesions, respectively. Corresponding mean intra-rater reliability estimates were 0.83 and 0.79. The training sessions improved inter-rater reliability in noninflammatory counts and increased the proportion of raters with good reliability (ICC ≥ 0.75) in all three outcome measures (including global assessments). Practice also improved reliability in all three outcome measures. Thus, training dermatologists has a demonstrable effect on reliability of lesion counting, as does practice.

Global assessment scales assimilate the totality of the clinical presentation into a single category of severity. Severity categories are established upon a prior experiential repertoire, based on photographic standards or descriptive text. Global methods are particularly suited to clinical practice owing to their practicality. In clinical investigations, global assessments are a coprimary end point of efficacy as they are considered to be of greater clinical relevance than lesion counts alone.

The prerequisites of an ideal global acne scale include a restricted number of categories, sufficient detail in descriptions to reduce observer variability, relevance of severity levels for treatment selection, static measurements with no reference to a prior level of severity, universality for use in practice and investigations, correlation with lesion counts,[14] responsivity to change, comprehensiveness for common areas of involvement, such as the face, chest and back, and practicality.[15]

Despite the availability of more than 25 grading systems for acne,[6] the lack of a single, standardized system consistently used in practice and research reflects their inability to fulfill these attributes. While an historical account of earlier acne grading scales has been published elsewhere,[16] the current focus is on global grading systems developed since the consensus conference in 1991 ( Table 1 ). A classification proposal developing from this conference was a three-category system for inflammatory acne, where mild comprised few to several papules/pustules; moderate, several to many papules/pustules and few to several nodules; and severe, numerous or extensive papules/pustules and many nodules.[7] Noninflammatory lesions were not included in this scale, nor was a separate scale for such lesions provided. No specific directives were provided on whether this scale was to be applied separately to the face, chest and back, or in the aggregate to all these regions.

The Leeds Revised Acne Grading System, published in 1998, provides a photographic standard for acne grading of the face, back and chest.[17] This system comprises 15 facial grades (three solely for comedonal acne) and eight each for the chest and back. These representations were selected from over 1000 photographs by an expert panel of three dermatologists and four acne assessors. The photographs were ranked on four further occasions as a means of content validation by the authors. The varying representations of severity and the large number of categories within each region, however, make this system cumbersome to apply in clinical practice. Furthermore, this system does not adequately differentiate those with the lowest acne grades, while categories of extreme acne severity are over-represented.[15]

The Global Acne Grading System (GAGS) is a quantitative scoring system in which the total severity score is derived from summation of six regional subscores.[18] Each subscore is derived by multiplying the factor for that region (factor for forehead and each cheek is 2, chin and nose is 1 and chest and upper back is 3) by the grade of the most heavily weighted lesion type present within that region (1 for ≥ one comedone, 2 for ≥ one papule, 3 for ≥ one pustule and 4 for ≥ one nodule). The regional factors were derived from consideration of surface area, distribution and density of pilosebaceous units. As yet, this system has not been validated against other global scales or lesion counts, nor evaluated for reliability.
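The GAGS arithmetic described above can be sketched as follows. This is an illustrative reading of the description rather than a validated implementation: the region and lesion names, the function name and the example patient are hypothetical, and a region with no lesions is assumed to contribute 0, which the text does not state explicitly.

```python
# Regional factors as given in the text.
REGION_FACTORS = {
    "forehead": 2,
    "right_cheek": 2,
    "left_cheek": 2,
    "nose": 1,
    "chin": 1,
    "chest_upper_back": 3,
}

# Lesion grades as given in the text (at least one lesion of that type present).
LESION_GRADES = {"comedone": 1, "papule": 2, "pustule": 3, "nodule": 4}


def gags_score(lesions_by_region):
    """Sum of regional subscores: factor x grade of the most severe lesion type.

    lesions_by_region maps a region name to the lesion types observed there.
    Regions that are missing or empty contribute 0 (an assumption).
    """
    unknown = set(lesions_by_region) - set(REGION_FACTORS)
    if unknown:
        raise ValueError(f"unknown region(s): {sorted(unknown)}")

    total = 0
    for region, factor in REGION_FACTORS.items():
        observed = lesions_by_region.get(region, [])
        grade = max((LESION_GRADES[lesion] for lesion in observed), default=0)
        total += factor * grade
    return total


# Hypothetical patient: comedones and papules on the forehead, a nodule on the nose.
example_score = gags_score({"forehead": ["comedone", "papule"], "nose": ["nodule"]})
# Forehead: 2 x 2 = 4; nose: 1 x 4 = 4; total = 8.
```

Because only the most heavily weighted lesion type in a region counts, the score does not change with the number of lesions, which is one way in which this system differs from lesion counting.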

The grading scale for overall severity by Allen and Smith Jr has been the template for global assessments in many acne trials as an Investigator's Global Assessment (IGA) scale. They provided text descriptions of five categories but allowed for nine acne grades similar to the scale of Cook et al., who also included photographic standards.[11] The system proposed by Allen and Smith Jr, however, was based solely on descriptive text, not on photographs, and also added the dimension of increasing extent of facial involvement. They further demonstrated that the severity scale correlated with inflammatory and noninflammatory lesion counts.[19] Although limited to facial acne, this system has subsequently been expanded for application to acne on the chest and back.[15] Demarcation zones for the chest were the suprasternal notch laterally to shoulders superiorly and the level of the xiphoid process inferiorly; while the back was demarcated by the base of the neck, laterally to shoulders and inferiorly by the costal margin. Each of these regions was then individually graded for acne with the categorical grading scale. A high level of correlation was demonstrated compared with the Leeds Revised Acne Grading System. Comparing both systems at all three sites, acne graded by the global system approximated a normal distribution and more definitively distinguished the clear/almost clear from mild categories. This is a critical issue in defining treatment success in clinical treatment trials.

The reliability of this system as applied to facial acne has been demonstrated previously in a study in which trained dermatologists achieved inter-rater reliability estimates of 0.65 in the first patient evaluation session and 0.77 in the second.[13] A similar six-category global scale was found to be more reliable than other global scales including the three-category scale proposed by the consensus conference and the Leeds scale.[12] This validated system fulfills many of the attributes recommended for an ideal global system, including a restricted number of categories to facilitate practicality, static evaluations, comprehensiveness to enhance content validity, reliability and universality, with prior inclusion as an outcome measure in clinical drug trials.

A recent proposal by the US FDA for a five-category global system may provide even greater reliability as the descriptive text is more explicit.[8] In this scale, the five categories are:

  • Clear, indicating no inflammatory or noninflammatory lesions;

  • Almost clear, rare noninflammatory lesions with no more than one papule/pustule;

  • Mild, some noninflammatory lesions, no more than a few papules/pustules but no nodules;

  • Moderate, up to many noninflammatory lesions, may have some inflammatory lesions, but no more than one small nodule;

  • Severe, up to many noninflammatory and inflammatory lesions, but no more than a few nodules.

A recent study established a global facial acne severity scale (mild, moderate, severe and very severe) by use of intuitive severity grades (categories not specifically predefined).[20] Dermatologists initially graded half-face severity of 244 acne patients and also counted lesions. Their judgments on severity grades were then compared with those of an expert panel of three dermatologists who evaluated half-face photographs of the same patients. Concordance of severity judgments between the initial rater and the entire panel of three raters was 45%, while concordance with at least two panel raters was 69%. Correlation of severity grades with lesion counts was highest for inflammatory papules and pustules but not for comedones, nodules or cysts. In those cases for which severity grading was unanimous, correlation with the numerical range of inflammatory papules and pustules was determined and then evaluated to provide a clear delineation of categories. Classification of acne severity based on inflammatory papule/pustule counts on half-face evaluation was determined to be 0-5 for mild, 6-20 for moderate, 21-50 for severe and more than 50 for very severe. Finally, half-face photographs from the consensus grading were selected to visually represent the four severity grades.
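The half-face count ranges reported above amount to a simple lookup, sketched below for illustration. The function name is hypothetical; the cutoffs are those quoted in the text and apply only to inflammatory papule/pustule counts on one half of the face.

```python
def half_face_severity(papule_pustule_count):
    """Map a half-face inflammatory papule/pustule count to a severity grade."""
    if papule_pustule_count < 0:
        raise ValueError("lesion count cannot be negative")
    if papule_pustule_count <= 5:
        return "mild"
    if papule_pustule_count <= 20:
        return "moderate"
    if papule_pustule_count <= 50:
        return "severe"
    return "very severe"


# Hypothetical counts on either side of each cutoff.
example_grades = [half_face_severity(count) for count in (0, 5, 6, 20, 21, 50, 51)]
```

Comedones, nodules and cysts do not enter this mapping, consistent with the weaker correlation reported for those lesion types.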

