Development of a New Patient-reported Outcome Measure to Evaluate Treatments for Acne and Acne Scarring: The ACNE-Q

A.F. Klassen; S. Lipner; M. O'Malley; N.M. Longmire; S.J. Cano; T. Breitkopf; C. Rae; Y.L. Zhang; A.L. Pusic


The British Journal of Dermatology. 2019;181(6):1207-1215. 

Phase I: Qualitative Research

Concept Elicitation. The qualitative sample included 13 females and eight males aged 13–25 years. Qualitative analysis led to the identification of three top-level domains: appearance concerns; psychosocial concerns; and acne symptoms. Within each domain, we developed major and minor themes. For example, appearance had major themes of acne, acne scars and skin. Important subthemes for acne and acne scars included the amount (e.g. a few, everywhere, all over), size (e.g. small, large), type (e.g. deep, white- or blackheads), contour (e.g. lumpy, bumpy, indents), colour (e.g. red, skin colour), location, noticeability, scenarios (e.g. photos, morning/end of day, up close) and qualitative descriptions (e.g. looks unattractive, bad, ugly). Skin codes were more positive as the acne went away (e.g. clear, healthy, smooth) and negative for breakouts and acne scars (e.g. red, bumpy, oily/dry skin). Below we describe the scales we developed.

Appearance Appraisal: The item pool was used to develop five scales that measure the appearance of acne on the face, back and chest, of acne scars and of facial skin. For each scale, the instructions asked participants to answer based on how their acne/acne scars look now. The acne/acne scar scales ask 'How much are you bothered by…' and provide four response options: not at all; a little; quite a bit; and very much. The facial skin scale asks 'How much do you like…', with the same four response options provided.

Appearance-related Distress: The psychosocial item pool was used to develop an appearance-related distress scale. The scale includes a series of statements that measure how often someone behaves (e.g. cover up or hide, avoid having photo taken, avoid going out) or feels (e.g. unhappy, self-conscious, upset) about how they look. Four response options are provided (never, sometimes, often, always). The time period for recall is based on the past week.

Symptoms: The item pool was used to develop a scale measuring acne-specific symptoms (e.g. pain, itch, irritation). Participants who only had acne scars were not invited to complete this scale. Instructions asked respondents to answer based on how their acne feels now. If they had acne in more than one area (e.g. face, chest), instructions asked them to answer thinking of the area that bothers them most. Four response options were provided: not at all; a little bit; quite a bit; and very much.

Cognitive Interviews. The preliminary scales were shown to six females and four males aged 15–26 years, who had participated in the initial interviews. At the start of the first round, the ACNE-Q consisted of 125 items in seven scales (see Table 2). Feedback from participants led to 91 items being retained without change, 19 items revised, 15 items removed and three items added. At the end of the second round, feedback from participants resulted in 109 of 113 items retained without change. The response options were left unchanged.

Expert Review. Feedback was obtained from 16 of 27 invited clinical experts (12 dermatologists, three plastic surgeons, one skin consultant) from five countries (Australia, Canada, France, Italy, U.S.A.). Final changes to the scales resulted in two items being revised, six items being removed and two new items added (see Table 2). No changes were made to the instructions or response options. At the end of scale review, the ACNE-Q consisted of 105 items.
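The item counts reported across the cognitive interviews and expert review can be tallied as a simple check. The sketch below is illustrative only: the helper name apply_round is hypothetical, and the count entering expert review is not stated in the text, so it is back-calculated here from the final total of 105 items.

```python
def apply_round(count, removed=0, added=0):
    """Return the item count after removing and adding items in one review round."""
    if removed > count:
        raise ValueError("cannot remove more items than exist")
    return count - removed + added

start = 125

# Round 1 of cognitive interviews: retained + revised + removed account for all 125 items.
assert 91 + 19 + 15 == start
after_round1 = apply_round(start, removed=15, added=3)  # 113, matching "109 of 113"

# Expert review: six items removed and two added, ending at 105 items.
final = 105
before_expert = final + 6 - 2  # count implied at the start of expert review

# Net change implied for round 2 (not reported directly in the text).
implied_round2_net_change = before_expert - after_round1

print(after_round1, before_expert, implied_round2_net_change)
```

Under these assumptions the implied count entering expert review equals the number of items retained without change in the second round.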

Phase II: Field Test

Table 3 shows the sample characteristics of the 256 field-test participants who provided a total of 303 assessments. The sample included more females (71·1%) than males (28·9%). Age ranged from 12 to 52 years (mean ± SD 23·1 ± 8·4). Most of the patients reported that they had both acne and acne scars.

RMT analysis resulted in a reduction from 105 items to 73 items. Items were dropped if they showed poor item fit, disordered thresholds and/or high residual correlation with another item in their scale. The 73 items forming the seven scales all had ordered thresholds. Appendix S1 (see Supporting Information) shows the item fit statistics. Item fit was within –2·5 to +2·5 for 63 of 73 items and all items had nonsignificant χ² P-values after Bonferroni adjustment. Item residual correlations were noted for three pairs of items in three scales (r = 0·37 facial acne; r = 0·32 chest acne; r = 0·32 back acne). Subtests showed marginal impact on scale reliability (< 0·01 drop in the person separation index).

Table 4 shows the scale-level findings. The data fit the Rasch model, with nonsignificant χ²-values, for the five appearance scales; there was some misfit for the symptoms and appearance-related distress scales. Reliability was high: person separation index values (with and without extremes) and Cronbach α-values were > 0·90 for the appearance scales, but < 0·90 for the appearance-related distress and symptoms scales.
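For readers less familiar with the reliability statistic reported here, Cronbach's α for a scale of k items is commonly computed as k/(k − 1) × (1 − sum of item variances / variance of total scores). The following is a minimal sketch of that standard formula; the function name and the toy responses are illustrative assumptions and are not drawn from the ACNE-Q data or from the software used in the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    `responses` is a list of rows, one per respondent; each row holds that
    respondent's score on every item of the scale.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Toy data (hypothetical): four respondents, three items scored 1 to 4.
toy = [[1, 1, 2], [2, 2, 2], [3, 3, 4], [4, 4, 4]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together, which is the sense in which the appearance scales above are described as highly reliable.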

For the TRT study, 130 participants who completed the ACNE-Q agreed to take part, and 38 (29·2%) completed one or more ACNE-Q scales a second time. The TRT questionnaire was completed between 7 and 21 days (mean ± SD 9·2 ± 3·2) after the first assessment. Thirteen participants indicated that there had been a change in their acne in the interval between assessments. ICC values were ≥ 0·81 for six scales; the ICC for the back acne scale was lower (0·56). When the group who reported change in their acne or acne scars was dropped from the analysis, the ICC values improved for six scales, including the back acne scale.

The proportion of participants who scored within each scale's range of measurement ranged from 81·9% for the back acne scale to 93·1% for the facial acne scale (see Table 4). These figures are further illustrated in Appendix S2 (see Supporting Information), which shows the distribution of person measurement and item locations for the seven scales and provides evidence that the scales were targeted to measuring each concept of interest experienced by the sample.

Missing data ranged from 2·5% to 13·2%. The back and chest acne scales had the most missing data; these scales were the last two in the survey booklet. ACNE-Q scales were easy to read, with scale-level Flesch–Kincaid readability statistics ranging from 1·0 to 3·6 (see Table 4); most items (n = 55, 75·3%) were below a grade 3 reading level (range 0–9·5).
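The Flesch–Kincaid grade level is conventionally calculated as 0·39 × (words per sentence) + 11·8 × (syllables per word) − 15·59. The sketch below applies that published formula to word, sentence and syllable counts supplied by hand; the function name and the example counts are hypothetical, and the tool used to score the ACNE-Q items is not specified here.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts for a piece of text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical item: one sentence of 8 words containing 10 syllables.
grade = flesch_kincaid_grade(words=8, sentences=1, syllables=10)
print(round(grade, 2))
```

Short items built from one- and two-syllable words score low on this formula. The raw formula can fall below zero for very short text, and some tools floor the result at 0.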

Correlations are shown in Table 5. As hypothesized, lower scores on the five appearance scales correlated with worse symptom scores and more appearance-related distress. ACNE-Q scale scores did not correlate with age or ethnicity. Lower scores for back acne and more appearance-related distress were associated with female sex. Having acne cover more facial areas was associated with lower scores on the facial acne, skin and symptoms scales, and with more appearance-related distress. The same associations were found for having acne scars cover more facial areas, which was also associated with lower scores on the acne scar scale.