Effectiveness of Disease-Specific Cognitive Behavioral Therapy on Anxiety, Depression, and Quality of Life in Youth With Inflammatory Bowel Disease

A Randomized Controlled Trial

Luuk Stapersma, MSC; Gertrude van den Brink, MD; Jan van der Ende, MSC; Eva M. Szigethy, MD, PHD; Ruud Beukers, MD, PHD; Thea A. Korpershoek, MANP; Sabine D. M. Theuns-Valks, MD; Manon H. J. Hillegers, MD, PHD; Johanna C. Escher, MD, PHD; Elisabeth M. W. J. Utens, PHD


J Pediatr Psychol. 2018;43(9):967-980. 



Design and Procedure

This multicenter parallel-group RCT was designed according to the CONSORT guidelines for trials of nonpharmacologic treatments (Boutron et al., 2017). The trial had two arms. Patients in the experimental group received a disease-specific CBT protocol (Primary and Secondary Control Enhancement Training for Physical Illness; PASCET-PI; Szigethy et al., 2007) added to standard medical care. The control group received standard medical care (care-as-usual, CAU) only, as this best resembles current care. Initially, only patients aged 10–20 years were included. A few months after recruitment started, the age range was extended to include patients aged 21–25 years, so that young adults in the transition phase were also covered. The research protocol was approved by the Medical Ethics Committee of the Erasmus MC and confirmed by the ethical boards of all participating hospitals. The study was registered with ClinicalTrials.gov as study number NCT02265588.

After having provided written informed consent, patients (and parents) completed validated psychological instruments at two points in time (see Outcome Measures). At baseline, patients completed online questionnaires (at home) and a clinical interview (by phone; no longer than 2 weeks before the start of the PASCET-PI). The immediate post(-treatment) assessment was similar to the baseline and was performed approximately 3 months after baseline, no later than 2 weeks after completing the PASCET-PI. Timing and method of assessments were the same in the experimental and control groups.


Step 1: Inclusion Baseline Screening. Adolescents and young adults of age 10–25 years with a confirmed diagnosis of IBD (CD, UC, or inflammatory bowel disease-unclassified [IBD-U]; Figure 1) were included for baseline screening for symptoms of anxiety and depression. Between October 2014 and October 2016, patients were consecutively recruited from the pediatric or (pediatric) gastroenterology departments of two academic hospitals and four community hospitals. The centers were medium-sized to large hospitals from mixed rural and urban regions. Parents participated for all patients 17 years or younger, whereas parental participation for patients of age 18–20 years was voluntary. Exclusion criteria were (1) intellectual disability; (2) current treatment for mental health problems (pharmacological and/or psychological); (3) insufficient mastery of the Dutch language; (4) a diagnosis of selective mutism, bipolar disorder, schizophrenia/psychotic disorder, autism spectrum disorder, obsessive-compulsive disorder, posttraumatic or acute stress disorder, or substance use disorder (parent- or self-reported or from medical file); (5) CBT in the past year (at least eight sessions); and (6) participation in another interventional study. All criteria were assessed by the treating physician using medical files (unless otherwise specified).

Figure 1. CONSORT study flowchart.

Step 2: Inclusion RCT. Only youth with subclinical anxiety and/or depressive symptoms were included in the RCT. Patients with clinical anxiety and/or depression were excluded, as we deemed it unethical to randomize them.

Subclinical anxiety and/or depressive symptoms were defined as a score equal to or above the cutoff of age-appropriate questionnaires, but not meeting criteria for clinical anxiety and/or depression (see next paragraph). For anxiety, the Screen for Child Anxiety Related Emotional Disorders (SCARED; 10–20 years; cutoff ≥26 for boys and ≥30 for girls; Bodden, Bögels, & Muris, 2009) and the Hospital Anxiety and Depression Scale-Anxiety Scale (HADS-A; 21–25 years; cutoff ≥8; De Croon, Nieuwenhuijsen, Hugenholtz, & Van Dijk, 2005) were used. For depression, the Child Depression Inventory (CDI; 10–17 years; cutoff ≥13; Timbremont, Braet, & Roelofs, 2008) and the Beck Depression Inventory-second edition (BDI-II; 18–25 years; cutoff ≥14; Van der Does, 2002) were used.
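The age- and sex-dependent screening rule above can be summarized in a short sketch. This is an illustration only, not the study's software: the function names, the "male"/"female" coding, and the dictionary layout are assumptions, while the instruments, age bands, and cutoffs are taken from the text.

```python
def screening_cutoffs(age, sex):
    """Return the screening instrument and cutoff per domain for one patient.

    Age bands and cutoffs follow the text; the data layout is hypothetical.
    """
    if not 10 <= age <= 25:
        raise ValueError("screening applies to patients aged 10-25 years")
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female' in this sketch")

    if age <= 20:
        # SCARED: cutoff >= 26 for boys, >= 30 for girls
        anxiety = ("SCARED", 26 if sex == "male" else 30)
    else:
        anxiety = ("HADS-A", 8)

    # CDI for 10-17 years, BDI-II for 18-25 years
    depression = ("CDI", 13) if age <= 17 else ("BDI-II", 14)
    return {"anxiety": anxiety, "depression": depression}


def has_elevated_symptoms(age, sex, anxiety_score, depression_score):
    """True if either questionnaire score is equal to or above its cutoff."""
    cutoffs = screening_cutoffs(age, sex)
    return (anxiety_score >= cutoffs["anxiety"][1]
            or depression_score >= cutoffs["depression"][1])
```

Note that a patient aged 18–20 years is screened with the SCARED for anxiety but with the BDI-II for depression, because the two domains use different age bands.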

Clinical anxiety and/or depression was defined as follows: for patients who scored on or above the cutoffs for elevated symptoms of anxiety and/or depression, the psychologist-investigator (LS) administered a clinical interview. The Anxiety Disorders Interview Schedule for Children (ADIS-C; Siebelink & Treffers, 2001) was delivered by telephone to patients and, if applicable, parents. In the ADIS-C, if a patient meets criteria for a clinical disorder, a clinician's severity rating (CSR, a 0–8 rating of symptom severity and functional impairment) is assigned by the interviewer. In addition, severity of anxiety and/or depressive symptoms was rated by the interviewer using age-appropriate rating scales. For anxiety, the Pediatric Anxiety Rating Scale (PARS; 10–20 years; cutoff ≥18; Ginsburg, Keeton, Drazdowski, & Riddle, 2011) and Hamilton Anxiety Rating Scale (HAM-A; 21–25 years; cutoff ≥15; Hamilton, 1959; Matza, Morlock, Sexton, Malley, & Feltner, 2010) were used. For depression, the Child Depression Rating Scale-revised (CDRS-R; 10–12 years; cutoff ≥40; Poznanski et al., 1984), Adolescent Depression Rating Scale (ADRS; 13–20 years; cutoff ≥20; Revah-Levy, Birmaher, Gasquet, & Falissard, 2007), and the Hamilton Depression Rating Scale (HAM-D; 21–25 years; cutoff ≥17; Hamilton, 1960; Zimmerman, Martinez, Young, Chelminski, & Dalrymple, 2013) were used. Patients who met criteria for an anxiety or depressive disorder on the ADIS-C (i.e., a CSR of at least 4) and scored equal to or above the clinical cutoff on the rating scale were considered to have a clinical anxiety or depressive disorder. These patients were excluded and received immediate referral to mental health care. Within the group of patients included in the RCT (all with subclinical anxiety and/or depressive symptoms, n = 70), a subdivision was made based on the ADIS-C. If patients had one or more CSRs of at least 4 (but scored below the cutoff on either the CDRS, ADRS, HAM-D, PARS, or HAM-A), they were considered "high" subclinical; if not, they were considered "low" subclinical.
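The decision rule for patients with elevated screening scores can be sketched as follows. This is a simplified reading of the text, not the investigators' procedure: it assumes one CSR and one rating-scale score per domain (anxiety, depression), with a CSR of 0 when no disorder criteria were met, and the names `rating_cutoff` and `classify` are hypothetical.

```python
# (domain, min_age, max_age, instrument, clinical cutoff), as listed in the text
RATING_SCALES = [
    ("anxiety", 10, 20, "PARS", 18),
    ("anxiety", 21, 25, "HAM-A", 15),
    ("depression", 10, 12, "CDRS-R", 40),
    ("depression", 13, 20, "ADRS", 20),
    ("depression", 21, 25, "HAM-D", 17),
]


def rating_cutoff(domain, age):
    """Look up the age-appropriate rating scale and its clinical cutoff."""
    for dom, lo, hi, instrument, cutoff in RATING_SCALES:
        if dom == domain and lo <= age <= hi:
            return instrument, cutoff
    raise ValueError(f"no rating scale for {domain} at age {age}")


def classify(age, csr, rating):
    """Classify a patient who screened at or above a questionnaire cutoff.

    csr and rating are dicts keyed by "anxiety" and "depression"
    (an assumed layout for illustration).
    """
    domains = ("anxiety", "depression")
    for d in domains:
        _, cutoff = rating_cutoff(d, age)
        # Clinical: CSR >= 4 AND rating scale at or above the clinical cutoff
        if csr[d] >= 4 and rating[d] >= cutoff:
            return "clinical"  # excluded and referred to mental health care
    if any(csr[d] >= 4 for d in domains):
        return "high subclinical"
    return "low subclinical"
```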


Patients with subclinical anxiety and/or depressive symptoms (but not clinical anxiety and/or depression) were randomized to PASCET-PI plus CAU versus CAU alone, in a 1:1 ratio. An independent biostatistician provided a computer-generated blocked randomization list with randomly chosen block sizes (with a maximum of six) and stratification by center using the blockrand package in the R software package, thereby providing numbered envelopes per center. Patients were enrolled by one of the investigators (GB). To prevent drop-out, it was thoroughly checked with the patients before randomization whether they would be motivated enough to complete the CBT. For example, they were asked about their motivation and concerns regarding traveling and time investment or regarding discussing private information.
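To illustrate what a blocked randomization list with random block sizes and stratification by center looks like, here is a minimal Python sketch. It is not the blockrand procedure used in the trial: the block sizes of 2, 4, and 6 (the text states only a maximum of six), the arm labels, the center names, and the seed are assumptions for illustration.

```python
import random


def blocked_allocation(n_slots, arms=("PASCET-PI + CAU", "CAU"),
                       block_sizes=(2, 4, 6), rng=None):
    """Build a 1:1 allocation list from complete, randomly sized blocks.

    The list may be slightly longer than n_slots because only whole
    blocks are generated, which keeps the two arms exactly balanced.
    """
    rng = rng or random.Random()
    allocation = []
    while len(allocation) < n_slots:
        size = rng.choice(block_sizes)           # random block size (assumed set)
        block = list(arms) * (size // len(arms))  # equal number per arm
        rng.shuffle(block)                        # random order within the block
        allocation.extend(block)
    return allocation


def stratified_lists(centers, n_slots, seed=None):
    """One independent allocation list per center (stratification by center)."""
    rng = random.Random(seed)
    return {center: blocked_allocation(n_slots, rng=rng) for center in centers}


# Hypothetical usage: two centers, at least 12 numbered envelopes each
lists = stratified_lists(["Center A", "Center B"], n_slots=12, seed=2014)
```

Because every block is balanced, the difference in group sizes within a center can never exceed half of the largest block size, while the varying block sizes make the next assignment harder to predict.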


The PASCET-PI is a disease-specific CBT protocol, developed for adolescents with IBD and depression. Disease-specific components encompass the illness narrative (i.e., perceptions and experience of having IBD), disease-specific psychoeducation, techniques for pain and immune functioning, social skills training, and emphasis on IBD-related cognitions and behavior (Szigethy et al., 2007). Parents receive psychoeducation about coaching their child to cope with IBD.

In the current study, the PASCET-PI contained 10 weekly individual sessions, delivered in 3 months. In accordance with the protocol, six of these sessions were face-to-face and the remaining four were by phone at a prearranged time (to promote adherence and lower the treatment burden). In addition, three family sessions (for patients and their parents) were held (only for patients ≤ 20 years), and following the weekly sessions, three monthly individual booster sessions were held by telephone (after the immediate post[-treatment] assessment). As the original PASCET-PI was developed for depression, therapists were instructed how to tailor the exercises to anxiety, an anxiety hierarchy and step-by-step exercise were added, and an extra anxiety handout was provided to the patients. For patients of age 21–25 years, the practice book was made more age-appropriate. See Van den Brink & Stapersma et al. (2016) or Appendix 1 for a more detailed description of this Dutch modification of the PASCET-PI. All therapy was provided by licensed (healthcare/CBT) psychologists, who received onsite training from the developer (ES) and performed the therapy in their own hospital or center. To ensure treatment integrity, monthly supervision was provided by EU (clinical psychologist/professor) and audiotaped sessions were rated by EU and five master-level Psychology students. Of all sessions, 30% were rated on adherence by at least one rater, and of that 30%, half were evaluated by at least two raters (i.e., 15% of all sessions). Audiotapes were randomly selected to be rated by two of the raters. However, the pairing of raters varied considerably across sessions, so there were too few standardized pairs of raters to use, for example, the intraclass correlation. Therefore, interrater agreement was calculated globally using Pearson's correlation between two data columns with (1) all first ratings and (2) all second ratings for all patients and sessions combined.

CAU consisted of regular medical appointments with the (pediatric) gastroenterologist every 3 months, consisting of a 15-min consultation discussing overall well-being, disease activity, results of diagnostic tests, medication use, and future diagnostic/treatment plans.
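The global interrater agreement described for the double-rated PASCET-PI sessions amounts to a single Pearson correlation between a column of first ratings and a column of second ratings. A minimal sketch follows; the example ratings are invented for illustration and are not study data.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between paired first and second adherence ratings."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long columns with at least 2 pairs")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in second)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical adherence ratings for five double-rated sessions
first_ratings = [7, 8, 6, 9, 5]
second_ratings = [6, 8, 7, 9, 5]
agreement = pearson_r(first_ratings, second_ratings)
```

Pooling all pairs in this way ignores which rater produced each rating, which is the trade-off the text accepts because rater pairs were not standardized.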

Outcome Measures (Online Questionnaires)

Demographic data were assessed with a general questionnaire, based on a semi-structured interview (Utens, van Rijen, Erdman, & Verhulst, 2000). Socioeconomic status was based on occupational level from parents or, if they lived on their own, patients themselves. It was divided into low, middle, and high (Statistics Netherlands, 2010). Ethnicity was based on mother's country of birth or if the mother was born in the Netherlands, the father's country of birth (Statistics Netherlands, 2000). Disease characteristics were extracted from the medical charts.

Symptoms of anxiety were assessed with the SCARED (for 10–20 years) and the anxiety scale of the HADS (for 21–25 years). Both are self-report questionnaires. The SCARED has 69 items with three response categories (0–2; total score = 0–138). It contains five subscales: General Anxiety Disorder, Separation Anxiety Disorder, Specific Phobia, Panic Disorder, and Social Phobia (Bodden et al., 2009; Muris, Bodden, Hale, Birmaher, & Mayer, 2011). The Anxiety scale of the HADS has seven items with four response categories (0–3; total score = 0–21; De Croon et al., 2005). Internal consistency was .86 and .92 for the SCARED and .54 and .77 for the HADS-A at baseline and follow-up, respectively (Cronbach's α).
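For readers unfamiliar with the internal consistency statistic reported here, Cronbach's α can be computed from item-level responses as k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below uses the standard formula with sample variances; the function name and the example data are illustrative and not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items, same count per respondent")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total (sum) score
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses of four respondents to three 0-2 items
example = [[0, 1, 0], [1, 1, 2], [2, 2, 2], [0, 0, 1]]
alpha = cronbach_alpha(example)
```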

Symptoms of depression were assessed using the CDI (for 10–17 years) and the BDI-II (for 18–25 years) self-report symptom scales. The CDI has 27 items with three response categories (0–2; total score = 0–54; Timbremont et al., 2008). The BDI-II has 21 items with four response categories (0–3; total score = 0–63; Van der Does, 2002). Internal consistency was .70 and .77 for the CDI and .54 and .83 for the BDI-II at baseline and follow-up, respectively.

HRQOL was assessed with two self-report instruments: the IMPACT-III (10–20 years) and the Inflammatory Bowel Disease Questionnaire (IBDQ; 21–25 years). The IMPACT-III has 35 items, scored 1–5 (total score = 35–175; Otley et al., 2002). The IBDQ contains 32 items, scored 1–5 (total score = 32–160; De Boer, Wijker, Bartelsman, & de Haes, 1995). A higher score on either instrument indicates better quality of life. Internal consistency was .86 and .89 for the IMPACT-III and .71 and .92 for the IBDQ at baseline and follow-up, respectively.

Clinical disease activity was assessed with four validated clinical disease activity measures. For patients of 10–20 years with CD, the short Pediatric Crohn's Disease Activity Index (Kappelman et al., 2011) was used, whereas for patients with UC and IBD-U, the Pediatric Ulcerative Colitis Activity Index was used (Turner et al., 2007). For patients of age 21–25 years with CD, the Crohn's Disease Activity Index (Best, Becktel, Singleton, & Kern, 1976) was used, whereas for patients with UC and IBD-U, the partial Mayo score (Schroeder, Tremaine, & Ilstrup, 1987) was used. All indices were scored by the physician during the medical visit and provide four categories of clinical disease activity: remission, mild, moderate, and severe.

Social validity questions were included in the online questionnaire to gain insight into how the patients in the CBT group (and, if applicable, parents) evaluated the PASCET-PI. For this study, we chose to assess three relevant aspects of social validity: satisfaction, usefulness, and recommendation. Patients and/or parents awarded three items with 0–10 points (0 = "Not at all" to 10 = "Very much") regarding (1) their satisfaction with the protocol, (2) how useful it was for them, and (3) whether they would recommend it to other patients.


The interviewer (LS) and treating physicians were blinded to the result of randomization (they were not informed and had no access to files containing this information). Patients could not be blinded. They were explicitly asked not to discuss the group they were randomized into with their physician.

Statistical Analysis

Descriptive statistics were computed for demographic and disease characteristics. Independent t-tests and chi-square tests were used to assess differences in these variables between the two groups at baseline. The analyses followed the intention-to-treat principle.

For each participant, we calculated a Reliable Change Index (RCI; Jacobson & Truax, 1991) value for anxiety and depression (but not for HRQOL, as no data on test–retest reliability, which are necessary to calculate the RCI, were available for the HRQOL instruments). By calculating RCIs, we were able to combine all participants in one analysis, regardless of which age-appropriate instrument they had completed. The RCI is calculated using the standard error of measurement of the pretest and the test–retest reliability of the instrument. The resulting RCI variable can take three values: reliably improved, no reliable change, and reliably deteriorated. See Appendix 2 for the details of calculating the RCI variable. A chi-square test was used to compare the RCI values between the two groups, using complete cases (n = 68). For exploratory analyses, we first used six linear mixed models (which take into account missing data) to compare change between the groups from baseline to directly after CBT for anxiety (SCARED or HADS-A), depression (CDI or BDI-II), and HRQOL (IMPACT-III or IBDQ). Time, group (PASCET-PI vs. CAU), and the interaction between time and group were included as fixed factors. We repeated these linear mixed models in subgroups to examine the influence of gender and disease type. The influence of age is incorporated in the first set of linear mixed models, as the questionnaires for the specific age-group were used. Using an identity covariance structure, random intercepts were estimated for each participant. No random slopes could be specified, because we only had two time points. Restricted maximum likelihood was applied as the estimation method. A p-value of <.05 was considered statistically significant. Reported Cohen's ds represent the effect size between groups at follow-up. For the SCARED, HADS-A, CDI, and BDI-II, a negative effect size is in favor of CBT; for the IMPACT-III and IBDQ, a positive effect size is in favor of CBT. Data were analyzed using SPSS version 21.
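As a rough illustration of the RCI, the sketch below follows the general Jacobson and Truax (1991) formulation: the change score is divided by the standard error of the difference, which is derived from the pretest standard deviation and the test–retest reliability. The study's exact computation is given in its Appendix 2, which is not reproduced here, so the ±1.96 threshold, the function names, and the example numbers are assumptions.

```python
import math


def reliable_change_index(pre, post, sd_pre, retest_r):
    """RCI = (post - pre) / S_diff, following Jacobson & Truax (1991)."""
    if not 0 <= retest_r < 1:
        raise ValueError("test-retest reliability must be in [0, 1)")
    sem = sd_pre * math.sqrt(1.0 - retest_r)  # standard error of measurement
    s_diff = math.sqrt(2.0 * sem ** 2)        # standard error of the difference
    return (post - pre) / s_diff


def classify_rci(rci, higher_is_worse=True, threshold=1.96):
    """Map an RCI to one of three categories (threshold is an assumption)."""
    if abs(rci) <= threshold:
        return "no reliable change"
    worsened = rci > 0 if higher_is_worse else rci < 0
    return "reliably deteriorated" if worsened else "reliably improved"


# Hypothetical symptom scores: pretest 30, posttest 15, SD 10, reliability .84
category = classify_rci(reliable_change_index(30, 15, sd_pre=10, retest_r=0.84))
```

Because the category depends only on standardized change, participants assessed with different instruments (for example, SCARED versus HADS-A) can be pooled in a single chi-square test, which is the rationale the text gives for using the RCI.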

Sample Size and Power

Considering literature regarding effectiveness of CBT for anxiety and depressive symptoms in youth without a somatic disease, as well as earlier studies of CBT in youth with IBD (Szigethy et al., 2014), we expected medium to large effects on anxiety symptoms (Reynolds, Wilson, Austin, & Hooper, 2012) and medium effects for depressive symptoms (Weisz, McCarty, & Valeri, 2006). This corresponds to φ > 0.40 for anxiety symptoms and to φ > 0.30 for depressive symptoms. For the chi-square tests for anxiety and depressive symptoms, this means that a total of 70 patients provides us with sufficient power for the anxiety outcomes (>85%) and with medium power for the depression outcomes (>60%).
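The stated power figures can be approximated with a short calculation. The sketch below assumes a chi-square test with 2 degrees of freedom (two groups by three RCI categories), α = .05, and an effect size w equal to the stated φ; none of these settings is spelled out in the text, so they are assumptions. Under the alternative hypothesis the test statistic follows a noncentral chi-square distribution with noncentrality N × w², which for even degrees of freedom can be evaluated with the standard library alone.

```python
import math


def central_chi2_cdf_even_df(x, df):
    """Closed-form CDF of a central chi-square distribution with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        if i > 0:
            term *= half / i      # builds (x/2)^i / i!
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_power_df2(w, n, alpha=0.05, max_terms=200):
    """Approximate power of a df = 2 chi-square test for effect size w."""
    df = 2
    crit = -2.0 * math.log(alpha)        # exact critical value when df = 2
    half_lambda = n * w ** 2 / 2.0       # half the noncentrality parameter
    weight = math.exp(-half_lambda)      # Poisson weight for j = 0
    cdf = 0.0
    for j in range(max_terms):
        if j > 0:
            weight *= half_lambda / j
        # Noncentral CDF = Poisson mixture of central CDFs with df + 2j
        cdf += weight * central_chi2_cdf_even_df(crit, df + 2 * j)
    return 1.0 - cdf


power_anxiety = chi2_power_df2(0.40, 70)
power_depression = chi2_power_df2(0.30, 70)
```

Under these assumptions the two values come out slightly above .85 and .60, respectively, in line with the figures quoted above.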