Effect of Cognitively Stimulating Activities on Symptom Management of Delirium Superimposed on Dementia

A Randomized Controlled Trial

Ann Kolanowski, PhD; Donna Fick, PhD; Mark Litaker, PhD; Paula Mulhall, RN; Linda Clare, PhD; Nikki Hill, PhD; Jacqueline Mogle, PhD; Malaz Boustani, MD; David Gill, MD; Andrea Yevchak-Sillner, PhD


J Am Geriatr Soc. 2016;64(12):2424-2432. 



Individuals admitted to PAC settings after a hospitalization between February 2011 and October 2014 were enrolled in this single-blind clinical trial (N = 283). Follow-up was completed in January 2015. Written consent was obtained before enrollment. The Pennsylvania State University institutional review board approved the study (ClinicalTrials.gov identifier: NCT01267682), and a data safety and monitoring committee convened annually.


Participants were recruited at admission to one of eight PAC settings in Pennsylvania. Trained research nurses screened interested individuals for eligibility within 72 hours of admission.

Eligible participants were aged 65 and older, were community dwelling before hospitalization, had a knowledgeable informant, and had mild to moderate dementia and full or subsyndromal delirium. Individuals with full or subsyndromal delirium were enrolled because of the poor outcomes observed in both.[19] The presence of dementia was based on a score of 3 or greater on the Modified Blessed Dementia Rating Scale[20] and a Clinical Dementia Rating (CDR)[21] score from 0.5 to 2.0. The presence of delirium was based on the presence of two or more positive features according to the Confusion Assessment Method (CAM).[22] A panel of three experts adjudicated all dementia and delirium diagnoses (behavioral neurologist (DG), neuropsychologist (LC), geriatrician (MB)).

Exclusion criteria included having any neurological or neurosurgical disease associated with cognitive impairment, including Parkinson's disease with Lewy bodies, acute stroke, Huntington's disease, normal-pressure hydrocephalus, seizure disorder, subdural hematoma, head trauma, or known structural brain abnormalities; being nonverbal; life expectancy of 6 months or less; acute major depression or psychosis; and severe hearing or vision impairment.

Participants were randomly assigned to cognitive stimulation (intervention) or usual care (control). Randomization was concealed until after enrollment and was conducted in SAS version 9.3 (SAS Institute, Inc., Cary, NC) using randomly permuted blocks of two, four, and six participants to ensure approximately balanced intervention group sizes over the study and to control for possible temporal effects. Trained assessors, blind to randomization, measured all outcomes. Blinding was maintained by keeping assessment and intervention teams separate in the clinical area and during research team meetings. At the completion of the study, assessors were asked whether there were any instances when blinding was broken; none were reported.
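Permuted-block randomization with varying block sizes can be illustrated with a short sketch. The trial generated its sequence in SAS; the Python below is not the study's code. The function name, the seed, and the choice to pick each block size with equal probability are assumptions made for illustration.

```python
import random

def permuted_block_assignments(n_participants, block_sizes=(2, 4, 6), seed=0):
    """Return a list of arm labels built from randomly permuted blocks.

    Hypothetical helper: each block holds equal numbers of both arms,
    and the block size is drawn at random so the sequence is harder to
    predict from earlier assignments.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    assignments = []
    while len(assignments) < n_participants:
        size = rng.choice(block_sizes)  # assumed: sizes equally likely
        block = ["intervention"] * (size // 2) + ["control"] * (size // 2)
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    # The last block may be cut short when enrollment stops.
    return assignments[:n_participants]

sequence = permuted_block_assignments(283, seed=1)
print(sequence.count("intervention"), sequence.count("control"))
```

Because every completed block is balanced, the two arms can differ by at most half of the largest block size (three participants here) at any point in enrollment.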


The study had three phases: baseline, intervention, and follow-up. The research nurse (PM) conducted baseline assessments at enrollment. Information was collected on demographic characteristics; mental status (Mini-Mental State Examination, a 30-item cognitive screen[23]); medical diagnoses; number of prescribed medications, including those with anticholinergic properties identified using the Anticholinergic Cognitive Burden Scale[24] and from the medical chart; apolipoprotein E genotype, determined from deoxyribonucleic acid extracted from buccal swabs[25]; and, for the intervention group, activity preferences. The intervention period began within 24 hours of baseline and continued for 30 days or until discharge. Daily assessments of delirium, cognitive function, and physical function were completed in both groups, and daily intervention sessions were conducted with the intervention group. Two separate teams of trained research assistants were used; one team delivered the intervention, and the other team conducted the outcome assessments. The research nurse followed up with the participant's legally authorized representative over the telephone 3 months after admission.


The protocol for the intervention has been published and includes a training video illustrating implementation.[26] The goal of the intervention was to elicit active engagement in simple activities that provide cognitive stimulation in a nonregimented way and promote processing supportive of function in the domains of attention, memory, orientation, and executive function. The principal investigator (AK) prescribed the activities in consultation with the neuropsychologist (LC) and the research nurse (PM). Fifteen activities of increasing difficulty were individually selected for each participant based on assessments of their leisure interests, physical function, and mental status. For example, if the participant was an older man with mild cognitive impairments and no visual impairments but some difficulty hearing and was a former high school chemistry teacher who enjoyed woodworking and movies, the following activities might be prescribed and delivered using a hearing amplifier: name three gases and three metals from the periodic table, copy a block design, and identify famous faces. Interests and abilities were matched to provide intrinsic motivation for engagement and to capture attention,[27] the most prominent domain that delirium affects. Activities were selected from a large database of activities previously tested in older adults with dementia.[28] The advantage of activities such as word searches and puzzles is that they offer stimulation in multiple cognitive domains and unobtrusively provide the opportunity for cognitive processing. Trained research assistants delivered the intervention in individual sessions of up to 30 minutes each day, 5 days per week, for 30 days or until discharge. The dosage was based on studies that have demonstrated the efficacy of daily 20-minute recreational therapy for the behavioral symptoms of dementia[27] and pilot work with individuals with DSD.[17]

Before the activity session, the interventionists corrected any potentially confounding conditions (e.g., poor lighting, noise). During the session, they used principles to maximize cognitive processing: active participation, oral encouragement, variability in tasks, and increase in level of difficulty as success occurred with simpler tasks.[29] At the completion of the session, activities attempted, time on task, and level of participation were recorded.

Trained research assistants (never members of the assessment team) conducted treatment fidelity checks on 10% of all intervention sessions. These research assistants unobtrusively observed the assigned interventionist during the delivery of the intervention and rated adherence to critical aspects of treatment delivery, such as whether the interventionist used the correct activities for the participant, addressed extraneous or environmental factors that might have influenced delivery of the intervention (noise, poor lighting), and used the system of least-restrictive prompts to engage the participant. The project director (PM) and principal investigators (AK, DMF) conducted booster sessions for interventionists every 3 months.

Because management of DSD varies according to practitioner, weekly medical chart reviews were conducted to characterize usual care: therapies attended, number of medications received, and number of documented nursing interventions delivered for delirium or confusion.


Interrater reliability was checked on 10% of the outcome measures. The project director (PM) and principal investigators (AK, DMF) conducted assessor booster sessions every 3 months. The primary outcomes were delirium duration and severity; secondary outcomes were cognitive and physical function.

Delirium duration was measured using the CAM,[22] which has been validated in individuals with dementia and has a sensitivity of 94% to 100% and a specificity of 90% to 95%.[22] A weighted kappa of 0.88 was obtained for interrater reliability. The CAM includes the four features of acute and fluctuating course, inattention, disorganized thinking, and impaired level of consciousness. In this study, the presence of two or more features indicated subsyndromal delirium, and the presence of Features 1 and 2 plus 3 or 4 indicated full delirium. Delirium duration was assessed in two ways: time to first delirium remission (number of days until reaching two consecutive days with a CAM score of 0 or 1)[30] and percentage of total days delirium free (CAM score of 0 or 1).
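The classification rule and the two duration measures described above can be sketched in code. This is an illustrative reading, not the study's scoring software: the function names are hypothetical, the daily CAM score is assumed to be the count of positive features (0 to 4), and time to remission is assumed to be counted up to the second of the two consecutive low-score days.

```python
def classify_cam(acute_fluctuating, inattention, disorganized, altered_consciousness):
    """Classify one CAM assessment as 'full', 'subsyndromal', or 'none'."""
    features = [acute_fluctuating, inattention, disorganized, altered_consciousness]
    # Full delirium: Features 1 and 2, plus Feature 3 or 4.
    if acute_fluctuating and inattention and (disorganized or altered_consciousness):
        return "full"
    # Subsyndromal delirium: any two or more positive features.
    if sum(bool(f) for f in features) >= 2:
        return "subsyndromal"
    return "none"

def days_to_first_remission(daily_scores):
    """Day (1-based) on which two consecutive days with a score of 0 or 1 are reached.

    Returns None if remission is never observed (a censored observation).
    """
    for day in range(1, len(daily_scores)):
        if daily_scores[day - 1] <= 1 and daily_scores[day] <= 1:
            return day + 1  # assumed convention: count through the second day
    return None

def percent_delirium_free(daily_scores):
    """Percentage of assessed days with a CAM score of 0 or 1."""
    free_days = sum(1 for score in daily_scores if score <= 1)
    return 100.0 * free_days / len(daily_scores)

scores = [3, 2, 1, 2, 1, 0, 2]  # made-up daily feature counts for one participant
print(days_to_first_remission(scores), round(percent_delirium_free(scores), 1))
```

In the made-up series, the single low score on day 3 does not count as remission because day 4 rises again; remission is reached only when days 5 and 6 are both low.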

Delirium severity was measured using the Delirium Rating Scale (DRS),[31] a 13-item clinician-rated scale validated in individuals with delirium and those with dementia[32] that has good sensitivity and specificity and high interrater reliability (intraclass correlation coefficient (ICC) = 0.97). In this study an ICC of 0.72 was obtained for interrater reliability. DRS scores range from 0 to 39, with higher scores indicating greater severity.

Cognitive function was measured using three instruments. Attention was measured using Digit Span Forward (range 0–16). Memory (range 0–3) and orientation (range 0–7) were measured using the corresponding items from the Montreal Cognitive Assessment (MoCA).[33] A weighted kappa of 0.95 was obtained for the attention measure, 0.96 for memory, and 0.97 for orientation. Executive function and constructional praxis were measured using the CLOX,[34] a clock drawing task that indicates impairment in executive function (CLOX 1) and discriminates it from nonexecutive constructional failure (CLOX 2). A weighted kappa of 0.92 was obtained for the CLOX 1 and 0.93 for the CLOX 2.

Physical function was measured using the Barthel Index,[35] an ordinal scale for assessing activities of daily living in individuals undergoing inpatient rehabilitation; scores range from 0 (totally dependent) to 100 (fully independent). An ICC of 0.87 was obtained for interrater reliability.

Other outcomes were length of stay and discharge disposition at 3-month follow-up (community, nursing home, death). The study was not powered to detect differences in these outcomes.

Statistical Analysis

Power to detect a difference in mean levels of severity and duration of delirium between the groups was estimated a priori, assuming a total sample size of 256 participants after attrition and adjusting for cluster effects due to multiple observations being made on the same subjects.

Assuming 30 observations per subject and an ICC of 0.25, this sample size would provide greater than 99% power to detect a medium effect size, or 0.5 times the within-group standard deviation (SD). The difference detectable with 80% power would be 0.185 SDs.
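The article does not give the formula behind these figures. One common approximation, shown here as a sketch, deflates the total number of observations by a design effect of 1 + (m - 1) × ICC and then applies the usual two-sample normal approximation. The two-sided alpha of 0.05, the equal split of 128 participants per group, and the function names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

def design_effect(obs_per_subject, icc):
    """Variance inflation from correlated repeated observations."""
    return 1 + (obs_per_subject - 1) * icc

def detectable_effect_size(n_total=256, obs_per_subject=30, icc=0.25,
                           alpha=0.05, power=0.80):
    """Smallest standardized difference detectable at the given power.

    Assumes two equal groups and a two-sided test (alpha is not stated
    in the source and is an assumption here).
    """
    deff = design_effect(obs_per_subject, icc)
    # Effective number of independent observations in each group.
    n_eff_per_group = (n_total / 2) * obs_per_subject / deff
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_eff_per_group)

print(design_effect(30, 0.25))             # design effect for 30 observations
print(round(detectable_effect_size(), 3))  # standardized detectable difference
```

Under these assumptions the design effect is 8.25 and the detectable difference comes out close to the 0.185 SDs reported above, which suggests, but does not confirm, that a calculation of this kind was used.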

Within-group SDs and effect sizes were estimated from pilot data with up to 30 days of observation on 16 subjects. Based on these pilot data, the DRS had an effect size of 0.45 and the CAM of 0.47. It was determined that a sample size of 256 would provide at least 94% power to detect a difference in the DRS and 96% in the CAM.

The intention-to-treat principle was used for analysis, and the statistician (ML) was blind to group assignment until all analyses were complete. Descriptive statistics including frequencies, means, and standard deviations were calculated separately according to intervention group. Participant characteristics and other variables with a single observation per participant were compared between groups using analysis of variance (ANOVA) for continuous variables and chi-square analysis for categorical variables. Sample distributions were examined for all analysis variables. Variables showing substantial deviation from normality were rank-transformed for analysis.

Normality was evaluated using analysis of residuals. Length of stay was a single observation per subject (number of days participant received PAC) and was not normally distributed (intervention skew = 3.67, control skew = 2.7). To account for the variable type and nonnormal distribution, group comparison of length of stay was analyzed using a negative binomial mixed model because of the significant contribution of facility to these values (χ²(1) = 71.77, P < .001). For variables with multiple observations per subject, groups were compared using mixed linear models to account for correlations between repeated measurements made on the same individuals. Categorical dependent variables with multiple observations per participant were analyzed using generalized linear models to implement mixed-model logistic regression analysis. To evaluate the sensitivity of the results to this analytical choice, the primary analyses were repeated with the inclusion of a facility term as a blocking variable in the model. The results showed slight changes in P-values for treatment group comparisons, but no statistical significance was changed. The analyses were also adjusted for the baseline difference in CLOX 2; unadjusted and adjusted results are reported. Time to resolution was evaluated using the Kaplan-Meier product limit survival estimator. The log-rank test was used for between-group comparisons.
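The product-limit estimate used for time to resolution can be sketched in a few lines. This is a minimal illustration, not the study's analysis code; the function name and the example data are made up. Here the "event" is delirium remission, and a participant discharged before remission is treated as censored.

```python
def kaplan_meier(durations, events):
    """Return [(time, probability of not yet having had the event), ...].

    durations: days of observation for each participant
    events:    1 if remission was observed on that day, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose observation ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Product-limit step: multiply by the fraction without the event.
            surviving *= 1 - n_events / at_risk
            curve.append((t, surviving))
        at_risk -= n_leaving  # events and censored cases both leave the risk set
    return curve

# Made-up example: five participants, two censored (days 3 and 8).
for day, prob in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(day, round(prob, 2))
```

Censored participants contribute to the risk set up to their last day of observation and then drop out without lowering the curve, which is what lets the estimator use partial follow-up from those discharged early.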

Means and SDs and least-squares (marginal) means and standard errors are presented. SDs calculated for variables with more than one observation per individual include between- and within-subject variation.