Measuring Depression in the Elderly: Which Scale is Best?

Holroyd, MD, Anita H. Clayton, MD

Overview of Currently Used Depression Scales

Hamilton Rating Scale for Depression

The HAM-D was developed as a measure of treatment outcome rather than as a screening or diagnostic tool for depression.[8] Although it was not designed to diagnose depression, it is commonly used as a screening scale, particularly in clinical trials, to identify participants with depressive disorders. The HAM-D is a 21-item rating scale used to systematize clinical observations of features related to depression. Ten items are ranked on a scale from 0 to 4; 9 items are ranked 0 to 2; and 2 items are ranked 0 to 3. Typically, a break score of 18-20 is used to differentiate persons with probable depressive disorder.[9,10] The HAM-D is completed by a trained observer after a 30-minute clinical interview that assesses symptoms of depression. Although the HAM-D is the most widely used observer scale for the assessment of depression,[11] it has not been well validated in the geriatric population.[12] Problems noted with the HAM-D include a heterogeneous factor analytic structure; an emphasis on behavioral symptoms and somatic complaints that neglects self-reported feelings of distress; and an intermingling of frequency and intensity of symptoms in scoring.[11]

Zung Self-Rating Depression Scale

The SDS was initially developed as a self-rating scale.[13] It consists of 20 items and has been used widely in epidemiological studies. One criticism of the SDS is that it uses graded responses (ie, never, sometimes, usually, always) that may be confusing to elderly patients; thus, they may require some assistance from the examiner or others to complete the form.[14] Another problem with the test is that the mean score for elders is significantly higher than that for younger subjects, with many normal elders assessed as false-positives.[15] For example, Zung suggested a classification cutoff score of 40 for depression, which would lead to a sensitivity of 88% but a false-positive rate of 44%.[16] Further, the SDS often misses depression in the elderly if depression takes the form of multiple somatic complaints.[17] Because of these problems, several authors have suggested that the SDS not be used for either research or clinical assessment of geriatric depression.[18,19] With a cutoff of 60, the SDS has shown a sensitivity of 58% to 76% and a specificity of 82% to 86%.[20,21] Despite concerns over the use of the SDS in the elderly, it continues to be used in research, especially in Europe,[22] where it has been noted to reveal sex and age differences in the scale's factor structure in the elderly population.[23] A short form (12 items) of the SDS has been developed, but there has been little validation of this test in depressed elderly patients.[24,25]
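To make the practical weight of an 88% sensitivity paired with a 44% false-positive rate more concrete, the following minimal Python sketch converts those two figures into a positive predictive value using Bayes' rule. The 15% prevalence is a purely illustrative assumption and does not come from the studies cited; the function and constant names are likewise hypothetical.

```python
def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """Return the share of positive screens that are true cases (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures quoted in the text for an SDS cutoff score of 40.
SDS_SENSITIVITY = 0.88
SDS_FALSE_POSITIVE_RATE = 0.44  # equivalent to a specificity of 1 - 0.44 = 0.56

# ASSUMPTION: hypothetical prevalence of depression, chosen only for illustration.
ASSUMED_PREVALENCE = 0.15

ppv = positive_predictive_value(
    SDS_SENSITIVITY, SDS_FALSE_POSITIVE_RATE, ASSUMED_PREVALENCE
)
print(f"PPV at {ASSUMED_PREVALENCE:.0%} prevalence: {ppv:.1%}")
```

Under that assumed prevalence, only about one in four positive screens would be a true case, which illustrates why a high false-positive rate is a concern in a screening instrument.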

Montgomery-Asberg Depression Rating Scale

The MADRS is particularly sensitive to change in symptoms with treatment over time.[26] It is an observer-rated scale based on a clinical interview that moves from broad questions to more detailed ones. There are 10 questions, each with 6 possible ratings, that cover core symptoms of depression, such as sadness; sleep difficulties; changes in appetite and concentration; and pessimistic and suicidal thoughts. It does not assess somatic symptoms, which may be important in the elderly population. Although the MADRS has been well validated and compared against other rating scales in younger populations,[11,27] it has not been validated sufficiently in the geriatric population.[28,29]

Geriatric Depression Scale

The GDS was developed with the recognition that depressive symptoms in elderly patients require an instrument designed to discriminate the pattern of depressive symptoms from the general characteristics of the elderly population.[14,30] It originally contained 100 items, but this number was condensed to 30 questions that indicate the presence of depression. The scale was designed as a self-administered test, although it has been used in observer-administered formats as well. One advantage of the test is the "yes/no" question format, which may be more acceptable in the elderly population. It was initially validated among patients hospitalized for depression and among normal elderly living in the community without complaints of depression or history of psychiatric illness.[30] A cutoff score of 11 on the GDS yields an 84% sensitivity rate and a 95% specificity rate, whereas a cutoff score of 14 yields a slightly lower sensitivity rate of 80% but a 100% specificity rate.[30] Thus, it has been suggested that scores of 0-10 be viewed as in the normal range and scores of 11 or more as a possible indicator of depression.[14] During the development of the GDS, it was noted that vegetative symptoms failed to differentiate depressed and nondepressed elders; thus, these symptoms are largely not assessed by the GDS.[14]
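The suggested interpretation of the 30-item GDS can be sketched in a few lines of Python. This is an illustration only, not a clinical tool: it assumes the total score is the count of items answered in the depressive direction (0 to 30), and the function name and the returned labels are hypothetical.

```python
GDS_ITEM_COUNT = 30
GDS_DEFAULT_CUTOFF = 11  # scores of 11 or more suggest possible depression


def interpret_gds(score, cutoff=GDS_DEFAULT_CUTOFF):
    """Classify a 30-item GDS total against a cutoff score.

    ASSUMPTION: `score` is the number of items answered in the
    depressive direction, so it must lie between 0 and 30.
    """
    if not 0 <= score <= GDS_ITEM_COUNT:
        raise ValueError(f"GDS score must be between 0 and {GDS_ITEM_COUNT}")
    return "possible depression" if score >= cutoff else "normal range"


# A score of 12 is flagged at the cutoff of 11 but not at the
# stricter, more specific cutoff of 14.
print(interpret_gds(12))
print(interpret_gds(12, cutoff=14))
```

Raising the cutoff from 11 to 14 in this sketch mirrors the trade-off described above: fewer people are flagged, which corresponds to the reported gain in specificity at the cost of some sensitivity.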

Unlike the other instruments discussed, the GDS has been well studied in various geriatric populations. It has been found to be a valid measure of depression in elderly medical inpatients.[31,32] For example, in a sample of 128 men aged 70 years and older, a GDS cutoff of 11 had a sensitivity of 92% and a specificity of 89%.[33] It has also been validated in geriatric medical outpatients[34] and in day-treatment patients.[35]

Among nursing home patients, the validity of the GDS appears to be dependent on the degree of cognitive impairment.[36] Earlier studies done in nursing homes and other long-term care facilities claimed validity of the GDS in this population.[37,38,39] However, these studies were methodologically flawed in that they did not include all residents in these facilities, excluding cognitively impaired patients. For example, Lescher[38] excluded 50% of potential subjects from his nursing home study because of cognitive impairment. Hickie and Snowdon[39] also specifically excluded patients with any evidence of dementia or delirium. Therefore, these studies were validating the GDS in institutionalized elderly without cognitive impairment. A study by Parmelee and colleagues[37] did not specify details of cognitive impairment and, notably, only 51% of the 806 participants were able to complete all 30 items. Inability to complete the GDS was said to correlate with cognitive impairment. However, a study of institutionalized patients by Kafonek and colleagues[40] revealed that the GDS with a cutoff of 13 was only 47% sensitive and 75% specific in screening for depression. The GDS was not felt to be a useful screen for depression in this population. In this study, patients had low cognitive scores, and 9 of 37 cognitively impaired subjects had difficulty responding "yes" or "no" to many questions. One study showed a mean Mini-Mental State Examination (MMSE)[41] score of only 4.8 out of 30 among this group. The authors suggested that the GDS may be an adequate screening tool in mildly demented subjects but not in moderately to severely demented subjects. Support for this position came from a study of nursing home residents[42] that described a 2-step procedure: first selecting subjects with MMSE scores greater than or equal to 15, and then giving the GDS. This procedure significantly improved the ability of the GDS to detect depression in nursing home residents.
In this study, a cutoff score of 10 or greater was used on the GDS to indicate depression, and when participants of all cognitive levels were included (n = 66), the GDS had a sensitivity of 63% and a specificity of 83%. When only those with an MMSE score greater than or equal to 15 were included (n = 44), the sensitivity and specificity improved to 84% and 91%, respectively.[42] Studies of the GDS in geriatric outpatients with cognitive impairment have also shown it to be an accurate screening test in cognitively intact populations; however, the GDS does not maintain its validity in populations that contain large numbers of cognitively impaired patients.[43,44] In one study, the GDS maintained validity in cognitively impaired patients (MMSE score, 17.1).[44]
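The 2-step procedure can be sketched as follows in Python, using the thresholds reported for that study (MMSE of 15 or more, then GDS of 10 or more). The function name is hypothetical, and returning `None` for patients below the MMSE threshold is an assumption made here for illustration; the text does not say how such patients should be assessed.

```python
from typing import Optional

MMSE_MINIMUM = 15  # step 1: only patients scoring at least this receive the GDS
GDS_CUTOFF = 10    # step 2: a GDS score of 10 or greater indicates depression


def two_step_screen(mmse_score: int, gds_score: int) -> Optional[bool]:
    """Apply the MMSE gate, then the GDS cutoff.

    Returns True or False for a positive or negative GDS screen.
    ASSUMPTION: returns None when the MMSE score is below the threshold,
    meaning the GDS result is not interpreted at all.
    """
    if mmse_score < MMSE_MINIMUM:
        return None
    return gds_score >= GDS_CUTOFF


print(two_step_screen(mmse_score=22, gds_score=12))  # screened, positive
print(two_step_screen(mmse_score=22, gds_score=6))   # screened, negative
print(two_step_screen(mmse_score=9, gds_score=12))   # GDS not interpreted
```

Keeping the "not interpreted" outcome distinct from a negative screen reflects the point made above: a low GDS score in a severely impaired patient is not evidence that depression is absent.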

The GDS is available in several languages,[45] and it has been found to maintain its reliability and validity when administered by telephone,[46] which may be useful in a variety of epidemiological and clinical settings. A collateral source version of the GDS has been developed, although not extensively tested, which may prove useful as a screening instrument in those with aphasia, other communication deficits, or cognitive impairment.[47]

GDS short form. A short form of the GDS (15 items) has also been developed.[48] The short form takes an average of 5-7 minutes to complete and is composed of the 15 items from the original GDS that had the highest correlation with depressive symptoms. The long form of the GDS and the short form are highly correlated (r = 0.84, P < .001).

The GDS short form has been validated in a geriatric affective disorder outpatient clinic (N = 116; average age, 75.7 years). Using an optimal cutoff score of 5-6, the short-form GDS showed a sensitivity of 85% and a specificity of 74%.[49] In a comparison of the short form and the original long form in a sample of psychiatric inpatients, the two were highly correlated (r = 0.84).[50] The authors determined that, overall, the short form was an adequate substitute for the long form.[50]

The GDS short form has also been assessed for tolerability in outpatient family practice settings, as were 10-, 4-, and 1-item versions.[51] In this study by D'Ath and colleagues,[51] the GDS-4 had lower internal consistency than the GDS-15 but missed only 5 of 46 depressed patients in the sample. It was felt to be useful as a minimal screening procedure for detecting depression in elderly primary care patients, especially among practitioners who feel that the 15-item GDS is too long. There has not been further validation of these shorter scales in other studies.

GDS as a measurement of change or improvement in depression. There have been fewer studies designed to assess the GDS as a tool for measuring change or improvement in depression.[48,52] A small study of 30 community-dwelling elderly aged 60 years and older compared the GDS with the Beck Depression Inventory (BDI) short form and found the GDS to be as sensitive as the BDI in measuring changes in depression over time when using the HAM-D as an index. However, given the small sample size, further research is needed to demonstrate the sensitivity of the GDS to changes in depression.[52]

GDS vs HAM-D. Unfortunately, there have been few studies comparing the GDS with the most widely used observer scale for the assessment of depression, the HAM-D. One study compared the GDS with the HAM-D in 30 patients with dementia and used a psychiatric assessment of depression for comparison.[12] This study demonstrated that the GDS was superior to the HAM-D for detecting depression.[12] It was felt that because many of the patients had moderate dementia, the GDS, which requires simple "yes" or "no" answers, may be superior to the HAM-D, which (although it is an observer-rated scale) places the responsibility on the patient to come up with certain affect labels and feeling statements. It was also felt that manifestations of dementia, such as affective blunting, may make it difficult to use the HAM-D in this population. Further, the GDS elicits present-state answers, rather than requiring subjects to rely on their memory, as the HAM-D does (for example, "How has your mood been in the past two weeks?").

A study by Brink and colleagues that compared the GDS with the HAM-D found that a cutoff score of 11 on the HAM-D yielded sensitivity and specificity rates (86% and 80%, respectively) that were similar to those for the GDS with a cutoff score of 11 (sensitivity 84%, specificity 95%).[14] A small study of 14 patients (mean age, 66 years) with a diagnosis of generalized anxiety disorder showed that the GDS was much more sensitive in eliciting depressive symptoms than the HAM-D.[9]

Depression Scales for Patients With Dementia

Certain scales have been developed for use in demented populations; these use outside informants (caregivers, nursing home staff) to provide history and reliable symptom reporting. As noted above, a collateral source form of the GDS has been developed for use in the cognitively impaired, although it has not been validated in a demented population. Studies have suggested that information gathered from outside sources reveals more depressive symptoms than dementia patients themselves report.[53] One study that instructed caregivers to fill out traditional depression scales showed that caregivers are reliable surrogate reporters of depressive symptoms in patients with Alzheimer's disease.[54] The best validated scale for dementia patients is the Cornell Scale for Depression in Dementia (CSDD).[55] The CSDD is an interviewer-administered scale that uses information from both the patient and an outside informant. The scale has correlated well with depression as classified by the Research Diagnostic Criteria.[55] Factor structure analysis reveals 4 to 5 factors that are assessed by the CSDD, including general depression, biologic rhythm disturbances, agitation/psychosis, and negative symptoms.[56] However, even the CSDD has been better validated in patients with mild to moderate dementia than in patients with severe dementia.[57,58] The CSDD has been used in aphasic patients and compared with Research Diagnostic Criteria.[59]

