Measuring Depression in the Elderly: Which Scale is Best?

Holroyd, MD, Anita H. Clayton, MD

Assessing the Depression Rating Scales

Validity and Reliability

The validity and reliability of the evaluation instrument are central to selecting the appropriate method of assessment. Validity is the degree to which a scale truly measures the symptom it is intended to assess, whereas reliability is the degree to which its measurements are consistent and reproducible.

Face or content validity is checked by reviewing the questions on the instrument to determine whether they actually assess the characteristics that the instrument was designed to assess. Validity may also be established through "convergent validity" -- the correlation between scores on 2 different instruments that are designed to measure the same thing.
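As a rough illustration of convergent validity, the sketch below computes a Pearson correlation between scores from 2 instruments given to the same patients. The article does not say which correlation statistic is intended, so Pearson's r is an assumption here, and the scale names and scores are hypothetical values invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one scale never varies")
    return cov / sqrt(var_x * var_y)


# Hypothetical total scores for the same 6 patients on two depression scales.
scale_a = [2, 5, 9, 12, 4, 7]
scale_b = [4, 9, 17, 25, 8, 13]

r = pearson_r(scale_a, scale_b)
print(f"Correlation between the two scales: {r:.2f}")
```

A correlation close to 1 would suggest that the 2 instruments rank patients similarly; how high the value must be to count as adequate convergent validity is a judgment the article does not specify.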

Sensitivity and specificity. Validity studies must also include information on sensitivity and specificity. Sensitivity is the extent to which patients who truly have a given symptom or symptoms are correctly classified as having them. Specificity is the extent to which patients without the symptom or symptoms are correctly classified as not having them. Thus, rating scales must be tested in a population that contains both members who have and members who do not have the symptom or condition to be assessed. Specificity and sensitivity results will depend, in part, on which sample population is being tested and on the way in which the screen is administered. For example, population characteristics, such as age, level of education, or cultural background, may affect the results of a test.[6]
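The definitions above can be made concrete with a short calculation. This is a minimal sketch, not a procedure from the article: the function name, the cutoff of 10, and the patient data are all hypothetical, and the reference diagnosis is assumed to come from some independent clinical assessment.

```python
def sensitivity_specificity(has_condition, screen_positive):
    """Return (sensitivity, specificity) from paired true/false lists.

    has_condition[i]   -- True if patient i truly has the condition
    screen_positive[i] -- True if the scale classified patient i as a case
    """
    if len(has_condition) != len(screen_positive):
        raise ValueError("the two lists must have the same length")
    tp = fn = tn = fp = 0
    for truth, flagged in zip(has_condition, screen_positive):
        if truth and flagged:
            tp += 1  # true positive: has the condition, detected
        elif truth and not flagged:
            fn += 1  # false negative: has the condition, missed
        elif not truth and not flagged:
            tn += 1  # true negative: no condition, correctly cleared
        else:
            fp += 1  # false positive: no condition, flagged anyway
    if tp + fn == 0 or tn + fp == 0:
        # Mirrors the text: the sample must include people with and
        # without the condition, or one of the two rates is undefined.
        raise ValueError("sample needs members with and without the condition")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical sample: reference diagnosis and scale scores for 10 patients.
has_condition = [True, True, True, True, False, False, False, False, False, False]
scores = [14, 11, 6, 12, 3, 5, 11, 2, 4, 1]
CUTOFF = 10  # assumed cutoff for illustration only

screen_positive = [score >= CUTOFF for score in scores]
sens, spec = sensitivity_specificity(has_condition, screen_positive)
print(f"Sensitivity: {sens:.2f}, specificity: {spec:.2f}")
```

Rerunning the same calculation with a different cutoff or a different sample will generally change both values, which is one way to see why the results depend on the population tested.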

The screening instrument must be reliable so that the clinician can trust that observed differences in measurement reflect the patient's condition rather than the unreliability of the scale. As a rule, instruments used to make decisions about individual patients, whether for case detection or for assessing change during the course of treatment, require higher reliability scores, around the 0.90 level, whereas group comparisons of outcomes (eg, after a medication treatment trial) require reliability scores only in the 0.50-0.70 range.[7]
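The article does not name the reliability coefficient behind these figures. As one common example, and purely as an assumption, the sketch below computes Cronbach's alpha (a measure of internal consistency) for a small set of hypothetical item responses and compares it with the levels quoted above. Treating those levels as hard cutoffs is a simplification made for the example.

```python
from statistics import variance

# Levels quoted in the text; using them as strict cutoffs is an assumption.
INDIVIDUAL_DECISION_LEVEL = 0.90
GROUP_COMPARISON_LEVEL = 0.50


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if len(responses) < 2 or n_items < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores never vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents, 4 items scored 0-3.
responses = [
    [0, 1, 0, 1],
    [1, 1, 1, 2],
    [2, 2, 1, 2],
    [3, 2, 3, 3],
    [3, 3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
print("Meets individual-decision level:", alpha >= INDIVIDUAL_DECISION_LEVEL)
print("Meets group-comparison level:", alpha >= GROUP_COMPARISON_LEVEL)
```

Other coefficients, such as test-retest or interrater reliability, would be computed differently, and a sample this small is far too limited to support any real conclusion about a scale.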

Clinical Setting for Test Administration

The clinical setting in which the instrument will be administered is important because time constraints may limit the choice of instruments. In addition, the clinician should determine whether a patient self-report scale, an observer scale, or an interviewer-administered scale is most appropriate.

Interviewer-administered scales. Although interviewer-administered scales ensure completeness of reporting -- which may be important in elderly persons as well as in clinical research trials -- they may introduce interviewer bias that could influence the results. Thus, interviewer-administered questionnaires should be used only by those trained to administer the instrument, to reduce personal bias in the phrasing or interpretation of questions.

Self-reporting scales. Self-reporting scales are the most cost-effective in terms of staff time. However, some frail elderly persons may need help from family members to complete such questionnaires, and help received from a third party may result in data that do not truly reflect the opinions of the elderly person. In addition, persons who are poorly motivated or depressed may tend not to comply with the assessment procedure. Finally, shame and fear of stigma may lead to underreporting of symptoms on self-administered scales. Some scales are designed to be filled out by informants, such as family or friends, so that patients who are unable to fill out questionnaires (eg, cognitively impaired persons) can be assessed.

