DSM-5: Validity vs Reliability
This year's American Psychiatric Association (APA) annual meeting was probably the last before the publication of the Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5), scheduled for May of next year. Hence, there was a sense of tense uncertainty in the many sessions addressing potential DSM-5 revisions.
DSM-5 Task Force Vice Chair Darrel Regier headed a symposium reviewing results of field trials on the reliability of proposed DSM-5 criteria. The trials were meant to assess whether clinicians can use the proposed criteria consistently and provided kappa values for the individual proposals.
Kappa values reflect the agreement between 2 raters making the same rating, after correction for chance agreement. From a statistical perspective, kappa values greater than 0.5 are generally considered good. As an example, if the raters would agree 50% of the time by chance alone, 70% observed agreement translates to a kappa value of 0.4.
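To make the arithmetic concrete, here is a minimal sketch of the standard Cohen's kappa calculation. It assumes chance agreement of 50%, which is the value that makes 70% agreement correspond to a kappa of 0.4. The function names and the 20-patient ratings are hypothetical and invented purely for illustration. They are not taken from the field trials, which may have used a different kappa variant.

```python
from collections import Counter


def cohens_kappa(observed_agreement, chance_agreement):
    """Agreement beyond chance, scaled by the maximum possible agreement beyond chance."""
    if not 0.0 <= chance_agreement < 1.0:
        raise ValueError("chance_agreement must be in [0, 1)")
    return (observed_agreement - chance_agreement) / (1.0 - chance_agreement)


def kappa_from_ratings(rater_a, rater_b):
    """Compute Cohen's kappa from two equal-length lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability that both raters pick the same label
    # if each assigned labels independently at their own base rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    chance = sum((counts_a[label] / n) * (counts_b[label] / n) for label in counts_a)

    return cohens_kappa(observed, chance)


# Hypothetical data: 20 patients, each rater diagnoses half of them ("yes"),
# and the two raters agree on 14 of the 20 (70%).
rater_a = ["yes"] * 10 + ["no"] * 10
rater_b = ["yes"] * 7 + ["no"] * 3 + ["yes"] * 3 + ["no"] * 7

print(round(cohens_kappa(0.7, 0.5), 2))
print(round(kappa_from_ratings(rater_a, rater_b), 2))
```

Both calls give a kappa of about 0.4. The same 70% raw agreement would give a different kappa if the diagnosis were rarer or more common, because the chance-agreement term changes with the base rates.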
Results of the field trials showed good agreement for such disorders as major neurocognitive disorder, autism spectrum disorders, and post-traumatic stress disorder, with kappa values of 0.78, 0.69, and 0.67, respectively. However, poor kappa values, in the range of 0.20-0.40, were reported for commonly diagnosed conditions, such as generalized anxiety disorder and major depressive disorder. All of the observed kappa values in the DSM-5 field trials translate to agreement between clinicians of around 50%.
Is this good or bad? A recent editorial[1] by DSM-5 leaders compares these results with other medical settings and claims that most medical diagnoses involve kappa values similar to those in the DSM-5 field trials. I spoke with prominent psychiatrists at this year's meeting who were involved in some of these DSM studies and discussions; they expressed unhappiness with the kappa values in the DSM-5 field trials, and some pointed out that kappa values in the DSM-III were higher.
So, the reliability of DSM-5 criteria seems to have declined compared to DSM-III. Is this a problem? It might be, but it might not be.
Reliability only means that we agree. It doesn't mean that we agree on what is right. Validity is a separate issue. It could be that criteria are changed so that they are more valid -- that is, actually true -- but this could reduce reliability; raters might have to use, for instance, some criteria that are less objective and hence less replicable.
We will see. DSM-5 might be more valid but less reliable than DSM-IV and DSM-III. If so, that's progress, in a way.
It is also important to think about other medical studies with low reliability. We should be careful about criticizing certain diagnoses, such as bipolar disorder (as some have[2]), without an awareness that this is the case for almost all our diagnoses. The problem of reliability is a general one, not a problem about claimed "overdiagnosis" of some conditions.
In my view, it is definitely time for a new edition of DSM; we can't pretend that something written almost 2 decades ago is anywhere near up to date, with a generation of new research. Some of the proposed changes in DSM-5 -- for example, the inclusion of antidepressant-induced mania as part of bipolar disorder; the inclusion of dimensions for axis II personality conditions; and the removal of nosologically nonspecific axis II diagnoses, such as "histrionic" personality -- are consistent with an update based on convincing new research. But other changes, such as the wish to discourage the diagnosis of childhood bipolar disorder by making up a new category based on limited data (temper dysregulation disorder), merely repeat the mistakes of DSM-IV. Making up diagnoses because we don't like others is not a scientifically sound way to revise a profession's diagnostic system, and it won't serve us well for the next 20 years.
Medscape Psychiatry © 2012 WebMD, LLC
Cite this: Nassir Ghaemi. DSM-5: Finding a Middle Ground - Medscape - Jun 01, 2012.