Post Hoc Power Analysis: An Idea Whose Time Has Passed?

Marc Levine, PhD, and Mary H. H. Ensom, Pharm.D., FASHP, FCCP, Faculty of Pharmaceutical Sciences, University of British Columbia, and the Department of Pharmacy, Children's and Women's Health Centre of British Columbia, Vancouver, British Columbia, Canada

Pharmacotherapy. 2001;21(4) 

Abstract and Introduction

Using a hypothetical scenario that typifies the experience of authors submitting manuscripts reporting negative clinical trials, we illustrate the pitfalls of post hoc power analysis. We use the same scenario to explain how confidence intervals are used in interpreting the results of clinical trials, and we show that confidence intervals better inform readers about the possibility of an inadequate sample size than do post hoc power calculations.

Over the past 30 years, randomized, controlled, clinical trials have become the standard for generating the best evidence for the efficacy of drug therapy. By the late 1970s, however, it became obvious that investigators frequently were reporting results of clinical trials that included too few subjects to determine whether the treatments under investigation were superior to placebo. Such studies lacked sufficient statistical power to detect potentially clinically important effects. One survey[1] documented this problem in a sample of 71 clinical trials with negative findings drawn from the literature. The authors calculated that 67 of the 71 trials carried a greater than 10% risk of missing a true therapeutic improvement of 25%, and that 34 of these negative trials had confidence intervals consistent with a 50% improvement. These and similar observations[2,3,4] led to a general recognition that clinical trials needed to be designed with close attention to statistical power.

Before a clinical trial is started, its power is defined as the probability of detecting an effect as large as or larger than the effect used in the design of the trial, when such an effect truly exists. Put another way, the power of a statistical test used to analyze the data of a trial is "the probability that the test will lead to rejection of the null hypothesis in favor of the alternative when the null hypothesis is indeed false."[5] In recent years, many research grant committees and professional journals have come to expect investigators to provide details of their sample size estimate when outlining the methods of a clinical trial. With this change in practice, however, it became necessary to develop a rational approach to interpreting the results of negative trials (i.e., those in which the null hypothesis is not rejected). With the increasing use of power analysis in the a priori estimation of sample size, a seemingly logical extension was to recommend that negative trials be evaluated with a post hoc power analysis. Authors and readers were encouraged to address the question, "What was the power of the study to detect the observed effect?"[6] It then became common for manuscript reviewers and journal editors to insist on a post hoc power analysis when authors submitted manuscripts reporting negative results. This remains largely the case, even though a few authors have shown post hoc power analysis to be incorrect and misleading.[6,7] (As anecdotal evidence, recent reviewers' comments on a research manuscript requested, "What was the power of this study to detect a real difference?" and stated that "the power of the tests to detect differences should be given and discussed.")
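The a priori power calculation described above can be sketched numerically. The following minimal Python example is our own illustration, not from the original article: it uses a normal-approximation formula for a two-sided, two-sample comparison of means with a known common standard deviation, and all function names and numbers are assumptions chosen for illustration. It shows the key design fact that, for a fixed true effect, power rises with the number of subjects per arm.

```python
from math import erf, sqrt

def normal_cdf(x):
    # Standard normal cumulative distribution function,
    # computed from the error function (standard library only).
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def approximate_power(delta, sigma, n_per_group):
    """Approximate power of a two-sided, two-sample z-test (alpha = 0.05)
    to detect a true difference in means of `delta`, given a known common
    standard deviation `sigma` and `n_per_group` subjects per arm."""
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    # Standardized (noncentrality) shift of the test statistic under
    # the alternative hypothesis.
    shift = delta / (sigma * sqrt(2.0 / n_per_group))
    return normal_cdf(shift - z_crit)

# The same true effect (half a standard deviation) is far more likely
# to be detected with 100 subjects per arm than with 20.
print(round(approximate_power(0.5, 1.0, 20), 2))
print(round(approximate_power(0.5, 1.0, 100), 2))
```

With 20 subjects per arm the sketch gives power well below the conventional 80% target, while 100 per arm pushes it above 90%, which is exactly the kind of design-stage calculation grant committees and journals came to expect.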

We believe that post hoc power analysis should not be applied to the results of negative trials, and we encourage the rational use of confidence intervals in both the design and interpretation of results of clinical trials.
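To make the confidence-interval alternative concrete, here is a minimal sketch, again our own illustration rather than anything from the article, and again under the simplifying assumption of a known common standard deviation and equal-sized arms. A "negative" trial whose interval spans zero but also reaches clinically important effect sizes is better read as underpowered than as evidence of no effect.

```python
from math import sqrt

def ci_for_difference(mean_a, mean_b, sigma, n_per_group, z=1.959964):
    """Approximate 95% confidence interval for the difference in means
    between two equal-sized arms, assuming a known common standard
    deviation `sigma` (normal approximation)."""
    diff = mean_a - mean_b
    half_width = z * sigma * sqrt(2.0 / n_per_group)
    return diff - half_width, diff + half_width

# Hypothetical "negative" trial: an observed difference of 0.3 SD
# with only 20 subjects per arm.
lo, hi = ci_for_difference(0.3, 0.0, sigma=1.0, n_per_group=20)
# The interval spans zero (so the test is not significant), yet it also
# contains large, clinically important effects -- a direct signal to the
# reader that the sample size was inadequate.
print(round(lo, 2), round(hi, 2))
```

Unlike a post hoc power number, the interval itself displays the full range of effects compatible with the data, which is the point the authors are making.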

