The Curse of Sports Illustrated Hits Medical Research

Andrew J. Vickers, PhD


December 18, 2012

Get yourself on the cover of Sports Illustrated, the legend goes, and you are at high risk for a slump or a season-ending injury. As a typical example, the Kansas City Chiefs were featured in November 2003 after starting the season 9-0. They subsequently went 4-3, lost home field advantage, and were eliminated in the playoffs.

The curse of Sports Illustrated is a high-profile example of what statisticians call "regression to the mean." Take any observation, wait around for a bit, and come back for another look; whatever it was will look a little more average the second time around. For instance, imagine that you ask a class of students to throw 2 dice. Now focus on the handful of students who got a double 6 and ask them to throw again. Chances are that their second roll will total less than their first: their dice-throwing results will have regressed to the mean.

Obviously, if you asked all students to throw again, their second throw is no more or less likely to be close to the mean than the first. That a subgroup was picked out for being special is what made the second throws so disappointing. The 2003 Kansas City Chiefs wouldn't have been on the cover of Sports Illustrated had they not had a 9-0 start, but the fact that they were 9-0 made it almost inevitable that subsequent results would be worse: The 2003 Chiefs were never going to be a team that won 100% of their games.
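The dice example is easy to check by simulation. The following sketch (illustrative Python, not part of the original article) throws 2 dice for a large "class," selects only the double-6 rollers, and has them throw again:

```python
import random

random.seed(1)

# Simulate a large class: each student throws 2 dice.
first = [random.randint(1, 6) + random.randint(1, 6) for _ in range(100_000)]

# Pick out only the students who rolled a double 6 (a total of 12)
# and have each of them throw again.
second = [random.randint(1, 6) + random.randint(1, 6)
          for total in first if total == 12]

mean_second = sum(second) / len(second)
print(f"Mean of second throws among double-6 rollers: {mean_second:.2f}")
# The selected group averaged 12 on the first throw; their second
# throws regress toward the overall mean of 7.
```

The selection step is the whole story: the second throws are generated by exactly the same process as the first, so only the students singled out for an extreme first result look "worse" on the retest.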

But picking something out because it is exceptional is exactly what we do in medical research. "Following up exciting findings" is good science, but it is also a recipe for regression to the mean. A trial comes out with exciting results -- a dramatic reduction in mortality, say. The paper is published in a high-profile journal, which sends out a press release leading to widespread media coverage. The authors apply for a large grant to undertake a replication study which, given the important implications of the initial trial, is funded straightaway. Just as in the dice-throwing example, we pick out a subgroup of results to replicate and then are disappointed by the results. And dice-throwing is an entirely apt analogy because chance plays a large role in clinical research.

Take the case of a drug for acute stroke that reduces the risk for death from 20% to 14%, a clinically important 30% relative risk reduction. Imagine that a researcher conducts a pilot trial with 100 patients. Given the play of chance, it wouldn't be at all unusual if only 3 of 50 patients taking the drug died. If the death rate in the placebo group were unchanged, this would constitute a dramatic 70% relative decrease in the risk for death. But given a true event rate of 14%, it also wouldn't be unusual if 11 of 50 patients in the drug group were to die -- that is, slightly higher mortality than placebo. It is obvious which of these 2 pilot results would lead to a further replication study and subsequent regression to the mean.
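How likely are those two pilot results? A short calculation (illustrative Python; the 50-patient arm and 14% true death risk are the figures from the example above) sums the binomial probabilities of each outcome:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k deaths among n patients with risk p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 50, 0.14          # 50 drug-arm patients, true death risk 14%

# Chance of 3 or fewer deaths (the "dramatic" 70% relative reduction)
p_low = sum(binom_pmf(k, n, p) for k in range(4))

# Chance of 11 or more deaths (mortality above the 20% placebo rate)
p_high = sum(binom_pmf(k, n, p) for k in range(11, n + 1))

print(f"P(3 or fewer deaths) = {p_low:.3f}")    # roughly 7%
print(f"P(11 or more deaths) = {p_high:.3f}")   # roughly 8%
```

Each outcome has a probability in the neighborhood of 1 in 12 to 1 in 15 -- genuinely "not unusual" for a trial this small, even though one result would be hailed as a breakthrough and the other quietly shelved.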

In a recent paper in the Journal of the American Medical Association (JAMA), Pereira and colleagues[1] systematically demonstrate this effect using a large sample of clinical trial meta-analyses. They found numerous examples whereby an initial trial showing dramatic results was followed by subsequent trials with less positive findings, concluding that "most large treatment effects emerge from small studies, and when additional trials are performed, the effect sizes become typically much smaller." This sounds very much like regression to the mean -- exactly the sort of response a wise old professor might give when asked about an exciting new trial result.

The second conclusion from the JAMA study is perhaps more problematic: "Well-validated large effects are uncommon and pertain to nonfatal outcomes." There is something comforting about the counterintuitive: It is much more interesting to say that most medical research is nonsense and that medicine doesn't work half as well as we like to think it does, than to sound like the narrator of a 1950s propaganda film about the "miracles of modern medicine." But it is quite a stretch to go from "we don't see very large effects on mortality in meta-analyses of randomized trials" to "we don't have very large effects on mortality." If you don't believe me, go to a playground -- 200 years ago. Then, about 1 in 5 children did not live to see their 10th birthday. Today, it is about 1 in 100, an odds ratio of 0.04. Alternatively, compare, if you like, the mortality rates in the Civil War vs the Iraq War.
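The odds ratio of 0.04 follows directly from the two mortality figures quoted above (a quick arithmetic check in Python, not from the article):

```python
# Child mortality before age 10: roughly 1 in 5 two centuries ago,
# about 1 in 100 today (the figures cited in the text).
p_then, p_now = 0.20, 0.01

odds_then = p_then / (1 - p_then)   # 0.25
odds_now = p_now / (1 - p_now)      # about 0.0101

odds_ratio = odds_now / odds_then
print(f"Odds ratio: {odds_ratio:.3f}")   # about 0.040
```

An odds ratio this far below 1 dwarfs the "large effects" debated in the meta-analysis literature, which is precisely the author's point.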

Yes, we should be concerned about the conduct of medical research, and yes, regression to the mean entails that we should be very cautious about initial dramatic findings. But that is no reason for cynicism about the benefits of good medicine and good medical research.