Meta-analyses: 5 Things to Know

Christopher Labos, MD, CM, MSc


November 05, 2018

Christopher Labos, cardiologist and epidemiologist

Contrary to popular belief, good meta-analyses are neither quick nor easy.[1] To be done correctly, meta-analyses require meticulous preparation and a thorough, detailed review of the literature. In recent decades, the number of systematic reviews and meta-analyses being published has skyrocketed, increasing by more than 2000%.[2] Unfortunately, the rise in popularity has been matched by a decline in overall quality[2,3] and some cases of outright fraud.

But more worrisome is that the plethora of meta-analyses published often contradict one another and confuse rather than clarify the existing evidence base.[2] Although meta-analyses are generally given more prominence than randomized trials in the hierarchy of evidence,[4] it is important to acknowledge that meta-analyses are only as good as the techniques that produce them. A better understanding of these techniques will help consumers of the medical literature better judge which meta-analyses are valid and which are best ignored.

1. Meta-analysis Versus Systematic Review

The terms "systematic review" and "meta-analysis" are often, and erroneously, used interchangeably. A systematic review is essentially a thorough and structured search of the literature followed by a critical appraisal of the data. A meta-analysis, on the other hand, is the process of combining and synthesizing your results into an overall summary estimate. One can, and sometimes should, perform a systematic review without performing a meta-analysis. A meta-analysis is the final step of a systematic review, and what most people do not understand is that it is also optional.[5]
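To make the idea of a summary estimate concrete, here is a minimal Python sketch of one common pooling approach, inverse-variance (fixed-effect) weighting. The function name and the three log odds ratios are made up for illustration, and a real analysis might well call for a random-effects model instead.

```python
import math

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance (fixed-effect) pooling of per-study effect estimates.

    effects    -- per-study estimates on an additive scale (e.g. log odds ratios)
    std_errors -- the matching standard errors
    Returns (pooled estimate, pooled standard error, approximate 95% CI).
    """
    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci

# Hypothetical log odds ratios and standard errors from three made-up trials.
log_or = [-0.30, -0.10, -0.20]
se = [0.10, 0.20, 0.10]
pooled, pooled_se, ci = pool_fixed_effect(log_or, se)
print(f"pooled log OR = {pooled:.3f}, 95% CI {ci[0]:.3f} to {ci[1]:.3f}")
```

Note that the arithmetic will happily produce a number for any set of inputs; whether the studies should have been combined at all is a separate question.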

What makes a meta-analysis reliable and accurate is the quality of the studies you put into it. The "garbage in, garbage out" paradigm is indeed an apt one. If someone decides to perform a meta-analysis, the critical reader needs to look at the search strategy, the quality and heterogeneity of the data, and how that heterogeneity was dealt with.

2. How Good Was the Search Strategy?

Before thinking about the meta-analysis itself, one should ask how well the authors performed the search strategy. An author list that includes a medical librarian will probably produce a more comprehensive search algorithm that is less likely to miss relevant studies.[6] Also, searching multiple databases, reviewing trial registries, hand-searching key journals, and reviewing conference proceedings to check for unpublished literature are indicative of a more thorough search strategy.[7]

Publication bias, the tendency for negative results to be excluded from the literature, can severely hamper any meta-analysis. There are ways to test for publication bias, such as funnel plots, but these tests can be underpowered when there are few studies in the review (Figure 1).[8] Thus, it is important not to be misled and to acknowledge that some degree of publication bias may be present even if tests are negative.

Figure 1. Three fictional funnel plots. A (top left). The symmetry of the funnel plot argues against any significant publication bias. B (top right). The exclusion of certain smaller studies creates asymmetry in the funnel plot and suggests that publication bias is present in this systematic review. C (bottom left). There is a suggestion of asymmetry, but too few studies were included to be certain whether publication bias is present. A formal Egger test would be nonsignificant (and unhelpful) in this case. In reviews with a limited number of studies, tests for publication bias can be underpowered and yield false-negative results.
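For readers who want to see what an Egger test does under the hood, the following Python sketch shows the usual regression formulation: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name and the data are hypothetical, and the sketch stops at the t statistic rather than computing a P value.

```python
import math

def egger_intercept(effects, std_errors):
    """Egger regression of standardized effect (y/se) on precision (1/se).

    Returns (intercept, standard error of the intercept, t statistic).
    The t statistic would be compared with a t distribution on k - 2
    degrees of freedom, where k is the number of studies.
    """
    z = [y / s for y, s in zip(effects, std_errors)]   # standardized effects
    x = [1.0 / s for s in std_errors]                  # precisions
    k = len(z)
    if k < 3:
        raise ValueError("need at least three studies")
    mean_x = sum(x) / k
    mean_z = sum(z) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same precision")
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    # Residual variance from ordinary least squares.
    resid = [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]
    s2 = sum(r * r for r in resid) / (k - 2)
    se_intercept = math.sqrt(s2 * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat

# Made-up data in which the smaller studies (larger SE) report bigger effects.
effects = [0.80, 0.65, 0.40, 0.25, 0.20]
std_errors = [0.40, 0.30, 0.20, 0.10, 0.08]
intercept, se_intercept, t_stat = egger_intercept(effects, std_errors)
print(f"Egger intercept = {intercept:.2f}, t = {t_stat:.2f} on {len(effects) - 2} df")
```

With only a handful of studies the degrees of freedom are tiny, which is exactly why the test has so little power in small reviews.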

3. Measuring Heterogeneity

If you are satisfied that the literature search was comprehensive and complete, then the next thing to assess is the quality of the data. One key consideration when reading a meta-analysis is determining whether the results are consistent or whether they show a large degree of heterogeneity.

In a general sense, "heterogeneity" refers to the diversity and differences between studies. However, when researchers speak of heterogeneity in the context of meta-analysis, they are speaking of statistical heterogeneity, whereas most readers are concerned with clinical or methodological heterogeneity.[9]

Statistical heterogeneity is generally measured using the I² statistic, a measure between zero and 100% that assesses how much of the study variation is due to differences in the study outcomes as opposed to chance. What is often neglected, though, is that statistical heterogeneity and I² measure the variation in the study outcomes, not in the study characteristics. Clinical heterogeneity is what occurs when different studies are done in different patient populations (men versus women, patients with versus those without diabetes, primary versus secondary prevention, humans versus animals), and methodological heterogeneity is what occurs when different studies have different study designs (randomized versus observational, adjusted for different variables, used different endpoints) (Figure 2).

Figure 2. Forest plots from two fictional meta-analyses. A (left). This plot demonstrates significant heterogeneity in the included studies, which translates into the wide confidence intervals and therefore some uncertainty in the overall estimate. In this case, either forgoing a meta-analysis or exploring the reasons for the heterogeneity would be indicated. B (right). This plot demonstrates very little heterogeneity, and therefore the overall estimate has narrower confidence intervals. However, a low I² statistic does not provide any information about whether the five studies in this meta-analysis were done in similar patient populations or were conducted using a similar methodology.
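For readers curious about where the I² statistic comes from, it is conventionally derived from Cochran's Q, the weighted sum of squared deviations of each study from the pooled estimate. The Python sketch below shows that calculation; the function name and the example numbers are invented for illustration.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (Cochran's Q, I-squared as a percentage).

    I-squared = max(0, (Q - df) / Q) * 100, where df = number of studies - 1.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return q, 0.0  # identical study results: no observed heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0
    return q, i2

# Two made-up sets of five studies with the same standard errors.
scattered = [-0.60, 0.10, -0.35, 0.25, -0.05]
consistent = [-0.21, -0.19, -0.20, -0.22, -0.18]
std_errors = [0.10, 0.12, 0.10, 0.15, 0.11]

for label, effects in (("scattered", scattered), ("consistent", consistent)):
    q, i2 = cochran_q_and_i2(effects, std_errors)
    print(f"{label}: Q = {q:.2f}, I2 = {i2:.0f}%")
```

Nothing in this calculation looks at who the patients were or how the studies were designed, which is why a low value says nothing about clinical or methodological heterogeneity.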

Deciding whether there is too much heterogeneity or whether the studies in a systematic review are similar enough to meta-analyze is no mean task. Contrary to popular belief, there is no simple test to determine whether the study data are consistent enough for a meta-analysis. Although the I² statistic is sometimes used to judge the accuracy of a meta-analytic result, this practice may not be appropriate.

In a series of simulation studies, Melsen and colleagues[10] demonstrated that the I² statistic was not necessarily useful in predicting how accurate a meta-analysis was. Its usefulness depended to a large degree on the amount of clinical and methodological heterogeneity; with high levels of heterogeneity, even low I² values have low predictive capacity. Melsen and colleagues concluded that if the I² statistic is high, one should be cautious about performing a meta-analysis; however, a low I² statistic should not be interpreted as a green light to proceed.

4. Meta-analyses Versus Randomized Trials

Given the inherent problems synthesizing data with meta-analyses, a skeptical reader may be inclined to ask whether they should put more trust in meta-analyses or in randomized trials. Worryingly, a review by Ioannidis and colleagues[11] found that meta-analyses and large trials can disagree up to 23% of the time.

Unfortunately, there are no hard and fast rules for choosing between the two. Meta-analyses are only as good as the included studies, and a single randomized trial may be contradicted by subsequent research.[12] However, given the potential for false-positive results, meta-analyses of a series of trials seem to be preferable to using a single trial to judge the efficacy of treatment.[13] Indeed, even using just two trials instead of just one seems to reduce the error rate substantially.

5. Resolving the Conflict

One way to resolve conflicting results in meta-analyses is to explore and understand the sources of heterogeneity. The Cochrane Collaboration suggests several strategies for dealing with heterogeneity. One technique is subgroup analysis. In one example, subgroup analyses helped explain why a meta-analysis and a large trial disagreed on the benefit of using calcium to prevent preeclampsia.[14] The overall meta-analysis contained significant heterogeneity because it included women at high and low risk and a mix of placebo-controlled trials and studies without placebo controls. When the meta-analysis was restricted to placebo-controlled studies of low-risk patients, it generated a result consistent with the large trial that calcium supplementation did not prevent preeclampsia.
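As a rough illustration of how a subgroup analysis works mechanically, the Python sketch below pools studies separately within each subgroup using inverse-variance weights. The data are invented and are not taken from the calcium and preeclampsia literature; the dictionary keys and function names are assumptions made for this example.

```python
import math
from collections import defaultdict

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)

def pool_by_subgroup(studies):
    """Pool studies separately within each value of the 'subgroup' key."""
    groups = defaultdict(list)
    for study in studies:
        groups[study["subgroup"]].append(study)
    return {
        name: pool_fixed_effect(
            [s["effect"] for s in members],
            [s["se"] for s in members],
        )
        for name, members in groups.items()
    }

# Hypothetical log relative risks; negative values favor treatment.
studies = [
    {"subgroup": "high risk", "effect": -0.60, "se": 0.20},
    {"subgroup": "high risk", "effect": -0.45, "se": 0.25},
    {"subgroup": "low risk", "effect": -0.02, "se": 0.10},
    {"subgroup": "low risk", "effect": 0.05, "se": 0.12},
]
for name, (estimate, se) in pool_by_subgroup(studies).items():
    low, high = estimate - 1.96 * se, estimate + 1.96 * se
    print(f"{name}: {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

In a real review the subgroups should be specified in advance, because slicing the data after the fact invites chance findings.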

Other, more advanced statistical techniques also exist. Meta-regression is often used to account for differences in continuous variables, such as age. For example, a meta-regression can show how the effectiveness of carotid stenting varies with the age of patients.[15] Meta-regressions can also help determine how the benefits of statins vary with the degree of LDL-C lowering,[16] or whether the benefit of aspirin varies by dose.[17]
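In its simplest form, a meta-regression is a weighted regression of each study's effect estimate on a study-level characteristic. The Python sketch below assumes a fixed-effect model with inverse-variance weights and a single covariate; published meta-regressions more often use random-effects models, and the function name, ages, and effect sizes here are invented.

```python
import math

def meta_regression(effects, std_errors, covariate):
    """Fixed-effect meta-regression on one study-level covariate.

    Weighted least squares with weights 1/se^2.
    Returns (intercept, slope, standard error of the slope).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / total
    y_bar = sum(w * y for w, y in zip(weights, effects)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    if sxx == 0:
        raise ValueError("covariate does not vary across studies")
    sxy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, covariate, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    slope_se = math.sqrt(1.0 / sxx)  # assumes within-study variances are known
    return intercept, slope, slope_se

# Hypothetical trials: mean patient age and log hazard ratio for each.
mean_age = [62.0, 66.0, 70.0, 74.0, 78.0]
log_hr = [-0.30, -0.22, -0.10, 0.02, 0.08]
se = [0.12, 0.10, 0.15, 0.11, 0.14]

intercept, slope, slope_se = meta_regression(log_hr, se, mean_age)
print(f"change in log HR per year of mean age: {slope:.3f} (SE {slope_se:.3f})")
```

Because the covariate is a study-level average, a slope like this describes differences between trials and cannot by itself show how age affects an individual patient's response.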

An increasingly popular but much more laborious option is the use of individual patient-level data meta-analyses. In these types of analyses, researchers use the original raw data from studies rather than the summary results from the published paper. This allows them to reanalyze the data in the same way across all studies, which can help reduce a significant amount of the clinical and methodological heterogeneity. Apart from being labor-intensive, these types of meta-analyses also require access to the raw data sets of every study included, which is not always possible.[18]

Other, more advanced techniques, such as cumulative meta-analyses and network meta-analyses, can account for more subtle sources of heterogeneity, such as the evolution of techniques and treatments over time, and can allow comparisons between multiple treatments and interventions.
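A cumulative meta-analysis is conceptually simple: the studies are sorted by date and the pooled estimate is recalculated each time a new study is added, which shows how the evidence has shifted over time. The following Python sketch illustrates the idea with inverse-variance pooling; the function name, years, and effect sizes are hypothetical.

```python
import math

def cumulative_meta_analysis(studies):
    """Re-pool after each study is added in chronological order.

    studies -- iterable of (year, effect, standard error) tuples
    Returns a list of (year, pooled estimate, pooled standard error).
    """
    results = []
    weight_sum = 0.0
    weighted_effect_sum = 0.0
    for year, effect, se in sorted(studies, key=lambda s: s[0]):
        weight = 1.0 / se ** 2
        weight_sum += weight
        weighted_effect_sum += weight * effect
        pooled = weighted_effect_sum / weight_sum
        results.append((year, pooled, math.sqrt(1.0 / weight_sum)))
    return results

# Made-up trials in which early enthusiasm fades as larger studies appear.
studies = [
    (2004, -0.10, 0.10),
    (1998, -0.70, 0.30),
    (2010, -0.02, 0.08),
    (2001, -0.40, 0.20),
]
for year, pooled, se in cumulative_meta_analysis(studies):
    print(f"through {year}: pooled = {pooled:.2f} (SE {se:.2f})")
```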


In the end, meta-analyses themselves are neither good nor bad and are not better or worse than randomized trials. Like every type of study, the quality of a meta-analysis is dependent on the amount of effort put into it. The thoroughness of the literature search, the quality of the data, and the degree to which researchers attempt to explain and account for heterogeneity all determine how much faith one can put into its summary estimate.

Although often thought of as a quick and easy way to get a publication, meta-analyses require a great deal of thought and planning to do properly. If done right, they can help answer some of the more contentious questions in the literature. If done wrong, they may not be worth the paper they are printed on.

