Systematic Review and Meta-Analysis of Bariatric Surgery for Pediatric Obesity

Jonathan R. Treadwell, PhD; Fang Sun, MD, PhD; Karen Schoelles, MD, SM


Annals of Surgery. 2008;248(5):763-776. 



Experienced medical librarians searched 15 databases, including PubMed and EMBASE, for relevant studies. We also examined the bibliographies from identified studies, reviews and gray literature. The last search was conducted on December 31, 2007.

All patients must have been aged ≤21 years at the time of surgery. The study must have appeared as an English-language article in a peer-reviewed journal. The surgical procedure must have been one currently performed in the United States, and if more than 15% of patients in the study had different bariatric procedures (eg, 50% RYGB and 50% LAGB), data must have been separated by procedure. The study must have reported data on weight, BMI, comorbidity resolution, quality of life, and/or survival. We only included data that were based on at least 3 patients who represented at least 50% of pediatric surgical patients. For weight or BMI data, we only considered data at least one year after surgery, but there was no minimum follow-up for other outcomes. For quality-of-life outcomes, the study must have measured quality of life before and after surgery using a previously validated instrument. Data on any nonsurgical control groups were included only if the patients receiving nonsurgical treatment were sufficiently similar to surgical patients. If there were multiple reports from the same surgical center, we avoided double-counting patients by including the data and outcomes that were based on the largest number of patients and that still met the other inclusion criteria.

We used studies' reports of BMI as a key outcome. Body fat is more accurately measured using hydrodensitometry or dual-energy x-ray absorptiometry (DXA),[41] but these methods are highly labor-intensive and costly. BMI, by contrast, requires only measurements of height and weight. Field et al (2003)[41] found that among 596 children and adolescents, BMI explained 72% of the variance in body fat (corresponding to a Pearson r correlation of 0.85). Furthermore, the CDC has stated that BMI is a reliable indicator of body fatness in most children and teens.[42] These observations suggest that in pediatric patients, BMI is a reasonably accurate surrogate for body fatness; we therefore used BMI as an outcome measure.

Meta-analyses of the mean change in BMI were conducted using the random-effects method of DerSimonian and Laird.[43] Because patients had already undergone unsuccessful attempts at weight loss prior to surgery, our first set of analyses assumed that patients would not have lost weight without surgery. This assumption was tested in sensitivity analyses in which we investigated alternative assumptions that, without surgery, patients would experience modest weight loss (up to 3.2 BMI units, which was the BMI reduction in the nonsurgical study by Berkowitz et al[24]). We measured heterogeneity with the I² statistic, with I² ≥ 50% defining substantial heterogeneity.[44]
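The pooling step described above can be sketched in a few lines of Python. This is a minimal illustration of the standard DerSimonian and Laird estimator and the I² statistic, not the authors' actual analysis code; the function name `dersimonian_laird` and its argument names are assumptions made for this example, and the inputs are each study's mean BMI change and the variance of that mean.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-study effects with the DerSimonian-Laird random-effects method.

    effects   -- per-study mean change in BMI (hypothetical inputs)
    variances -- squared standard error of each study's mean change
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # Method-of-moments estimate of between-study variance, truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to every study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    # I^2: percentage of total variation attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": se,
        "tau2": tau2,
        "q": q,
        "i2": i2,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }
```

Under the threshold stated above, a returned `i2` of 50 or more would be read as substantial heterogeneity.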

For weight loss, a clinically significant amount was defined as 7% of body weight, because other researchers have shown that patients who lose this amount of weight experience substantial reductions in the medical comorbidities of obesity (eg, diabetes).[45,46]

For meta-analysis of before-after studies of change in BMI, the computation of an effect size requires a patient-level correlation between presurgical BMIs and postsurgical BMIs. Five studies reported such individual patient data, so we calculated the correlation for each of these studies, and then performed a random-effects meta-analysis of these correlations. We then used the summary correlation (0.60) as an imputed correlation in studies that had not provided individual patient data. In subsequent robustness tests, we used the 95% confidence bounds of this correlation (0.36 and 0.76) to determine sensitivity to the choice of correlation.
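To illustrate why the pre-post correlation matters, the sketch below uses the standard formula for the standard deviation of a paired change, SD_change = sqrt(SD_pre² + SD_post² − 2·r·SD_pre·SD_post). The article does not state the exact formula used, so this is an assumption, as are the function name `change_se` and the example values (SDs of 6.0 and 5.0 BMI units, 20 patients), which are hypothetical and not taken from any included study.

```python
import math

def change_se(sd_pre, sd_post, n, r=0.60):
    """Standard error of the mean pre-to-post change for n paired patients.

    r is the patient-level correlation between pre- and postsurgical BMI;
    the default 0.60 is the summary correlation reported in the text.
    """
    sd_change = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2.0 * r * sd_pre * sd_post)
    return sd_change / math.sqrt(n)

if __name__ == "__main__":
    # Sensitivity check over the reported 95% bounds and the summary value.
    # The SDs and sample size here are hypothetical illustration values.
    for r in (0.36, 0.60, 0.76):
        print(f"r={r:.2f}  SE of mean change={change_se(6.0, 5.0, 20, r):.3f}")
```

A higher assumed correlation yields a smaller standard error, which is why the choice of imputed correlation was varied across its confidence bounds.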

Other sensitivity analyses included the removal of one study at a time to determine whether the conclusion was driven by any single study; cumulative meta-analysis to determine sensitivity to publication date; assessment of the width of the confidence interval around a summary effect size to determine the robustness of a quantitative estimate; and removal of studies with less than 75% follow-up to determine sensitivity of conclusions to the inclusion of studies with 50%-74% follow-up.
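The first of these checks, removing one study at a time, can be sketched as follows. This is a simplified illustration under stated assumptions: `leave_one_out` and `inverse_variance_pool` are hypothetical names, and the default pooling function is a plain fixed-effect inverse-variance average used only to keep the example short. Any pooling function with the same signature, such as a random-effects estimator, could be passed in instead.

```python
def inverse_variance_pool(effects, variances):
    """Fixed-effect inverse-variance weighted mean (a simplifying stand-in)."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)

def leave_one_out(effects, variances, pool=inverse_variance_pool):
    """Re-pool the evidence k times, omitting a different study each time.

    Returns a list whose i-th entry is the pooled estimate without study i.
    """
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results.append(pool(rest_effects, rest_variances))
    return results
```

If every leave-one-out estimate points in the same direction as the full pooled estimate, the conclusion is not being driven by any single study.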

We evaluated the overall stability and strength of the evidence for weight loss and comorbidity resolution after bariatric surgery using a formal rating system.[47] The system incorporates the quality, quantity, consistency, and robustness of the evidence, as well as the magnitude of observed effects. Quality refers to the degree of potential bias in the design or conduct of studies. Quantity refers to the number of studies and the number of enrolled patients. Consistency addresses the degree of agreement among the results of available studies. Robustness involves the constancy of conclusions in the face of minor hypothetical alterations in the data. Magnitude of effect concerns the quantitative amount of benefit that patients experience after treatment.

Our system employs decision points that collectively yield an overall category that describes the strength of the evidence for a quantitative estimate and qualitative conclusion as strong, moderate, weak, or insufficient. The qualitative conclusion addresses the question, "Does it work?" The quantitative estimate addresses the question, "How well does it work?" This distinction allows flexibility in ratings of different aspects of the evidence. For example, an evidence base can be considered weak in terms of the precise quantitative estimate of effect (eg, if estimates vary widely among studies), but strong or moderate with respect to the qualitative conclusion (eg, if all studies nevertheless demonstrate the same direction of effect).

To rate the quality of case series of bariatric surgery, we considered 6 criteria: (1) whether the study was prospective; (2) whether the study had included consecutive patients; (3) whether the outcome assessment was performed by an independent party; (4) whether the study was not funded by a financially interested party; (5) whether the outcome was objective; and (6) whether the data for the outcome contained at least 85% of the pertinent included patients. We assessed the quality of a given study separately for the different outcomes and timepoints reported by that study, because some criteria (eg, 85% completion) can vary by outcome or timepoint.

