Optimal Combinations of Ultrasound-based and Serum Markers of Disease Severity in Patients with Chronic Hepatitis C

J. F. L. Cobbold; M. M. E. Crossey; P. Colman; R. D. Goldin; P. S. Murphy; N. Patel


J Viral Hepat. 2010;17(8):537-545. 

This study directly compares the diagnostic performance of APRI, TE, ELF and HTT. For the diagnosis of cirrhosis, ELF performed as well as TE and better than APRI. For distinguishing moderate-to-severe fibrosis from mild disease, ELF performed as well as both TE and APRI. The diagnostic accuracy of HTT was substantially lower than that of the other markers. In addition, HTT did not contribute significantly to the marker combinations, despite a modest increase in accuracy. HTT was also the least reliable test, with an ICC of 0.8 (compared to reliability in excess of 90% for the other markers).

This study has combined markers on the basis of cost and practicality. As increasing numbers of biomarkers have been shown to predict liver disease severity, such considerations will help determine which markers enter clinical practice or are employed in clinical trials.

The use of a three-group ordinal regression analysis enabled the interactions between markers in combination to be investigated. It should be noted that a single cheap test, the APRI score, correctly predicted fibrosis severity in 92% of those whose severity was correctly predicted by a two-test model, including either ELF or TE, and 88% of those correctly predicted by a three-test model, including both TE and ELF. Scrutiny of the multi-marker models allowed an assessment of the interplay between tests when predicting the histological stage. When a component of the combination was significant, its removal from the model significantly altered the goodness-of-fit of that particular model and those tests were considered complementary. If there was no significant change on removal of a test, it was considered redundant. For example, both ELF and TE significantly enhanced a two-test model of fibrosis containing the APRI score, while neither ELF nor TE contributed significantly to the three-test model once the other two tests were included (Table 6). This does not imply that either test is not a predictor of fibrosis, but that its inclusion in that model did not provide information beyond that supplied by the other tests; that is, there was a degree of redundancy.
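The complementary-versus-redundant judgement described above can be illustrated with a comparison of nested models. The sketch below assumes a likelihood-ratio test with one degree of freedom (a single marker removed from the model); the text does not state which goodness-of-fit statistic was used, the function names are hypothetical, and the log-likelihood values are illustrative numbers, not results from this study.

```python
import math


def lr_test_one_marker(loglik_full, loglik_reduced):
    """Likelihood-ratio test for removing one marker from a nested model.

    Assumes the removed marker contributes a single coefficient,
    so the statistic is compared with a chi-square on 1 degree of freedom.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def classify_marker(loglik_full, loglik_reduced, alpha=0.05):
    """Label a marker by whether its removal significantly worsens the fit."""
    _, p_value = lr_test_one_marker(loglik_full, loglik_reduced)
    return "complementary" if p_value < alpha else "redundant"


# Illustrative log-likelihoods only (assumed values, not study data).
print(classify_marker(-60.0, -66.0))  # large drop in fit on removal
print(classify_marker(-60.0, -60.5))  # negligible drop in fit on removal
```

A marker labelled redundant in this sense may still predict fibrosis on its own; the label only describes what it adds to the other markers already in the model.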

As this was a head-to-head analysis without missing data, systematic bias between tests was minimized. The tight exclusion criteria and well-matched patient groups minimized potentially confounding factors. The analysis approach enabled the contribution of each test to the accumulated combinations to be quantified.

In the context of the published literature, the diagnostic accuracy of the tests, assessed using ROC curves, for distinguishing cirrhosis from less severe fibrosis is comparable to that found by previous studies using TE and/or APRI. The diagnostic accuracy for the separation of mild from more severe fibrosis exceeded 80% for the APRI score, ELF panel and TE, but only just exceeded 70% for HTT, which was also comparable to other studies.[23–26] Previous publications have proposed algorithms where agreement or concordance between tests of similar diagnostic accuracy lent confidence to the interpretation of a binary result and proposed avoidance of diagnostic liver biopsy. Where results were discordant, biopsy was proposed.[23] This current study provides evidence on whether tests are complementary or redundant. This may enhance the building of optimal combinations of tests. If tests are complementary, agreement will reflect the disease state. If tests are redundant, agreement may simply reflect that the tests are measuring a similar parameter, which may or may not be closely related to the disease state or reference standard.
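The concordance-based algorithms referred to above can be outlined schematically. The sketch below is a minimal illustration, assuming each test has already been reduced to a binary result by some cut-off; the function name and return labels are hypothetical and do not reproduce any published algorithm or threshold.

```python
def concordance_decision(test_a_positive, test_b_positive):
    """Combine two binary noninvasive test results.

    Concordant results are accepted without biopsy;
    discordant results are referred for liver biopsy.
    """
    if test_a_positive == test_b_positive:
        return "accept positive" if test_a_positive else "accept negative"
    return "biopsy"


# Example: the two tests disagree, so biopsy is proposed.
print(concordance_decision(True, False))
```

As the paragraph above notes, agreement of this kind is most informative when the two tests are complementary rather than redundant.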

While this study aimed to quantify the contributions of each marker to models of disease severity, some features are qualitative. In particular, HTT may be performed at the time of abdominal ultrasound examination, which is a part of routine assessment of patients with chronic liver disease, reducing the additional costs. Examination of the dynamic enhancement of hepatic parenchyma increases the detection of hepatocellular carcinoma,[27] which is a major complication in patients with cirrhosis, but is also seen in pre-cirrhotic disease in patients with chronic hepatitis B virus infection. Such a technique may, therefore, be an appropriate addition to APRI in those without access to TE or ELF markers, and in cohorts of patients where imaging of the liver is required. However, the evidence from this study demonstrates its inferiority for the assessment of fibrosis compared to the other markers studied.

This study was limited by the use of liver biopsy, itself a surrogate marker of disease with inherent sampling variability,[1,3] as the reference standard. Sampling variability was minimized by exclusion of samples <10 mm in length. Inter-observer variability was minimized by the review of all biopsy specimens by an experienced hepato-histopathologist. There was a median delay from biopsy to investigation of 4 months in this cohort. However, there was no difference in delay between groups. In addition, studies have demonstrated that in chronic hepatitis C, a significant change in fibrosis stage is unlikely to be observed within 3–5 years.[7]

Where histology is the reference standard, biopsy criteria in many centres lead to the underrepresentation of patients with mild disease compared with the patient population as a whole. Moreover, a significant minority of patients with normal transaminase tests, who are less likely to undergo biopsy, are found to have a substantial degree of fibrosis.[28] Patients with clinically overt or decompensated cirrhosis of known aetiology do not usually require biopsy for clinical purposes. Hence the full spectrum of disease is unlikely to have been represented, leading to potential selection bias in the current study population, which may affect ROC curves and derivatives.[29] However, the ordinal regression model presented is robust to such effects, as strict cut-off values and assumptions about disease prevalence are not used.

The cohort examined in this study is large for a study examining nonserological techniques, yet small compared to a number of studies of TE and serum markers. As such, extrapolation of the results to create clinical algorithms was not considered appropriate. TE and ELF were found to perform equally in this study. A larger study may have the power to detect a difference in the diagnostic performance of these tests if such a difference exists. While this study provides novel information about the relative diagnostic performance of four markers, singly and in combination, in the context of cost and practicality, a larger scale cost-benefit analysis would allow further appraisal of these tests for widespread incorporation into clinical usage. Consideration should also be given to potential access to hardware, appropriately trained personnel or external laboratory services. Finally, for noninvasive techniques to replace liver biopsy, attempts to predict clinically important endpoints prospectively using marker combinations are warranted.

To conclude, this impartial head-to-head analysis of the diagnostic performance of two serum and two imaging-based markers of fibrosis shows that ELF and TE have a high diagnostic accuracy for the prediction of fibrosis and that HTT performs less well than the other tests. A combination of APRI with either ELF or TE effectively predicts fibrosis stage using a three-group model of fibrosis, but combining three or more tests leads to redundancy of information. This study may provide a basis for future clinically based cost-benefit analyses to assess optimal combinations of markers.


