Annual Mycobacterium tuberculosis Infection Risk and Interpretation of Clustering Statistics

Emilia Vynnycky, Martien W. Borgdorff, Dick van Soolingen, Paul E.M. Fine

Emerging Infectious Diseases. 2003;9(2) 

Discussion

The availability of DNA fingerprinting techniques has led to a large number of studies that measure clustering of isolates from tuberculosis cases.[1,2,3,16] Most of these studies have been conducted in industrialized settings and have found relatively low levels of clustering (30% to 40%) and decreases in clustering with age. Our analyses indicate that those findings have been influenced strongly by the large secular decline in the annual risk for infection that occurred in industrialized settings during the 20th century and that very different findings are expected in settings where the annual risk for infection has changed little over time. In such settings, the predicted clustering is high (>60% for 2-year periods) and similar for all age groups, yet it may still underestimate the extent of disease that is due to recent transmission.

Our conclusions are based on a model of the transmission dynamics of M. tuberculosis that includes several simplifications. The most obvious is our assumption that the risks for disease, given infection in settings in which the infection risk is high, are the same as those estimated for industrialized populations. HIV influences these risks,[17,18] although its effect on clustering is not yet understood.[14] Another simplification is our assumption that the half-life of DNA fingerprint patterns is identical for strains involved in active disease and in latent infection. If latent infections are associated with a slow rate of genetic change of the bacilli, our assumption would have led to an underestimate of clustering but would not have affected our conclusions for settings in which the annual risk for infection has remained unchanged over time, where only a small proportion of disease is attributed to reactivation of a latent infection (Figure 3). The effect of this assumption on clustering estimates for the Netherlands is discussed elsewhere.[6]
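The half-life assumption above can be made concrete with a minimal sketch. It assumes that fingerprint patterns change at a constant rate, so that the probability of a pattern remaining unchanged decays exponentially with time; the function name and the half-life value used in the example are illustrative assumptions, not values taken from this article.

```python
def prob_pattern_unchanged(years: float, half_life_years: float) -> float:
    """Probability that a DNA fingerprint pattern is unchanged after `years`.

    Assumes a constant rate of change (exponential decay), which is what a
    fixed half-life implies. The same half-life is applied whether the strain
    is in a latent infection or in active disease, mirroring the simplifying
    assumption described in the text.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    if years < 0:
        raise ValueError("years must be non-negative")
    return 0.5 ** (years / half_life_years)


# Hypothetical half-life of 3 years, chosen only for illustration.
ILLUSTRATIVE_HALF_LIFE = 3.0
for elapsed in (0.0, 3.0, 6.0):
    print(elapsed, prob_pattern_unchanged(elapsed, ILLUSTRATIVE_HALF_LIFE))
```

Under this sketch, a longer interval between infection and disease onset lowers the chance that a case still matches its source, which is one reason clustering can understate recent transmission.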

Our finding that the overall amount of clustering in populations with a low (constant) annual infection risk should be similar to that observed in populations with a high (constant) infection risk may appear paradoxical. It follows from the fact that in such populations any decline with age in the proportion of disease attributable to recent primary infection is compensated by an increase with age in the proportion attributable to recent reinfection (Figure 3). As a result, both the overall and age-specific predicted proportions of disease attributable to recent transmission in these populations are very similar; this finding leads to predictions that the overall and age-specific levels of clustering in these settings would also be similar.

Previous model-based analyses[6] have indicated that in industrialized settings such as the Netherlands clustering among young case-patients will underestimate the extent of disease attributable to recent transmission (because some sources of infection have onset outside the study period and because DNA fingerprint patterns can change between infection and disease onset), and clustering among old case-patients may overestimate recent transmission (because clustering among older case-patients is more likely to be attributable to their being sources of infection rather than their being recently reinfected). These analyses extend those findings and indicate that in settings in which the annual risk for infection has not changed much over time, the overall level of clustering in any given age group is likely to underestimate the extent of recent transmission (Figure 5). This underestimate follows from the fact that in these settings, most disease in all age groups is attributable to recent transmission, and some patients will have been infected or reinfected immediately before the study started and thus may not be in a cluster.

These analyses provide the first estimates of the positive and negative predictive values of clustering. Overall, these analyses highlight the fact that in settings in which the annual risk for infection has not changed greatly over time, most clustered case-patients are likely to have been recently infected or reinfected (i.e., the positive predictive value of clustering is high) (Figure 6). This finding suggests that in such settings, application of the "n-1" rule,[2] which assumes that each cluster comprises an index case attributable to reactivation and that the other cases result directly or indirectly from that case, will lead to even more unreliable estimates of the extent of recent transmission than those based on the "n" rule. Similarly, estimates of the proportion of disease attributable to reactivation will be unreliable if they are based on the proportion of patients who fail to be in a cluster in a given period.
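The "n" and "n-1" rules and the predictive values discussed above can be sketched as follows. This is an illustrative calculation only: the function names, the pattern labels, and the case data are hypothetical, and the "recently infected" flags are assumed to be known, as they would be in a simulation model but not in an observational study.

```python
from collections import Counter
import math


def clustering_estimates(patterns):
    """Return (n_rule, n_minus_1_rule) for a list of fingerprint pattern labels.

    A cluster is any pattern shared by two or more cases in the study period.
    """
    total = len(patterns)
    if total == 0:
        raise ValueError("at least one case is required")
    cluster_sizes = [size for size in Counter(patterns).values() if size >= 2]
    clustered_cases = sum(cluster_sizes)
    # "n" rule: every clustered case is counted as recent transmission.
    n_rule = clustered_cases / total
    # "n-1" rule: one case per cluster is treated as a reactivated index case.
    n_minus_1_rule = (clustered_cases - len(cluster_sizes)) / total
    return n_rule, n_minus_1_rule


def predictive_values(clustered, recent):
    """Return (PPV, NPV) of clustering as a marker of recent (re)infection.

    `clustered` and `recent` are equal-length sequences of booleans, one per case.
    PPV = share of clustered cases that were recently infected or reinfected.
    NPV = share of unclustered cases that were not recently infected.
    """
    pairs = list(zip(clustered, recent))
    in_cluster = [r for c, r in pairs if c]
    not_in_cluster = [r for c, r in pairs if not c]
    ppv = sum(in_cluster) / len(in_cluster) if in_cluster else math.nan
    npv = (sum(not r for r in not_in_cluster) / len(not_in_cluster)
           if not_in_cluster else math.nan)
    return ppv, npv


# Hypothetical study of 10 cases: clusters "A" (3 cases) and "B" (2 cases).
patterns = ["A", "A", "A", "B", "B", "C", "D", "E", "F", "G"]
print(clustering_estimates(patterns))  # (0.5, 0.3)

# Hypothetical truth flags: 4 of 5 clustered cases and 2 of 5 unclustered
# cases were recently infected or reinfected.
clustered = [True] * 5 + [False] * 5
recent = [True, True, True, True, False, True, True, False, False, False]
print(predictive_values(clustered, recent))  # (0.8, 0.6)
```

In this toy example, 6 of the 10 cases are recent, so the "n" rule (0.5) already understates recent transmission and the "n-1" rule (0.3) understates it further, which is the direction of error described in the text for settings in which most clustered cases are recently infected.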

Our analyses demonstrate that the properties and interpretation of clustering statistics depend strongly on the trend in, and the magnitude of, the annual risk for infection and thus will vary between settings. For example, in settings in which the annual risk for infection has remained unchanged at either a high or a low level, the age differential in clustering is likely to be small, in contrast with that in industrialized settings, and clustering is likely to underestimate the extent of recent transmission in all age groups. Given the growing importance of clustering studies, which to date have been conducted in populations in which the annual risk for infection declined dramatically over time and is currently very low, these insights are important for an improved understanding of the natural history of tuberculosis.
