Insights into Antibiotic Resistance Through Metagenomic Approaches

Robert Schmieder; Robert Edwards


Future Microbiol. 2012;7(1):73-89. 

Ongoing Challenges for Detecting Antibiotic Resistance in Environmental Samples

As we have discussed, metagenomics can be used to identify antibiotic resistance genes in the environment, and has increased our understanding of the sources and roles of these genes in nature. However, metagenomic approaches have limitations, and in the next sections we discuss some of them.

Functional Metagenomics

Functional screening, in which DNA fragments are cloned and expressed, may be limited by the number of clones and the insert size relative to the total size of the metagenome. Antibiotic resistance can be encoded by multiple genes that are required to work together (e.g., vancomycin resistance), by a single gene (e.g., bla, the gene encoding β-lactamase), or by a point mutation in a housekeeping gene (e.g., gyrA, the gene that encodes DNA gyrase). Each of these cases poses its own challenges for metagenomic screens.
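The effect of insert size on library coverage can be made concrete with the classic Clarke and Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the total size of the DNA being sampled and P is the desired probability that a given sequence is present in the library. The sketch below is only illustrative: it assumes inserts are sampled uniformly at random, and the function name, the insert sizes and the 1 Gbp metagenome size are hypothetical values chosen for the example, not figures from the studies discussed here.

```python
import math


def clones_for_coverage(insert_bp, total_bp, probability):
    """Clarke-Carbon estimate of the number of clones needed so that a given
    sequence is present in the library with the requested probability.

    Assumes inserts are drawn uniformly at random from the sampled DNA.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0 < insert_bp < total_bp:
        raise ValueError("insert_bp must be positive and smaller than total_bp")
    fraction = insert_bp / total_bp  # share of the total DNA held by one clone
    # log1p(-x) computes ln(1 - x) accurately when x is very small.
    return math.log1p(-probability) / math.log1p(-fraction)


if __name__ == "__main__":
    metagenome_bp = 1_000_000_000  # hypothetical total metagenome size
    for insert_bp in (3_000, 40_000):  # hypothetical small and large inserts
        n = clones_for_coverage(insert_bp, metagenome_bp, 0.99)
        print(f"{insert_bp} bp inserts: about {math.ceil(n)} clones")
```

The estimate only says whether a locus is likely to be sampled at all. It does not account for the additional requirement that a multi-gene resistance cluster must fit intact within a single insert, which further favors large-insert libraries.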

The results of functional metagenomics are dependent upon each gene's ability to be expressed in surrogate hosts, typically E. coli.[88] Resistance genes are regulated by genetic elements that may not be recognized by the surrogate host's gene expression machinery, the codon usage may not be appropriate for expression, post-translational modifications may be missing, or the expressed protein may not fold correctly. Heterologous gene expression can also result in false positives since the foreign gene may interact in novel ways with the cellular machinery.
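As a rough illustration of the codon usage problem, the following sketch reports the fraction of codons in a coding sequence that belong to a set of codons treated as rare in the surrogate host. The set used here (AGA, AGG, ATA, CTA, CCC, GGA) is an assumption based on codons often cited as rare in E. coli, and the function names and example sequence are hypothetical; a real analysis would use a measured codon usage table for the chosen host.

```python
from collections import Counter

# Assumption: codons often cited as rare in E. coli. Illustrative only;
# replace with a codon usage table measured for the actual host strain.
ASSUMED_RARE_CODONS = frozenset({"AGA", "AGG", "ATA", "CTA", "CCC", "GGA"})


def split_codons(cds):
    """Split a coding sequence into in-frame codons, dropping any trailing
    partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rare_codon_fraction(cds, rare_codons=ASSUMED_RARE_CODONS):
    """Return the fraction of codons in cds that are in rare_codons."""
    triplets = split_codons(cds)
    if not triplets:
        return 0.0
    counts = Counter(triplets)
    return sum(counts[codon] for codon in rare_codons) / len(triplets)


if __name__ == "__main__":
    example_cds = "ATGAGAAGGCTGCCCGGAAAATAA"  # made-up sequence
    print(f"rare codon fraction: {rare_codon_fraction(example_cds):.2f}")
```

A high fraction would only flag a candidate for poor translation. It says nothing about the other failure modes listed above, such as unrecognized regulatory elements or incorrect folding.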

The range of media types, antibiotic concentrations and incubation methods used to measure resistance levels makes it difficult to compare results between environments.[18] Diaz-Torres et al., for example, found problems with the expression of certain tetracycline resistance genes from the human oral microbiome when E. coli was used as a host.[51]

In addition to heterologous gene expression issues, genes and their encoded proteins are adapted to different environmental conditions that might preclude efficient enzyme activity at 37°C in standard laboratory media. Examples include genes extracted from Alaskan soil microbes that live at temperatures much lower than 30°C,[22,59] and genes from insect midguts, where microbes live in an environment with a pH of 12.4.[29] It is possible that many of the genes in functional metagenome libraries are not expressed in E. coli, resulting in an underestimate of the frequency of resistance determinants in environmental samples. Some resistance genes have been found using a wider range of expression systems and hosts, but global sequencing approaches indicate a large discrepancy between predicted and detected resistance genes.[89] The limitations of E. coli-based screening call for the use of a wider range of hosts, including Gram-positive as well as Gram-negative species.

Another study was not able to identify streptomycin-resistant clones in functional screens, even though their presence in the metagenomic library had been verified using PCR and culturable bacteria.[23] In addition to resistance genes that go undetected, a large number of resistant clones may be false positives.[63]

Sequence-based Metagenomics

The biggest challenge with sequence-based metagenomics is the large number of sequences that show no significant similarity to previously sequenced genes or organisms; without known reference sequences, resistance genes cannot be easily identified in metagenomes. The strong selection for antibiotic resistance alleles results in convergent evolution, the adaptation of very different genes to perform the same function.[16,90,91] We noted above that many resistance genes identified in functional screens have low similarity to known genes, but with sequence-based approaches we are generally limited to identifying what we already know.
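The dependence on known references can be illustrated with a toy similarity search. The sketch below is a simplified stand-in for alignment-based tools, not the method used in any of the cited studies: it scores a query by the fraction of its k-mers shared with each reference and reports no hit when the best score falls below a threshold. The function names, the k-mer size, the threshold and all sequences are hypothetical.

```python
def kmers(seq, k=8):
    """Return the set of all length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query, reference, k=8):
    """Fraction of the query's k-mers that also occur in the reference."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0
    return len(query_kmers & kmers(reference, k)) / len(query_kmers)


def best_reference_hit(query, references, k=8, min_fraction=0.5):
    """Return (name, score) of the best-scoring reference, or (None, score)
    when no reference reaches min_fraction."""
    best_name, best_score = None, 0.0
    for name, ref_seq in references.items():
        score = shared_kmer_fraction(query, ref_seq, k)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_fraction:
        return None, best_score
    return best_name, best_score


if __name__ == "__main__":
    # Made-up sequences standing in for a reference database and two reads.
    references = {"known_gene": "ATGGCTAGCTAGGATCCGATCGATCGTACG"}
    similar_read = "ATGGCTAGCTAGGATCCGATCGATCGTACG"
    divergent_read = "TTGACCGGTTAACCGGTTAACCGGTTAACC"
    print(best_reference_hit(similar_read, references))
    print(best_reference_hit(divergent_read, references))
```

A gene that performs the same function through an unrelated sequence, as expected under convergent evolution, behaves like the divergent read here: it is real, but it returns no hit because nothing like it is in the reference set.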

The current sequence-based metagenomic approaches need to be evaluated based on the complexity of technical procedures, robustness, accuracy and cost. The preparation of a sample library requires multiple molecular biology steps and, depending on the technology, up to 4 days to complete. Data volumes vary widely, with processing times ranging from a few minutes to multiple hours, which emphasizes the need for sufficient computational power. The data analysis requires both expertise in bioinformatics and an advanced informatics infrastructure.

High-throughput sequencing technologies generate data that currently challenge data storage, management and processing, demanding access to supercomputing resources or cloud-based computing services for efficient handling. Advances are needed in data transfer and management, standardization of data formats, and integration of different types of data.[92] The amount of data is even challenging national data warehouses, such as the Sequence Read Archive, which announced that it will limit data archiving to a specific subset of next-generation sequence data starting from October 2011.

The sequence-based identification of a resistance gene requires separate functional confirmation. The metagenomic sequences only suggest the presence of a gene whose product may confer antibiotic resistance; they do not confirm that the gene is functionally expressed or rule out that it serves an alternative function in its host. With the advancement of mRNA extraction from environmental samples, metatranscriptomics may become the main method for the detection of functional resistance genes.[93]

Whether the methodologies employed in microbiota characterizations faithfully reproduce the community composition, rather than distorting it through experimental bias or sequencing artifacts, is still an open question. However, as sequencing becomes cheaper, researchers will be able to sequence deeper and at multiple time points to address possible biases, and to answer the important questions about the sources, evolution, and effects of antibiotic resistance genes that we have touched on here.