Transmitted HIV-1 Drug Resistance in a Large International Cohort Using Next-Generation Sequencing

Results From the Strategic Timing of Antiretroviral Treatment (START) Study

JD Baxter; D Dunn; A Tostevin; RL Marvig; M Bennedbæk; A Cozzi-Lepri; S Sharma; MJ Kozal; M Gompels; AN Pinto; J Lundgren


HIV Medicine. 2021;22(5):360-371. 


Study Population

The START trial, conducted by the International Network for Strategic Initiatives in Global HIV Trials (INSIGHT), enrolled participants between April 2009 and December 2013. The study design and data collection plan have previously been reported.[14] A plasma sample, taken up to 60 days prior to enrolment, was obtained from participants consenting to the storage of specimens. Samples were stored centrally at the INSIGHT laboratory repository in Cinnaminson, New Jersey, prior to shipment to the Centre for Genomic Medicine, Rigshospitalet, Copenhagen, for NGS analysis. NGS was attempted on all samples with HIV RNA > 1000 copies/mL at study entry.

Sample Preparation, Amplification of Viral RNA and Sequencing

The plasma samples were thawed from −80°C freezers at room temperature. Plasma (500 μL) was transferred to new RNase-free tubes and centrifuged at 2000 g for 15 min. The supernatant was extracted and centrifuged at 21 000 g for 75 min, and 360 μL of the top supernatant was discarded. Viral RNA was extracted using the QIAamp viral RNA extraction kit (Qiagen, Hilden, Germany) on a QIAcube robot following the manufacturer's guidelines. Reverse transcription polymerase chain reaction (RT-PCR) was used to amplify two amplicons from the viral RNA. The primer sequences (available in Table S5) were designed by Gall et al.[15] The reverse transcription and amplification were performed using the SuperScript III One-Step RT-PCR System with Platinum Taq High Fidelity (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions, with 10 μL viral RNA used for each amplicon. The PCR products were purified using Ampure XP (Agencourt, Beckman Coulter, Brea, CA, USA) PCR purification according to the manufacturer's instructions.

The two amplicons were pooled for each sample prior to library preparation. Libraries were prepared using Nextera XT (Illumina, San Diego, CA, USA) according to the manufacturer's protocol, except that 1.5 × the amount of library normalization beads was used in the final normalization step. DNA libraries were sequenced on an Illumina MiSeq machine using a MiSeq 150-cycle V3 reagent kit (Illumina).

HIV-1 Subtyping

Subtypes were assigned to samples for which consensus sequence information was available for at least 90% of either of the two amplicons, or at least 50% of both. The consensus sequence was analysed with the REGA HIV-1 Subtyping Tool version 3.[16] Output was manually inspected to check for the presence of subtype-specific sequences within the given consensus sequence. Samples were assigned either a pure subtype (A–D, F, G) or, where the genome contained sequences specific to more than one pure subtype, a recombinant subtype.
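The coverage rule for subtyping eligibility can be written as a short predicate. This is only an illustrative sketch in Python, not the code used in the study; the function name `eligible_for_subtyping` and the representation of coverage as a fraction between 0 and 1 are assumptions made for the example.

```python
def eligible_for_subtyping(cov_amplicon_1: float, cov_amplicon_2: float) -> bool:
    """Return True if a sample meets the coverage rule described in the text.

    Arguments are the fractions (0.0-1.0) of each amplicon for which
    consensus sequence is available. Hypothetical helper, for illustration.
    """
    # Rule 1: at least 90% of either amplicon is covered.
    either_high = cov_amplicon_1 >= 0.90 or cov_amplicon_2 >= 0.90
    # Rule 2: at least 50% of both amplicons is covered.
    both_moderate = cov_amplicon_1 >= 0.50 and cov_amplicon_2 >= 0.50
    return either_high or both_moderate
```

For instance, a sample with 95% coverage of one amplicon and none of the other would qualify under the first rule, whereas 89% and 49% would satisfy neither.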

Identification of Drug Resistance Mutations With VirVarSeq

Sequence reads (FASTQ files) were analysed with VirVarSeq v.20140929, which calls variants at the codon level.[17] VirVarSeq was run with HIV-1 HXB2 as reference and with default settings, except that soft-clipping as defined by the aligner was ignored and the mixture model step was omitted (as recommended by Huber et al.[18]). From the output, we extracted amino acid frequencies in the pol gene from amino acid positions 1 to 935, where positions 1–99 encode the protease (PR) protein, positions 100–659 encode the RT protein, and positions 660–935 partially encode the integrase (IN) protein (our amplicon did not cover positions 936–947).
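The pol numbering above can be converted to gene-specific codon numbers with simple offsets. The following Python sketch is illustrative only and is not part of VirVarSeq; the names `POL_REGIONS` and `pol_to_gene_codon` are hypothetical, and the region boundaries are taken directly from the text.

```python
from typing import Tuple

# (gene, first pol position, last covered pol position), as stated in the text.
POL_REGIONS = (
    ("PR", 1, 99),
    ("RT", 100, 659),
    ("IN", 660, 935),  # integrase is only partially covered by the amplicon
)


def pol_to_gene_codon(pol_position: int) -> Tuple[str, int]:
    """Map a pol amino acid position to (gene, codon number within that gene)."""
    for gene, start, end in POL_REGIONS:
        if start <= pol_position <= end:
            return gene, pol_position - start + 1
    raise ValueError(f"pol position {pol_position} is outside the covered range 1-935")
```

Under this mapping, pol position 100 corresponds to RT codon 1 and pol position 660 to IN codon 1.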

Definition of Transmitted Drug Resistance and Phenotypic Drug Susceptibility

As in a previous paper from START reporting the results of locally performed Sanger sequencing, TDR was based on the WHO 2009 surveillance list with the addition of RT mutations T215N and E138K.[13,19] INSTI mutations, which are not included on this list, were defined as those on the Stanford HIVdb surveillance DRM list, namely T66AIK, E92Q, F121Y, G140ACS, Y143CHR, S147G, Q148HKR and N155HS.[20] Interpretation of phenotypic drug susceptibility was standardized using the Stanford HIVdb algorithm v.8.6 which defines drug resistance as none, potential low level, low level, intermediate or high.[21] To achieve consistency with WHO resistance reports, predicted potential low level is not reported. It is noted that the Stanford HIVdb algorithm considers a much wider range of mutations than considered by the WHO 2009 surveillance list (including the integrase gene) and these additional TDR DRMs detected by NGS were included for predicted phenotypic drug susceptibility.
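The INSTI mutations are listed in a compact notation in which one wild-type residue and codon number are followed by several variant residues (for example, T66AIK stands for T66A, T66I and T66K). A minimal Python sketch of expanding that notation is given below; the names `INSTI_SDRMS` and `expand_mutation` are hypothetical, and the list is copied from the text rather than from the Stanford HIVdb itself.

```python
import re
from typing import List

# Compact INSTI surveillance mutations exactly as listed in the text.
INSTI_SDRMS = ["T66AIK", "E92Q", "F121Y", "G140ACS",
               "Y143CHR", "S147G", "Q148HKR", "N155HS"]

# Wild-type residue, codon number, one or more variant residues.
_COMPACT = re.compile(r"^([A-Z])(\d+)([A-Z]+)$")


def expand_mutation(compact: str) -> List[str]:
    """Expand e.g. 'T66AIK' into ['T66A', 'T66I', 'T66K']."""
    match = _COMPACT.match(compact)
    if match is None:
        raise ValueError(f"unrecognised mutation notation: {compact!r}")
    wild_type, codon, variants = match.groups()
    return [f"{wild_type}{codon}{variant}" for variant in variants]


# Flat list of individual amino acid substitutions.
ALL_INSTI_MUTATIONS = [m for c in INSTI_SDRMS for m in expand_mutation(c)]
```

Expanding the eight entries in this way yields 17 individual substitutions.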

Sequencing Depth and Thresholds for Calling DRMs

Sequence read coverage depth varied markedly across the sequenced amplicons (highest in PR, intermediate in RT, lowest in IN). We stipulated a minimum read depth of 200 across the region spanning all relevant mutations within each gene. For WHO surveillance mutations this comprised codons 23–90 of PR, codons 41–230 of RT, and codons 66–155 of IN; for Stanford predicted phenotypic drug susceptibility this comprised codons 10–90 of PR, codons 41–348 of RT and codons 51–263 of IN. This resulted in different denominators for different drug classes, which were therefore analysed separately. DRMs detected by NGS are reported at three thresholds: > 2%, > 5% and > 20% of the viral population (the latter comparable to the detection threshold for Sanger sequencing).[11,18]
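The depth requirement and the three reporting thresholds can be illustrated with a small Python sketch. It is a simplified reading of the rules above, not the study's pipeline: the names `region_passes_depth` and `thresholds_exceeded` are hypothetical, depth is assumed to be available as a mapping from codon number to read count, and frequencies are expressed as fractions.

```python
from typing import Dict, List

MIN_DEPTH = 200                   # minimum read depth stated in the text
THRESHOLDS = (0.02, 0.05, 0.20)   # > 2%, > 5% and > 20% of the viral population


def region_passes_depth(depth_by_codon: Dict[int, int],
                        first_codon: int, last_codon: int) -> bool:
    """True if every codon in [first_codon, last_codon] has depth >= MIN_DEPTH.

    Codons missing from the mapping are treated as having zero depth.
    """
    return all(depth_by_codon.get(codon, 0) >= MIN_DEPTH
               for codon in range(first_codon, last_codon + 1))


def thresholds_exceeded(frequency: float) -> List[float]:
    """Return the reporting thresholds that a variant frequency exceeds."""
    return [t for t in THRESHOLDS if frequency > t]
```

A variant present in 6% of reads would, for example, be counted at the 2% and 5% thresholds but not at 20%, and a sample would contribute to the PR denominator for WHO surveillance mutations only if `region_passes_depth(depth, 23, 90)` holds.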

Statistical Methods

In the analysis of drug class-specific TDR by detection threshold (Figure 1), if two or more mutations were present, the highest frequency was used. Fisher's exact test (two-sided) was used to test the association between geographical region and whether TDR variants were observed at 2–5%, 5–20% or > 20%. Logistic regression analysis was used to examine predictors of drug class-specific TDR. Odds ratios were adjusted a priori for the effects of geographical region, calendar year of enrolment and age. Subtype was not included in these models, as independent effects of subtype and geographical region could not be estimated due to the very strong association between these two variables. All statistical analyses were conducted using STATA v.15 (StataCorp, College Station, TX, USA). A P-value < 0.05 was deemed significant.
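The rule of taking the highest-frequency mutation within a drug class, and then assigning the sample to a frequency band, might be sketched as follows in Python. The function name `tdr_band` is hypothetical, and the treatment of band boundaries (each band excludes its lower bound and includes its upper bound) is an assumption, since the text does not state how values exactly at 5% or 20% were handled.

```python
from typing import Iterable


def tdr_band(frequencies: Iterable[float]) -> str:
    """Classify a sample for one drug class by its highest-frequency mutation.

    `frequencies` holds the frequencies (as fractions) of all surveillance
    mutations detected for that drug class in the sample.
    """
    values = list(frequencies)
    if not values:
        return "none"
    top = max(values)  # only the highest frequency is used
    if top > 0.20:
        return "> 20%"
    if top > 0.05:
        return "5-20%"
    if top > 0.02:
        return "2-5%"
    return "none"
```

A sample carrying one mutation at 3% and another at 25% in the same drug class would therefore be classified once, in the > 20% band.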

Figure 1.

Prevalence of transmitted drug resistance by detection threshold and geographical region. Data also shown in tabular form in Table S1. WHO, World Health Organization; DRM, drug resistance mutation; NRTI, nucleoside reverse transcriptase inhibitor; NNRTI, non-NRTI; PI, protease inhibitor; INSTI, integrase strand transfer inhibitor.