Prospective Study of Cancer Genetic Variants

Variation in Rate of Reclassification by Ancestry

Thomas P. Slavin; Lily R. Van Tongeren; Carolyn E. Behrendt; Ilana Solomon; Christina Rybak; Bita Nehoray; Lili Kuzmich; Mariana Niell-Swiller; Kathleen R. Blazer; Shu Tao; Kai Yang; Julie O. Culver; Sharon Sand; Danielle Castillo; Josef Herzog; Stacy W. Gray; Jeffrey N. Weitzel


J Natl Cancer Inst. 2018;110(10):1059-1066. 

Study Population

The study was approved by the Institutional Review Board at City of Hope. Data were obtained from individuals who were referred for cancer genetic risk assessment and underwent genetic testing at one of two Southern California sites (City of Hope and Olive View Medical Center) from September 1996 through December 2016 and who gave informed consent to participate in the Clinical Cancer Genomics Community Research Network registry.

Variants in actionable and commonly evaluated hereditary cancer predisposition genes were eligible for study. Because benign variants were not subject to reclassification, only nonbenign variants (pathogenic, likely pathogenic, VUS, likely benign) were studied. The number of genetic variants studied per person was unrestricted. Characteristics collected per variant were gene, mutation, laboratory, self-reported maternal and paternal ancestry, sex, and dates and results of initial classification and any subsequent reclassifications. Terminology for classifying variants was standardized across laboratories per Richards et al.[2] Variants listed as requiring special interpretation were grouped in the same category as VUS. Self-reported ancestry (defined as having at least one grandparent of that ancestry) was assigned to mutually exclusive categories in the following sequence: any African; any Native American; any Chinese (including Taiwanese); any other Asian (Filipino, Japanese, Korean, Indian, Pakistani, Indonesian, Thai, Cambodian, Vietnamese, Nepali, Samoan); any Ashkenazi; any Hispanic; any Middle Eastern (Iranian, Armenian, Syrian, Lebanese, Egyptian, Turkish); and non-Hispanic European. Within these categories, individuals were noted as having mixed ancestry if they reported ancestry from at least one additional category. A single laboratory (referred to as "Lab A") issued most of the initial classifications and reclassifications under study. Corresponding data from the other 33 testing laboratories were sparse, so these laboratories were grouped together ("Other Labs") for analysis.
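The sequential assignment of ancestry described above can be illustrated with a minimal sketch. The category labels, the function name `assign_ancestry`, and the input format (a collection of category names already derived from self-report) are assumptions made for illustration and are not taken from the study's software.

```python
# Priority order follows the sequence stated in the text.
# The label strings themselves are illustrative assumptions.
PRIORITY = [
    "African",
    "Native American",
    "Chinese",
    "Other Asian",
    "Ashkenazi",
    "Hispanic",
    "Middle Eastern",
    "Non-Hispanic European",
]


def assign_ancestry(reported):
    """Return (category, is_mixed) for a collection of reported categories.

    The first category in PRIORITY that the person reports becomes the
    mutually exclusive study category. The person is flagged as mixed if
    at least one additional category was also reported.
    """
    reported = set(reported)
    unknown = reported - set(PRIORITY)
    if unknown:
        raise ValueError(f"Unrecognized categories: {sorted(unknown)}")
    for category in PRIORITY:
        if category in reported:
            return category, len(reported) > 1
    return None, False  # no ancestry reported
```

Under this sketch, a person reporting both Hispanic and African ancestry would be placed in the African category and flagged as mixed, because African precedes Hispanic in the sequence.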

Variants were reclassified by the commercial laboratory that had performed the original classification. Throughout the study period, variant reclassifications were recorded in the study database as they were received, assigned to the date on which the amended electronic or paper report was received from the laboratory. Prospective follow-up for reclassification was terminated either by study closure at the end of February 2017, by reclassification of the genetic variant to benign, or by receipt of a third reclassification (after which no further reclassification was observed). Follow-up was not terminated by the death of the individual who had been tested, because laboratories issue variant reclassifications regardless of the individual's vital status.
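The three stopping rules for follow-up can be expressed as a short sketch. It assumes that "the end of February 2017" corresponds to February 28, 2017, and that each variant's reclassifications are available as (date, new classification) pairs; the function name and the lowercase classification strings are hypothetical.

```python
from datetime import date

# Assumption: "end of February 2017" is taken as February 28, 2017.
STUDY_CLOSURE = date(2017, 2, 28)


def follow_up_end(reclassifications):
    """Return the date on which follow-up of one variant stops.

    `reclassifications` is a list of (received_date, new_classification)
    tuples. Follow-up ends at the earliest of: a reclassification to
    benign, the third reclassification, or study closure.
    """
    ordered = sorted(reclassifications)
    for count, (received, new_class) in enumerate(ordered, start=1):
        if received > STUDY_CLOSURE:
            break  # reports received after closure are outside the study
        if new_class == "benign" or count == 3:
            return received
    return STUDY_CLOSURE
```

Vital status does not appear in the function, which mirrors the statement that death of the tested individual did not end follow-up.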

Statistical Analysis

Duplicate variants were excluded from the study at both the family and the ancestry group levels. First, when members of a family were tested by the same laboratory for the same variant ("within-family duplicates"), only the first such variant from the family was retained for study. The retained variants were then placed in random order, and any nonfamily duplicate variants (same ancestry, laboratory, and classification) were excluded. Rate of reclassification was defined as the number of reclassifications divided by total observation time.
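The two-stage exclusion of duplicates and the rate definition can be sketched as follows. This is an illustration under stated assumptions: records are dictionaries with the hypothetical keys `family`, `lab`, `variant`, `ancestry`, `classification`, and `test_date`; "first" within a family is taken to mean the earliest test date; and the random order is made reproducible with a seed.

```python
import random


def deduplicate(variants, seed=0):
    """Apply within-family and then nonfamily duplicate exclusion."""
    # Stage 1: within a family, keep only the first record for a given
    # laboratory and variant (assumed here to be the earliest test date).
    seen_in_family = set()
    retained = []
    for v in sorted(variants, key=lambda v: v["test_date"]):
        key = (v["family"], v["lab"], v["variant"])
        if key not in seen_in_family:
            seen_in_family.add(key)
            retained.append(v)

    # Stage 2: place retained records in random order, then drop later
    # records that repeat the same variant with the same ancestry,
    # laboratory, and classification.
    random.Random(seed).shuffle(retained)
    seen = set()
    unique = []
    for v in retained:
        key = (v["variant"], v["ancestry"], v["lab"], v["classification"])
        if key not in seen:
            seen.add(key)
            unique.append(v)
    return unique


def reclassification_rate(n_reclassifications, total_observation_time):
    """Number of reclassifications divided by total observation time."""
    return n_reclassifications / total_observation_time
```

Shuffling before the second stage means that which of several nonfamily duplicates survives is decided at random rather than by date or record order.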

All statistical tests were two-sided. The primary hypothesis was that rate of reclassification differed between individual minority ancestries and the referent category, non-Hispanic European ancestry. This hypothesis was tested separately for variants from BRCA1/2 and non-BRCA1/2 genes using generalized linear modeling (log-linked, with Poisson distribution) of reclassification rate. Each model used a generalized estimating equation to take into account intralaboratory correlation and considered the following as potential confounding factors: laboratory, individual genes with at least 50 observations apiece, classification of variant (handled as a time-dependent variable that changed at the time of each reclassification, initiating a new observation), year of current classification, and sex. A covariate or interaction term was retained in the model if it improved the fit to the observed data. When the model retained year of current classification, a quadratic term (year squared) and an interaction term for year by ancestry were also considered.
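For orientation, a log-linked Poisson model of a rate is conventionally written with observation time as an offset. The form below is a generic sketch of such a model, not the authors' exact specification; the symbols are introduced here for illustration, and the use of an offset is an assumption based on standard practice for rate models.

```latex
\log \mathbb{E}[Y_i] = \log T_i + \beta_0 + \sum_{a=1}^{7} \beta_a A_{ia} + \boldsymbol{\gamma}^{\top} \mathbf{x}_i
```

Here $Y_i$ is the number of reclassifications in observation $i$, $T_i$ is its observation time, $A_{ia}$ indicates membership in nonreferent ancestry $a$, and $\mathbf{x}_i$ holds whichever covariates were retained. Under this form, $\exp(\beta_a)$ is the ratio of the reclassification rate in ancestry $a$ to the rate in the referent category. The generalized estimating equation affects how the standard errors account for correlation within a laboratory, not the form of the mean model.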

The study's overall type I error was limited to 5% by evaluating for statistical significance only those associations related to the primary hypothesis, namely the main effects of seven individual ancestries (relative to the referent ancestry, non-Hispanic European) and, when year of current classification was retained in the model, the seven potential interactions between individual nonreferent ancestries and year. Statistical testing was adjusted for multiple (n = 21) hypotheses using the Holm-Bonferroni method.[10]
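The Holm-Bonferroni step-down procedure can be sketched in a few lines. The function name and the example P values are illustrative only and are not results from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    P values are tested from smallest to largest. The k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k), and testing stops
    at the first P value that exceeds its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger P values are also not rejected
    return reject


# Made-up P values for four hypotheses.
print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
# → [True, False, False, True]
```

With 21 hypotheses and alpha of 0.05, the smallest P value would need to be at most 0.05/21 (about 0.0024) for any association to be declared statistically significant.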

For the first of two secondary aims, time to first reclassification was plotted using the Kaplan-Meier method, by category of initial classification. For the other secondary aim, variants that underwent reclassification were categorized by net change (from original classification to most recent reclassification), with upgrade defined as any shift toward greater pathogenicity. Among variants at risk of upgrade (ie, not already pathogenic), multivariable logistic regression was used to test the following characteristics for independent association with net upgrade: original classification, type of mutation (splice site vs all others), gene (BRCA1/2 vs non-BRCA1/2), year of original testing, ancestry, and sex. For this secondary analysis, P values were unadjusted for multiple hypothesis testing.
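The Kaplan-Meier estimate used for the first secondary aim can be illustrated with a minimal pure-Python sketch. The function name, the input format, and the example values are assumptions for illustration; in the study's setting, an event would be a first reclassification and a censored observation would be a variant still unreclassified when follow-up ended.

```python
def kaplan_meier(times, events):
    """Return [(time, probability of remaining unreclassified), ...].

    `times` are follow-up durations; `events` are 1 if the variant was
    reclassified at that time and 0 if follow-up was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # reclassifications at time t
        n_leaving = 0  # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Made-up follow-up times (years) for five variants.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Computing one such curve per category of initial classification gives the stratified plot described in the text.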