Diversity Missing in Genetic Studies

Ricki Lewis, PhD

March 21, 2019

Overrepresentation of people of European ancestry in genetic and genomic investigations is exacerbating healthcare inequities, according to a commentary published online today in Cell.

"[T]he lack of ethnic diversity in human genomic studies means that our ability to translate genetic research into clinical practice or public health policy may be dangerously incomplete, or worse, mistaken," write Giorgio Sirugo, MD, PhD, and Sarah Tishkoff, PhD, of the University of Pennsylvania, Philadelphia; and Scott M. Williams, PhD, of Case Western Reserve University in Cleveland, Ohio.

David Curtis, MD, PhD, from University College London Genetics Institute, echoes that point. "We have genetic knowledge that we can apply to white European subjects, which can impact on their healthcare, and we are unable to provide the same clinical benefits to people with other ancestries," he said. "This situation is, frankly, indefensible and needs to be addressed."

The Seeds of Skewed Representation

When the idea to sequence a human genome began percolating among researchers and funding agencies in the late 1980s, a parallel set of investigators was contemplating cataloging genetic diversity across human populations. This idea emerged as the brainchild of prominent Stanford University geneticist Luigi Luca Cavalli-Sforza, MD. In 1991, he conceived the Human Genome Diversity Project (HGDP) — distinct from the Human Genome Project — to counter the largely European data behind the linkage maps that anchored the first sequenced human genomes. After more than a half-century in the field of genetics, Cavalli-Sforza died last August at age 96.

In 2002, as the first human genomes were being published, researchers launched the International HapMap Project. HapMap uses single nucleotide polymorphisms (SNPs), which are places in genomes where people vary at one base, to identify blocks of the human genome that tend to be inherited together and thus reflect ancestry.

Around the same time, researchers started to use genome-wide association studies (GWAS) to systematically identify SNP blocks that track or associate with a given trait or illness. GWAS quickly became a standard tool to identify genes behind traits and diseases. Even though GWAS would ideally include ethnically diverse patient populations to zero in on causative genes in different genetic backgrounds, that's where the bias began, said Tishkoff.

"Part of the challenge of collecting samples from ethnically diverse populations was the funding situation. When you applied for funding to do a GWAS, a reviewer might be concerned that a genetically small population wouldn't generate enough statistical power, so they discouraged including ethnically diverse populations. In this way, the missing diversity is both a research failing and a health inequity," Tishkoff told Medscape Medical News.

In other words, because, statistically speaking, signals from a small group would be drowned out by those from the larger population, some funders didn't want to devote limited resources to include participants from diverse groups.
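The statistical concern can be illustrated with a rough calculation. The sketch below approximates the power of a simple case-control comparison of allele frequencies at one SNP. The function name, allele frequencies, and sample sizes are hypothetical values chosen for illustration and do not come from the commentary; the 5e-8 threshold is the conventional genome-wide significance level.

```python
from statistics import NormalDist


def gwas_power(n_cases, n_controls, freq_controls, freq_cases, alpha=5e-8):
    """Approximate power of a two-sided z-test on allele frequencies.

    Illustrative only: a normal approximation for a single SNP,
    not the method used in any study discussed in the article.
    """
    std_normal = NormalDist()
    # Each person carries two alleles, so allele counts are twice the head counts.
    alleles_cases = 2 * n_cases
    alleles_controls = 2 * n_controls
    # Standard error of the difference between the two allele frequencies.
    se = (freq_cases * (1 - freq_cases) / alleles_cases
          + freq_controls * (1 - freq_controls) / alleles_controls) ** 0.5
    # Expected size of the test statistic if the assumed difference is real.
    z_expected = abs(freq_cases - freq_controls) / se
    # Critical value for the chosen significance threshold.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the observed statistic lands beyond either critical value.
    return (1 - std_normal.cdf(z_crit - z_expected)) + std_normal.cdf(-z_crit - z_expected)


# Hypothetical scenario: the same modest frequency difference (20% vs 22%)
# studied in a large cohort and in a much smaller one.
large_cohort = gwas_power(50_000, 50_000, 0.20, 0.22)
small_cohort = gwas_power(2_000, 2_000, 0.20, 0.22)
print(f"large cohort power: {large_cohort:.4f}")
print(f"small cohort power: {small_cohort:.4f}")
```

Under these assumed numbers, the large cohort detects the variant almost every time while the small cohort almost never does, which is the reasoning reviewers applied when they discouraged smaller, more diverse samples.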

Curtis agrees. "With hindsight it's easy to see why we arrived at this position. We were seeking to identify genes that caused illness, and the power to identify such genes is higher if one has an ethnically homogeneous sample," he said.

The genomes of Africans are the most genetically heterogeneous because African populations are the oldest, so their blocks of linked SNPs have had more time to break up over the generations. European genomes, by comparison, offer a simpler sameness against which mutations stand out.

Assumptions of the degree to which people differ may also have played into the current missing diversity, Curtis added. "To an overwhelming extent, humans share their biology, and it was reasonable to suppose that a gene conferring risk in one population would also do so in others," he said. But that isn't always so, Curtis noted, due to gene–gene interactions and gene–environment interactions.

In the current commentary, Sirugo and colleagues analyzed data from the GWAS Catalog through January 2019 and constructed all-too-telling pie charts. When they sliced the pie to represent the proportion of studies by ancestry, they found that just over half of GWAS (52%) were done using DNA from people of European descent, compared with 21% Asian and 10% African. When they sliced the pie to represent the proportion of individuals included in all GWAS, the skewing was even more dramatic: 78% European, 10% Asian, and 2% African.

Inequities Impact Diagnosis, Treatment, and Drug Choice

The authors describe several examples of clinically relevant differences among people of different ancestries:

  • Heart failure in African Americans may be misdiagnosed, in terms of underlying cause, because a mutation that 4% of this population carries (in the TTR gene, which is tied to transthyretin amyloid cardiomyopathy, or ATTR-CM) is rare in other groups. That distinction could influence drug selection.

  • Cystic fibrosis (CF) is underdiagnosed among African Americans because their most common mutations have not been as well studied as those prevalent among Europeans. Because new CF drugs are targeted to specific mutations, fuller investigation is warranted so that more patients can benefit.

  • Four genotypes commonly considered to determine dosing for warfarin have a much greater effect on the variability in rate of drug metabolism among Europeans than among those of African ancestry.

In addition, the authors note examples where environmental or dietary factors interact with genes to create clinically relevant findings in some ethnic or racial groups, all of which would be missed in European-dominated genetic studies.

Efforts to Counter the Eurocentric Focus

Attempts to broaden the ancestries represented in genetic studies and DNA databases are coming from several areas.

Consumer genetic testing companies are expanding the diversity of their databases. 23andMe, for example, last year launched its Populations Collaborations program to offer free "spit kits" to researchers working with underrepresented populations.

Biobanks are also important. "Well-characterized biobanks that include ethnically diverse individuals linked to extensive health records can be used to interrogate genetic risk of disease, translating into better health care for all populations. These initiatives will require the political will to improve funding and infrastructure for studying genomic and phenotypic diversity in global populations," said Sirugo in a news release.

In the meantime, Tishkoff cautions clinicians to question different types of studies that might underrepresent human diversity. Meta-analyses can bury studies on small populations, she said. GWAS, and the polygenic risk scores that are algorithmically derived from them, may also minimize or ignore some population groups, she and Curtis agree. 

In their commentary, Sirugo and colleagues cite a study published in Nature Communications in January that illustrates how investigations can be inclusive. Tuomas Oskari Kilpeläinen, PhD, from the NNF Center for Basic Metabolic Research at the University of Copenhagen, Denmark, and colleagues considered Europeans, Africans, Asians, and Hispanics in their work on genes that interact with physical activity to control blood lipid levels.

"Our study included approximately 250,000 participants, of whom 60,000 were of non-European ancestry. We identified four genetic loci interacting with physical activity, two of which would have been missed if not including diverse ancestries, because the variants were either nonexistent or too rare to be identified in European-ancestry participants alone," Kilpeläinen told Medscape Medical News.

"This underscores how important it is to include diverse ancestries in genetic studies to uncover genetic differences and use the information correctly in diagnostics and therapeutics," Kilpeläinen added.

The researchers and commentators have disclosed no relevant financial relationships.

Cell. Published online March 21, 2019.
