The Effect of HapMap on Cardiovascular Research and Clinical Practice

Kimberly A Skelding; Glenn S Gerhard; Robert D Simari; David R Holmes Jr


Nat Clin Pract Cardiovasc Med. 2007;4(3):136-142. 


What Is the HapMap?

The HapMap is a map, a resource, a tool, and a catalogue of common haplotypes. A haplotype is a group of genetic variants (genotypes) that are closely linked on one chromosome and inherited as a unit. Haplotypes can reside within genes and specify alleles (alternative forms of a gene), which can result in quantitatively or qualitatively different gene products and, consequently, different phenotypes. A single allele is inherited from each parent. A person is homozygous if the alleles on the two chromosomes are identical, and heterozygous if they are different. The HapMap catalogues the locations of common haplotypes throughout the genome, details what these variants are, and describes how they are distributed across populations. Whereas the Human Genome Project generated a 'reference' sequence of all chromosomes from only a few individuals, the HapMap is the logical next phase: it begins to characterize the human genome by determining the extent of genetic variability.[23,24,36]

The HapMap project was launched in 2002 through an international consortium[1,27] and includes four population samples: 30 trios (a trio is a mother-father-child group and is a classic genetic collection) of North American descent (from Utah with ancestry from northern and western Europe; CEPH/Utah), 30 trios of Nigerian descent (from the Yoruba in Ibadan, Nigeria), 48 unrelated Japanese (from Tokyo, Japan) and 48 unrelated Chinese individuals (Han Chinese from Beijing).[1,3] The study of this important but limited number of samples provides insight into 'common' genetic variability and will need to be extended into further populations to become more widely generalizable.

In the first phase of the project, a map was developed of evenly spaced SNPs (approximately one every 5 kb) with a minor allele frequency greater than 5%. SNPs that are evenly spaced throughout the genetic map allow analysis of linkage disequilibrium (LD) patterns and identification of haplotype blocks. LD is a somewhat counterintuitive term that refers to the extent to which alleles at different sites are found together within a population more often than expected by chance. In a large population over time, recombination shuffles alleles so that they are inherited independently of one another, and the alleles reach equilibrium. If alleles are in disequilibrium, they are not randomly inherited but are inherited together as a unit, generally referred to as a haplotype block.

Identifying haplotype blocks rather than analyzing individual SNPs has two benefits: it reduces the number of tests needed to find a genomic area of interest, and it allows a marker SNP to be used to find a disease-causing SNP, because a disease-causing SNP is usually inherited along with adjacent SNPs on the same chromosome or haplotype block. As a result of this coinheritance, a variant base unique to the set of SNPs can often serve as a marker or 'tag' for the entire SNP group (Figure 1). For example, when the genotype at one SNP is known and the association between this SNP and another is high (i.e. they are in LD), the genotype at the second SNP can be predicted without genotyping it. Because many SNPs can be in LD, testing for only one of them decreases the number of SNPs that must be genotyped to identify the genetic cause of disease.[23,36] This technique therefore decreases both time and cost without compromising the information gained.

LD across the human genome has been characterized using the HapMap data.[37,38,39] It is low near the ends of each chromosome, which indicates frequent recombination in these areas. By contrast, LD increases near the centromeres. Interestingly, regions of strong LD tend to have fewer guanine and cytosine bases and fewer genetic polymorphisms than other regions. Some classes of genes, such as those involved in immune response and sensory perception, are typically located in regions of low LD, whereas other classes of genes, including those involved in DNA and RNA metabolism, response to DNA damage, and the cell cycle, are largely located in regions of high LD.
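
As a rough illustration of how LD between two SNPs can be quantified, the sketch below computes two widely used measures, the disequilibrium coefficient D and the squared correlation r², from a list of phased two-SNP haplotypes. The article does not state which measure the HapMap analyses used, so this is an assumption made for illustration; the function name `ld_stats` and the sample chromosomes are invented and are not HapMap data.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs.

    `haplotypes` is a list of 2-character strings, one per chromosome:
    the first character is the allele at SNP 1, the second the allele
    at SNP 2 (for example "AG"). This input format is an assumption
    made for this sketch.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")

    # Use the alleles of the first chromosome as the reference alleles.
    ref1, ref2 = haplotypes[0][0], haplotypes[0][1]

    # Allele frequencies at each SNP and frequency of the joint haplotype.
    p1 = sum(h[0] == ref1 for h in haplotypes) / n
    p2 = sum(h[1] == ref2 for h in haplotypes) / n
    p12 = sum(h == ref1 + ref2 for h in haplotypes) / n

    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two SNPs were inherited independently.
    d = p12 - p1 * p2

    denom = p1 * (1 - p1) * p2 * (1 - p2)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d, d * d / denom


# Invented sample: the two SNPs always travel together (strong LD).
linked = ["AG"] * 6 + ["CT"] * 4
# Invented sample: every allele combination is equally common (no LD).
unlinked = ["AG", "AT", "CG", "CT"]

d_linked, r2_linked = ld_stats(linked)
d_unlinked, r2_unlinked = ld_stats(unlinked)
print(f"linked:   D={d_linked:.3f}, r2={r2_linked:.3f}")
print(f"unlinked: D={d_unlinked:.3f}, r2={r2_unlinked:.3f}")
```

In the first sample, r² is approximately 1, so the genotype at one SNP predicts the genotype at the other. In the second, r² is 0 and each SNP would have to be genotyped separately.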

Figure 1.

SNPs and haplotype blocks. (A) SNPs. Shown is a short stretch of DNA from four versions of the same chromosome region in different people. Most of the DNA sequence is identical in these chromosomes, but three bases are shown where variation occurs. Each SNP has two possible alleles; the first SNP in panel A has the alleles cytosine and thymine. (B) Haplotypes. A haplotype is made up of a particular combination of alleles at nearby SNPs. Shown here are the observed genotypes for 20 SNPs that extend across 6,000 bases of DNA. Only the variable bases are shown, which include the three SNPs that are shown in panel A. For this region, most of the chromosomes in a population survey turn out to have haplotypes 1-4. (C) Tag SNPs. Genotyping just the three tag SNPs out of the 20 SNPs is sufficient to identify these four haplotypes uniquely. For instance, if a particular chromosome has the pattern A-T-C at these three tag SNPs, this pattern matches the pattern determined for haplotype 1. Abbreviation: SNP, single nucleotide polymorphism. Reproduced with permission from The International HapMap Consortium (2003) The International HapMap Project. Nature 426: 789-796.
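
The tag SNP idea in panel C can be sketched in a few lines of code: search for the smallest set of SNP positions whose alleles distinguish every common haplotype, then use only those positions to classify a chromosome. The four six-SNP haplotypes below are invented for illustration and are not the sequences in the figure; the function names `find_tag_snps` and `identify_haplotype` are likewise hypothetical, and the brute-force search is only one simple way to choose tags, not the consortium's method.

```python
from itertools import combinations

# Invented haplotypes (one allele per SNP position); not the figure's data.
HAPLOTYPES = {
    "haplotype 1": "ATTGCA",
    "haplotype 2": "GCCACA",
    "haplotype 3": "GCTGGT",
    "haplotype 4": "GCTGCA",
}


def find_tag_snps(haplotypes):
    """Return the smallest tuple of SNP positions that tells all
    haplotypes apart, trying every subset in order of increasing size."""
    n_snps = len(next(iter(haplotypes.values())))
    for size in range(1, n_snps + 1):
        for positions in combinations(range(n_snps), size):
            patterns = {
                "".join(seq[i] for i in positions)
                for seq in haplotypes.values()
            }
            # The subset works if every haplotype yields a distinct pattern.
            if len(patterns) == len(haplotypes):
                return positions
    return None  # Two or more haplotypes are identical at every SNP.


def identify_haplotype(haplotypes, tag_positions, observed):
    """Match the alleles observed at the tag SNPs to a known haplotype."""
    for name, seq in haplotypes.items():
        if "".join(seq[i] for i in tag_positions) == observed:
            return name
    return None  # Pattern not among the catalogued haplotypes.


tags = find_tag_snps(HAPLOTYPES)
print("tag SNP positions:", tags)
print("A-T-C matches:", identify_haplotype(HAPLOTYPES, tags, "ATC"))
```

With this invented data, three of the six SNPs are enough to separate the four haplotypes, because the remaining SNPs carry the same information as the chosen tags. Exhaustive search is practical only for small regions; it is used here because it keeps the example short.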

The HapMap has also yielded evolutionary insights. For example, HapMap data support the hypothesis that, as domestication of dairy animals increased milk consumption, variations in lactase gene expression that enabled humans to retain the ability to digest lactose in adulthood were selectively favored and quickly rose to high frequency in human populations.[40,41]

In addition, once tag SNPs associated with a given disease have been identified, the HapMap can be used to quickly find candidate genes for further study, for example through the development of animal models (e.g. knockout mice) in which diagnostic and therapeutic techniques can be tested and refined in preparation for phase I human clinical trials.

Analysis of the HapMap data has also been used to determine estimates of genetic population structure.[41] As expected, a high similarity was observed between the Han Chinese cohort from Beijing and the Japanese cohort from Tokyo. Interestingly, discrete regions of dissimilarity were found between these two populations, as well as between the Yoruba cohort from Nigeria and Whites of European descent, which could be explored to identify potential candidate genes underlying the differences in cardiovascular disease phenotypes among various ethnic populations.[42,43]

