Prevalence in the United States of Selected Candidate Gene Variants: Third National Health and Nutrition Examination Survey, 1991-1994

Man-huei Chang; Mary Lou Lindegren; Mary A. Butler; Stephen J. Chanock; Nicole F. Dowling; Margaret Gallagher; Ramal Moonesinghe; Cynthia A. Moore; Renée M. Ned; Mary R. Reichler; Christopher L. Sanders; Robert Welch; Ajay Yesupriya; Muin J. Khoury; CDC/NCI NHANES III Genomics Working Group


Am J Epidemiol. 2009;169(1):54-66. 

Abstract and Introduction


Population-based allele frequencies and genotype prevalence are important for measuring the contribution of genetic variation to human disease susceptibility, progression, and outcomes. Population-based prevalence estimates also provide the basis for epidemiologic studies of gene-disease associations, for estimating population attributable risk, and for informing health policy and clinical and public health practice. However, such prevalence estimates for genotypes important to public health remain undetermined for the major racial and ethnic groups in the US population. DNA was collected from 7,159 participants aged 12 years or older in Phase 2 (1991-1994) of the Third National Health and Nutrition Examination Survey (NHANES III). Certain age and minority groups were oversampled in this weighted, population-based US survey. Estimates of allele frequency and genotype prevalence for 90 variants in 50 genes chosen for their potential public health significance were calculated by age, sex, and race/ethnicity among non-Hispanic whites, non-Hispanic blacks, and Mexican Americans. These nationally representative data on allele frequency and genotype prevalence provide a valuable resource for future epidemiologic studies in public health in the United States.
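Because certain groups were oversampled, each participant stands for a different number of people in the population, so prevalence has to be computed from sample weights rather than raw counts. The following is a minimal sketch of that idea only, not the survey's actual estimation procedure. The function names, the two-letter genotype coding, and the weights are hypothetical illustrations, not NHANES III data, and the sketch omits the design-based variance estimation (strata and sampling units) that a complex survey analysis would also need.

```python
from collections import defaultdict


def weighted_genotype_prevalence(records):
    """Return the weighted share of each genotype.

    records: iterable of (genotype, weight) pairs, where genotype is a
    two-character string such as "AG" and weight is the number of people
    in the population that the participant represents (hypothetical).
    """
    totals = defaultdict(float)
    for genotype, weight in records:
        totals[genotype] += weight
    total_weight = sum(totals.values())
    return {g: w / total_weight for g, w in totals.items()}


def weighted_allele_frequency(records):
    """Return the weighted frequency of each allele.

    Each participant carries two alleles, so every character of the
    genotype string contributes the participant's weight once.
    """
    counts = defaultdict(float)
    for genotype, weight in records:
        for allele in genotype:
            counts[allele] += weight
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


# Hypothetical participants: (genotype, sample weight). Not survey data.
sample = [
    ("AA", 1000.0),
    ("AG", 3000.0),
    ("GG", 1000.0),
]

prevalence = weighted_genotype_prevalence(sample)
allele_freq = weighted_allele_frequency(sample)
```

With these illustrative weights, the heterozygous participant counts three times as much as either homozygous participant, which is the effect oversampling corrections are meant to capture. Unweighted counts would instead treat the three participants equally.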


Completion of the human genome sequence[1,2,3] and recent advances in the analysis of genome-wide associations for several common diseases[4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] are generating tremendous opportunities for epidemiologic studies to evaluate the role of genetic variants in the etiology of common human diseases. Identification of allelic variants has accelerated as a result of the cataloging and mapping of single nucleotide polymorphisms (SNPs) throughout the genome by the International HapMap Project[21,22,23] and characterization of the scope of structural variation, including copy number variants, in the genome.[24,25,26,27] Application of these advances to improve public health requires assessing the frequency of these variants in distinct populations, identifying diseases influenced by these variants, determining the magnitude of the associated risks, and elucidating gene-gene and gene-environment interactions. Although the number of published investigations in these areas of human genome epidemiology has increased rapidly, with publication of more than 6,000 reports yearly,[28] methodological issues have made it difficult to integrate the evidence and, thus, to easily translate the findings into public health improvements.[29,30,31]

Early studies of genotype prevalence used samples that were convenient to obtain, and minimal information was provided on the selection of participants.[31] In addition, most estimates were calculated from data on small study populations, which limited the accuracy of estimates of allele frequency and genotype prevalence. Furthermore, frequencies for most genetic polymorphisms have been measured only in select US racial and ethnic groups and have not been presented by age group or by sex. Although select polymorphism frequencies have been reported in large populations,[32,33] these studies were community based or drew on controls from larger case-control studies. In contrast, data on genetic variants can be obtained from large, well-designed, epidemiologically well-characterized, and population-based US surveys such as the Third National Health and Nutrition Examination Survey (NHANES III).[34,35] These data are a unique resource for epidemiologic research to assess genetic variation in the population, gene-disease associations, gene-gene and gene-environment interactions, and population-attributable risk for genetic variants.

In particular, NHANES III offers the opportunity to assess genetic variation among major racial and ethnic groups in the United States, for whom multiple health disparities exist.[36,37,38,39,40] Health disparities result from the complex interactions of social, environmental, behavioral, and genetic influences in a diverse population.[36,41,42] Public health strategies to address health disparities are more likely to be effective when they are based on sound integration of such risk information at the population level. NHANES III is a paradigm for complex analysis of unbiased, population-based data on social, environmental, behavioral, and biologic characteristics, including genetic variation, in relation to health status.

