Recent Insights into the Pathogenesis of Colorectal Cancer

Ajay Goel; Clement Richard Boland


Curr Opin Gastroenterol. 2010;26(1):47-52.

Genome-wide Association Studies

Genome-wide association studies (GWAS) provide a powerful approach for high-throughput identification of common, low-penetrance alleles that can modify the risk for multiple diseases, including colorectal cancer (CRC). These minor but common variations in coding or noncoding DNA sequences are referred to as single nucleotide polymorphisms (SNPs). At one end of the disease spectrum are syndromic familial CRC diseases caused by rare mutations in high-penetrance genes, such as adenomatous polyposis coli (APC), which causes familial adenomatous polyposis, and the DNA mismatch repair (MMR) genes, which cause Lynch syndrome. However, these mutations explain only about 3–4% of CRC. The human genome project and extensive linkage analysis suggest that the principal genes causing these high-penetrance diseases have essentially been identified. Other familial clusters of CRC are currently thought to arise through the influence of common, low-penetrance genetic factors.

GWAS technology allows linkage analysis of hundreds of thousands of SNPs simultaneously, making it possible to determine linkage using sets of tag SNPs that correspond to common variants in a genome. This permits determination of disease associations without a priori knowledge of the location or function of the DNA sequence.[3] These data have provided an unprecedented opportunity to better understand the role of common genetic variants in the cause of cancer and other diseases. Although the GWAS concept has existed for years, it was not until late 2007 that the first common low-penetrance susceptibility variant was associated with the risk of CRC.[4] Since then, 10 variants have been linked to CRC and replicated in multiple studies through genotype analysis of tens of thousands of individuals, as summarized in Table 1.[4–17] The initial data indicate that these variants exert relatively minor effects on cancer risk by themselves; however, combinations of multiple variants, considered together with environmental exposures, offer a promising route to robust predictive models for CRC risk stratification.

Identification of Novel Susceptibility Loci by Genome-wide Association Studies in Colorectal Cancer

Most of the published GWAS on CRC have been undertaken by British[4–6,8] and Canadian[5] researchers. These studies were performed in two phases. Typically, the first phase used modest sample sizes (~1000 patients and controls); these first-phase studies identified six novel CRC susceptibility loci mapping to 8q23,[9] 8q24,[5] 10p14,[9] 11q23,[6] 15q13[7] and 18q21.[6,8,18] Although statistically significant, these common sequence variants had relatively modest effects on CRC risk [odds ratios (ORs) were no more than ≈1.2]. These initial studies were followed by a meta-analysis[10] and a second phase of studies with tens of thousands of patients and controls, which identified four new risk loci mapping to chromosomes 14q22, 16q22, 19q13 and 20p12; however, these had even smaller effect sizes (ORs ≈1.1). Most of these data have been successfully replicated in multiple independent studies, and susceptibility loci mapping to 8q23,[15] 8q24,[12,13,16] 11q23[14,15] and 18q21[12,17] have been validated in different populations.
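To make the scale of these effect sizes concrete, the following is a minimal sketch of how an allelic odds ratio and an approximate 95% confidence interval can be computed from a 2×2 table of allele counts. The function name and all counts are hypothetical and chosen only so that the OR comes out near 1.2; they are not taken from any of the cited studies, and a real GWAS would additionally apply genome-wide multiple-testing correction.

```python
import math

def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Return (OR, lower, upper) for a 2x2 allele-count table.

    The interval is a 95% Wald interval computed on the log-odds scale.
    All four counts must be positive.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    z = 1.96  # normal quantile for a two-sided 95% interval
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )

# Hypothetical allele counts: ~1000 cases and ~1000 controls (2000 alleles each)
small = allelic_odds_ratio(1090, 910, 1000, 1000)

# The same allele frequencies with ten times as many participants
large = allelic_odds_ratio(10900, 9100, 10000, 10000)

for label, (or_, lo, hi) in (("first phase", small), ("second phase", large)):
    print(f"{label}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With identical allele frequencies, the larger sample yields the same point estimate but a much narrower interval, which is one reason why the second-phase studies could detect loci with even smaller effects.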

Although the discovery of these novel susceptibility loci for CRC generated enthusiasm, some investigators began to question the causal role and biological significance of these variants, as none of these loci was located within or near a coding (exonic) sequence. All loci were found in noncoding introns, some so devoid of possible coding sequences or transcriptional activity that they were referred to as gene deserts. For instance, the variants on chromosome 8q24 that were associated with CRC and other tumors are 330 kb away from the nearest gene. Notably, however, five of the 10 SNPs identified tag linkage disequilibrium blocks that include or are near genes of the transforming growth factor-beta superfamily signaling pathway, including SMAD7, gremlin 1 (GREM1), BMP2, BMP4 and rhophilin-like protein (RHPN2). These data suggest that, although each susceptibility variant generally has a modest effect on CRC risk, the functional effects might be large if a combination of critical variants were present in a single individual.[14]
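The idea that individually modest variants could add up can be illustrated with a short sketch. It assumes a simple log-additive (multiplicative) model in which loci act independently and each risk allele multiplies the odds by that locus's per-allele OR; this model, the locus labels and the OR values (set to the approximate magnitudes quoted above) are illustrative assumptions, not results reported by the cited studies.

```python
import math

# Hypothetical per-allele odds ratios, set to the approximate magnitudes
# quoted in the text (about 1.2 and about 1.1). Locus labels are placeholders.
PER_ALLELE_OR = {
    "locus_A": 1.2,
    "locus_B": 1.2,
    "locus_C": 1.1,
    "locus_D": 1.1,
}

def combined_odds_ratio(risk_allele_counts, per_allele_or=PER_ALLELE_OR):
    """Combine per-allele ORs under an assumed log-additive model.

    risk_allele_counts maps each locus to 0, 1 or 2 copies of the risk allele.
    The result is relative to a person carrying no risk alleles at these loci.
    """
    log_odds = 0.0
    for locus, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{locus}: allele count must be 0, 1 or 2")
        log_odds += count * math.log(per_allele_or[locus])
    return math.exp(log_odds)

no_risk = combined_odds_ratio({locus: 0 for locus in PER_ALLELE_OR})
all_risk = combined_odds_ratio({locus: 2 for locus in PER_ALLELE_OR})

print(f"no risk alleles:  combined OR = {no_risk:.2f}")
print(f"all risk alleles: combined OR = {all_risk:.2f}")
```

Under these assumptions, a person homozygous for the risk allele at all four placeholder loci has roughly three times the odds of a person carrying none. The reference here is a carrier of no risk alleles rather than the population average, and the sketch ignores interactions between loci and with environmental exposures.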

Functional Evidence for Genome-wide Association Studies Identified Loci in Colorectal Cancer

In a significant development, two independent research groups made discoveries that offer insight into the functional role of one of the three 8q24 variants, rs6983267. The haplotype containing the rs6983267 G allele is quite common: it is found in 50% of Europeans and nearly 100% of Africans. Homozygosity for the G allele of this SNP increases CRC risk 1.5-fold (a relatively weak effect), but this allele shows a relative copy number increase during tumor development.[19••,20••] The novel finding is that this region acts as a transcriptional enhancer and contains a sequence that can enhance Wnt signaling, a key pathway in CRC. Furthermore, regulatory elements of MYC are located within the gene desert on chromosome 8q24.[20••] This region also preferentially binds the transcription factor T cell factor 7-like 2 (TCF7L2), which is a key participant in the Wnt signaling cascade.[19••,20••] These data illustrate the utility of GWAS and provide important insights into the connection among the SNP on 8q24, activated Wnt signaling, increased MYC expression and CRC. This concept is likely to be exploited in future studies of other SNPs to better understand the pathogenesis of many common diseases.

