Epidemiology of Inflammatory Bowel Disease
Although inflammatory bowel disease (IBD) is found worldwide, the majority of studies have focused on white populations in North America and Europe. The incidence (or number of new cases per year) and prevalence (total number of cases in the population) rates of IBD vary across populations and geographic locations. A number of publications have reviewed the differences in observed rates that have historically been attributed to social and economic development, industrialization, and a general conversion to the Western lifestyle. Although individual studies suggest different outcomes with regard to the incidence of IBD (from rates having hit a plateau to others suggesting both increases and decreases), collectively, these reports suggest that rates are increasing or at the very least stable across the populations that have been studied.
The differences in these incidence and prevalence rates are due in part to genetic differences across populations. A recent review by Molodecky and colleagues showed the highest prevalence of IBD in Canada and Europe, with the lowest prevalence in Asia. These prevalence rates indicate that IBD affects as many as 1 in 200 individuals across Europe and as many as 1 in 300 individuals in North America (1 in 400 for ulcerative colitis [UC] and 1 in 300 for Crohn disease [CD]). The incidence rates in North America were among the highest reported (0 to 19.2 per 100,000 for UC and 0 to 20.2 per 100,000 for CD). Although the incidence of IBD continues to be highest in whites and individuals of Jewish descent, the rates in Hispanic and Asian populations appear to be on the rise. Additional studies in populations of developing countries are needed to better understand these rates worldwide.
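The prevalence figures above are quoted as "1 in N" while the incidence figures are quoted per 100,000, so it can help to put both on one scale. The short Python sketch below does only that unit conversion. The function name `per_100k` and the dictionary labels are illustrative choices rather than anything taken from the cited review, and the inputs are the rounded upper-end figures quoted in the text.

```python
def per_100k(one_in_n: float) -> float:
    """Convert a '1 in N' prevalence into cases per 100,000 people."""
    if one_in_n <= 0:
        raise ValueError("one_in_n must be positive")
    return 100_000 / one_in_n


# Rounded prevalence figures quoted in the text (upper-end estimates).
prevalence_one_in = {
    "IBD, Europe": 200,
    "IBD, North America": 300,
    "UC, North America": 400,
    "CD, North America": 300,
}

# Same figures expressed per 100,000, rounded to one decimal place.
prevalence_per_100k = {
    label: round(per_100k(n), 1) for label, n in prevalence_one_in.items()
}

for label, rate in prevalence_per_100k.items():
    print(f"{label}: about {rate} per 100,000")
```

On this scale a prevalence of 1 in 200 corresponds to 500 per 100,000, far above the upper end of the annual incidence ranges quoted above (roughly 20 per 100,000). That gap is what one would expect for a chronic condition, in which cases accumulate over many years.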
Additional factors, such as age and gender, must also be considered when discussing the incidence and prevalence of IBD and the overall influence of genetic factors. IBD is generally considered a disease of early adulthood, with a primary peak in incidence between 15 and 30 years of age, which creates a substantial impact on a patient’s long-term productivity and general well-being. There have also been reports of a secondary incidence peak in 50- to 70-year-olds, but these findings have much less support across populations. The earlier age of onset seen in familial forms of both CD and UC, when compared with sporadic cases, provides further suggestive evidence of a strong genetic component. Taken as a whole, reports are inconsistent on whether there is a significant gender difference in IBD. In CD there seems to be a greater prevalence in females, particularly in familial cases, whereas in UC there may be a slight increase in males. Although these gender ratios may at some level represent an epigenetic effect on IBD pathology, they appear to be highly dependent on age, population, and geographic region.
The Complex Landscape of IBD Genetics
There is overwhelming evidence for the role of genetics in IBD, beginning with initial reports of familial clustering. An early report by Orholm and colleagues noted a 10-fold increased risk of CD in first-degree relatives of patients with CD, and an 8-fold increased risk of UC in first-degree relatives of patients with UC. Furthermore, this group noted the likely genetic overlap between CD and UC, as relatives of CD or UC probands were at increased risk for both diseases when compared with the general population. Another clear measure of the strong genetic influence in IBD comes from numerous twin studies. The most recent review combines data from previous studies and highlights the possibility that genetics may play a stronger role in CD (monozygotic [MZ] twin concordance rate ∼30% vs dizygotic [DZ] twin concordance rate ∼4%) than in UC (MZ ∼15% vs DZ ∼4%). Though these concordance rates implicate a robust genetic component, the fact that they are well below 100% indicates that genes alone are not sufficient to cause disease.
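The twin comparison above can be made concrete with a small calculation. The Python sketch below takes the approximate concordance rates quoted in the text and computes two descriptive quantities: the ratio of MZ to DZ concordance, and the share of MZ pairs that remain discordant. Both are crude summaries chosen here for illustration; neither is a formal heritability estimate, which would require liability-scale modeling and population prevalence data not given in this section. The variable and function names are assumptions of this sketch.

```python
# Approximate twin concordance rates quoted in the text, in percent.
concordance = {
    "CD": {"MZ": 30.0, "DZ": 4.0},
    "UC": {"MZ": 15.0, "DZ": 4.0},
}


def mz_dz_ratio(rates: dict) -> float:
    """Ratio of monozygotic to dizygotic concordance (descriptive only)."""
    return rates["MZ"] / rates["DZ"]


# How many times more often MZ co-twins are both affected than DZ co-twins.
ratios = {disease: mz_dz_ratio(r) for disease, r in concordance.items()}

# Percentage of MZ pairs in which only one twin is affected.
discordant_mz = {disease: 100.0 - r["MZ"] for disease, r in concordance.items()}

for disease in concordance:
    print(
        f"{disease}: MZ/DZ ratio {ratios[disease]:.2f}, "
        f"{discordant_mz[disease]:.0f}% of MZ pairs discordant"
    )
```

With these inputs the MZ/DZ ratio is twice as large for CD as for UC, in line with the suggestion that genetics plays a stronger role in CD, while most genetically identical pairs are still discordant for either disease.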
As we understand from numerous studies across complex genetic disorders, genetic background is not independent of environmental influences. Environmental factors (eg, gut flora, dietary changes, pollution exposure, microbial exposures, lifestyle changes, smoking, and geography) are likely to have a strong influence on the underlying genetic susceptibility.
Even though there is considerable evidence for genetic influence, IBD does not typically follow a simple Mendelian model of inheritance within a family. The only exception is that rare, autosomal recessive mutations found in the interleukin-10 (IL-10) receptor and IL-10 cytokine have been shown to be sufficient to cause severe forms of CD in infants.
The last 15 years have seen tremendous progress in the identification of genetic loci involved in IBD. This has happened in part because of technological advances and growth in the genetic approaches used to identify these genes. Multiple linkage studies in the mid to late 1990s, utilizing multiple affected families, identified a handful of genomic regions of interest. One of these regions on chromosome 16, when combined with candidate gene approaches, led to the identification of the first IBD susceptibility gene, namely, NOD2 (nucleotide oligomerization domain 2). As the field continues to evolve, the last 5 years have seen the largest growth in the number of genetic loci identified in IBD. This is largely because of the expansion of large consortia using genome-wide association study (GWAS) approaches. These GWASs have been performed almost exclusively in North American and European white populations and have used genotyping arrays of hundreds of thousands of single-nucleotide polymorphisms (SNPs), which are spread throughout the genome. Combined, these studies have identified approximately 100 genetic regions (71 CD, 47 UC, and 28 across both) demonstrating genome-wide significant association with IBD. The large overlap of genetic loci seen across CD and UC is consistent with expectations based on clinical and epidemiologic predictions and will likely provide key insight into disease pathophysiology. The genetic loci identified for IBD point to a number of relevant biological pathways, including the IL-23 pathway (suggesting a role in the maintenance of intestinal immune homeostasis), IL-10 signaling, and overall leukocyte trafficking. Despite the apparent overlap, some distinctions do appear to be emerging.
The genetic variation relevant in CD continues to point toward the body’s mismanagement of microbe recognition and processing of intracellular bacteria by the innate immune system, with a more specific focus on regulation of autophagy. Meanwhile, the story for UC has a slightly different focus. UC continues to have a noticeably stronger association with the human leukocyte antigen class II genes compared with CD, suggesting that genes across the major histocompatibility complex confer a stronger risk in UC. Genes identified thus far for UC appear to focus on intestinal barrier integrity and function.
Despite this vast expansion in the number of known loci from just a few years earlier, NOD2 continues to have the strongest individual effect on risk of IBD. Moreover, these approximately 100 loci collectively account for only a limited proportion of the genetic heritability of either CD or UC, with only about 23% and about 16%, respectively, of the genetic contribution defined.
Although it is clear that GWASs have provided invaluable insight into the genetic contributions to IBD, they fall short of their initial promise to identify strong genetic effects through genetic tagging via common variation. The hallmark of these large commercial genome-wide screening arrays, now at the level of 2 to 5 million SNPs, has been to provide the most common (based on allele frequency) markers that best tag the known variation across the genome. These panels have implicitly focused on testing the “common-disease common-variant” hypothesis, which predicts that common alleles will be found to be in and of themselves disease causing.
Recent technologic advances in genomic sequencing, so-called “next-generation sequencing,” have helped pave the way for the identification of rarer genetic variation that may manifest itself in common diseases. The identification and characterization of these variants will help to test the “common-disease multiple rare-variant” hypothesis, which states that susceptibility to common diseases is determined by a large number of rare variants of stronger effect. Franke and colleagues highlight the relevance of this hypothesis in IBD as it relates to rare genetic variation of stronger effect within NOD2. They note that the most associated SNP within their analysis explains just 0.8% of genetic variance, whereas the 3 NOD2 coding mutations (noted as mutations because these variants have shown functional effects) themselves account for nearly 5% of the heritability of CD. They further point out that if this same situation were relevant to even a portion of the nearly 100 genes, a much more significant portion of the overall heritability would be explained. These findings underscore the need to characterize further the genetic regions that we have already identified. The latest work by Rivas and colleagues further emphasizes the benefit of deep resequencing of currently known IBD loci. They find a number of additional independent risk factors in known IBD genes (including NOD2, IL23R, and CARD9) and additional associations with coding variants in other previously identified IBD risk loci that are predicted to have direct functional consequences.
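The heritability figures quoted in this section can be lined up with a few lines of arithmetic. The Python sketch below uses only the rounded percentages given in the text (about 23% and 16% of heritability defined for CD and UC, 0.8% for the most associated SNP, and nearly 5% for the 3 NOD2 coding mutations). It computes the share left unexplained and how many times more the NOD2 coding mutations explain than the top SNP. The variable names are illustrative, and because the inputs are approximate the outputs should be read as rough magnitudes only.

```python
# Approximate share of genetic heritability explained by known loci (percent).
explained_pct = {"CD": 23.0, "UC": 16.0}

# Remaining share not yet accounted for by the known loci.
unexplained_pct = {disease: 100.0 - pct for disease, pct in explained_pct.items()}

# Figures attributed in the text to Franke and colleagues (percent, rounded).
top_snp_pct = 0.8        # most associated single SNP
nod2_coding_pct = 5.0    # the 3 NOD2 coding mutations ("nearly 5%")

# How many times more the NOD2 coding mutations explain than the top SNP.
fold_difference = nod2_coding_pct / top_snp_pct

for disease, pct in unexplained_pct.items():
    print(f"{disease}: about {pct:.0f}% of heritability still unexplained")
print(f"NOD2 coding mutations vs top SNP: roughly {fold_difference:.1f}-fold")
```

On these rounded inputs, more than three quarters of the heritability of each disease remains unexplained, and the rare NOD2 coding mutations account for roughly six times as much as the single best common marker, which is the contrast that motivates resequencing of known loci.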
Because technologic advances are allowing for whole-genome sequencing at a near cost-effective level, the future of genetics and genomics is now a reality that most researchers and clinicians could not imagine just a few years ago. Although the technology and our current genetic approaches have by many accounts been very successful, we still have quite a way to go to explain the complex genetic landscape of IBD.