Genetics of Primary Sclerosing Cholangitis

Chromosome	Plausible risk gene	Study
	MMEL1, TNFRSF14	Folseraas et al. (2012) [56]
	BCL2L11	Melum et al. (2011) [59]
	CD28, CTLA4	Liu et al. (2013) [27]
	CCL20	Ellinghaus et al. (2016) [45]
	GPR35	Ellinghaus et al. (2013) [60]
	MST1	Melum et al. (2011) [59]
	NFKB1	Ellinghaus et al. (2016) [45]
	IL2, IL21	Liu et al. (2013) [27]
	BACH2	Liu et al. (2013) [27]
	The HLA complex	Karlsen et al. (2010) [32]
	IL2RA	Srivastava et al. (2012) [61]
	SIK2	Liu et al. (2013) [27]
	HDAC7	Liu et al. (2013) [27]
	RFX4, RIC8B	Ellinghaus et al. (2016) [45]
	SH2B3, ATXN2	Liu et al. (2013) [27]
	CLEC16A, SOCS1	Ellinghaus et al. (2016) [45]
	TCF4	Ellinghaus et al. (2013)
	CD226	Liu et al. (2013) [27]
	PRKD2, STRN4	Liu et al. (2013) [27]
	PSMG1	Liu et al. (2013) [27]

The risk gene annotation at each locus is based on circumstantial evidence, and no conclusive reports exist linking a PSC-associated genetic variant to distinct disease mechanisms. Such studies are urgently needed but are often hampered by limited genetic insight into the risk loci (i.e. there is often more than one gene at an associated locus).

Table 8.2

Primary sclerosing cholangitis susceptibility loci identified by two independent analytical assessments but not reaching the formal genome-wide significance threshold (P ≤ 5 × 10⁻⁸)

	Liu et al. (2013) [27]		Ellinghaus et al. (2016) [45]
Chromosome	Lead SNP	Gene	Lead SNP	Gene
2	rs12479056	PUS10, REL	rs7608910	PUS10
2	rs11676348	TGR5, ARPC2, CXCR1/2	rs11676348	CXCR2
8	rs10956390 rs13255292 rs2977035	PVT1, MIRs 1204–1208	rs2042011	RN7SKP226
10	rs7923837	HHEX	rs2497318	EIF2S2P3
10	rs10883371	NKX2–3	rs10748781	NKX2–3
11	rs694739	PRDX5	rs559928	NA
16	rs7404095	PRKCB	rs7404095	PRKCB
18	rs2847297	PTPN2	rs12968719	PTPN2
19	rs601338	FUT2	rs679574	FUT2
21	rs11203203	UBASH3A	rs1893592	UBASH3A

SNP, single nucleotide polymorphism (a genetic marker used in genome-wide association studies). The risk gene annotation at each susceptibility locus is based on circumstantial evidence rather than on causal or conclusive findings; hence, the annotations differ in some instances between the two studies.

GWAS and Liver Disease Genetics

The first successful GWAS was published in 2005 [4], and the following decade saw widespread application of the study design, leading to the identification of more than 1,000 risk loci in a variety of human complex traits, reported in more than 2,000 original publications. A GWAS is, in simple terms, a case-control association analysis that compares the frequencies of genetic variants spread throughout the genome between two groups, patients and healthy controls. It is a scientific experiment, requiring a clear hypothesis, a well-defined phenotype and appropriate interpretation. The interpretation of any genetic association, in which allele frequencies at one or more loci differ between cases and controls, must take account of the study design as well as the population studied. Hence disease risk and disease severity, for example, are distinct questions answered in different ways. Risk loci (susceptibility loci) are defined as chromosomal regions (sometimes within single genes, i.e. susceptibility genes) where there is a statistically significant difference in the occurrence of particular variants between patients and controls (Fig. 8.1). Notably, it is not always possible to confidently assign a gene to an identified risk locus, and caution is therefore needed in making immediate biological interpretations of findings, not least because of the complexity of genetic interactions now recognised across the genome. Equally, readers should be very sceptical of any study in the current era that adopts outdated single gene/variant analyses, unless it is apparent that appropriate validation cohorts are included in such candidate gene studies.
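The case-control comparison described above can be illustrated for a single variant. The following is a minimal sketch, not the pipeline of any cited study: it assumes a plain allelic chi-square test (1 degree of freedom, no continuity correction, no adjustment for covariates or population structure), and the function name and allele counts are hypothetical.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Allelic chi-square test and odds ratio for one variant.

    Arguments are allele counts (two alleles per person) in cases and
    controls; "alt" is the candidate risk allele, "ref" the other allele.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    # Pearson chi-square for a 2x2 table, without continuity correction.
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 2,000 cases and 10,000 controls, with a risk allele
# frequency of 0.25 in cases and 0.20 in controls.
chi2, p_value, odds_ratio = allelic_association(1000, 3000, 4000, 16000)
print(f"chi2 = {chi2:.1f}, P = {p_value:.1e}, OR = {odds_ratio:.2f}")
```

In this made-up example a modest odds ratio (about 1.33) reaches a very small P value only because several thousand individuals are included, which reflects the sample-size demands of the design.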

Fig. 8.1

The importance of HLA associations in PSC and other autoimmune diseases. The figure shows a selection of so-called Manhattan plots from genome-wide association studies. The X axis of each plot shows the chromosome and position, and the Y axis shows the significance level of association testing at each position. The purpose of the figure, with primary sclerosing cholangitis as the centre plot, is to highlight the immense predominance of the HLA associations at chromosome 6 (plotted in red). Similar HLA associations can be seen to a variable extent in a multitude of other diseases, most strongly in prototypical autoimmune diseases. The non-HLA associations are plotted in black (further information on individual gene studies can be found at https://www.ebi.ac.uk/gwas/).

Given the high number of genetic variants tested (typically now around 1,000,000), statistical significance thresholds are stringently set by convention at P ≤ 5 × 10⁻⁸ (so-called genome-wide significance) to avoid false-positive findings (type 1 errors), and external validation of findings is generally sought as well. Inherent to the study design (association analysis), variants at risk loci must have a relatively high frequency to be detectable (i.e. they are ‘common’, typically with a frequency above 1–5 % in the general population), and being common they generally also exert a relatively low impact on disease risk (odds ratio typically below 1.5) [5]. The latter fact also implies that large collections of cases and controls, preferably thousands, are required for the study design to be useful, and the networks organised to recruit patients for DNA collection have promoted a collaborative, international working environment, which should be considered a beneficial ‘side effect’ of GWAS [6]. For rare diseases like PSC, the statistical stringency and the low effect size of the implicated variants inevitably lead to false negatives (type 2 statistical errors), and this has to be kept in mind as a limitation of the data reviewed here.
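The arithmetic behind the genome-wide significance convention can be sketched briefly. This example assumes a simple Bonferroni correction of a nominal 5 % significance level over roughly one million tests; the text above gives the number of tests and the threshold, while the Bonferroni framing and the function names are illustrative assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold that keeps the family-wise error near alpha."""
    return alpha / n_tests


def expected_false_positives(p_threshold, n_tests):
    """Expected number of false positives if no variant is truly associated."""
    return p_threshold * n_tests


n_variants = 1_000_000  # approximate number of variants tested, as in the text

threshold = bonferroni_threshold(0.05, n_variants)
print(f"Genome-wide threshold: {threshold:.1e}")

# Without correction, a nominal P < 0.05 would flag tens of thousands of
# variants by chance alone.
uncorrected = expected_false_positives(0.05, n_variants)
print(f"Expected chance findings at P < 0.05: {uncorrected:.0f}")
```

The same stringency that suppresses chance findings is what makes false negatives likely when case numbers are limited, as they are for PSC.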

During the 1990s, liver disease genetics was dedicated to Mendelian traits, starting with the identification of genes for hyperbilirubinemias and Wilson’s disease [7–12], followed by a strong focus on cholestasis and hemochromatosis [13–23]. The interpretation of the gene findings in these studies has greatly influenced thinking about susceptibility genes in non-Mendelian (i.e. complex) diseases like PSC as well. This is important to be aware of, since the genetics determined by GWAS represent fundamentally different mechanisms of causality. In Mendelian diseases, there is an approximately 1:1 relationship between genetic aberrations and disease traits (the genetic variants ‘cause disease’, frequently because the mutations are structurally damaging to protein function). That said, given the time taken to understand even Mendelian diseases mechanistically, it is worth reflecting that disease penetrance and clinical phenotype are often not easily explained by a single mutation-single effect model, even for diseases as classically monogenic as hemochromatosis and Wilson’s disease.

For GWAS findings, contextual factors (environment, gene-gene interactions, etc.) play a considerably greater role than in Mendelian genetics (Fig. 8.2), making it inappropriate to regard susceptibility genes as causal (the disease-associated genetic variants in GWAS ‘do not cause disease’). This distinction between Mendelian genetics and ‘GWAS genetics’ is underlined by the fact that the overall contribution of genetics to complex traits like PSC is limited [24]. GWAS outcomes, even by mathematical extrapolations, are likely to represent a minor fraction (probably less than one third) of the susceptibility to complex traits [25, 26], and in PSC the findings so far account for less than 10 % of the overall disease liability [27].

Fig. 8.2

Relative impact of genetic versus non-genetic factors in PSC. Genetic studies emphasise that the genetic contribution to overall primary sclerosing cholangitis (PSC) liability is low and that interacting and co-occurring environmental factors (white) are likely important. Outcomes of genome-wide association studies (GWAS; light blue) may aid in the identification of such factors, since the common variants have been exposed to the historical environment. Despite an increasing number of reported risk loci (at present 16), a fraction of the heritable contribution to PSC pathogenesis is not detectable by GWAS due to limitations of the study design (dark blue) (Reprinted with permission from Ref. [24])

The relative importance of genetic influences in PSC is also evident in studies of heritability. There are no formal twin studies as for many other diseases, but registry data from Sweden yield a hazard ratio estimate of 11.1 in siblings of PSC patients (complicated by the lack of an ICD10 code for precise case identification) [28]. Such an estimate places PSC at the same degree of heritability as most other complex autoimmune and immune-mediated conditions, in which the relative sibling risk is mostly around 10. Notably, this number is very low compared to Mendelian conditions, where the relative sibling risk (depending on the penetrance of the genetic variants involved) ranges from several hundreds to several thousands [29], further underlining how our thinking on the outcomes of genetic studies in complex diseases like PSC must be different from that of monogenic traits (Fig. 8.3).

Fig. 8.3

Distinguishing Mendelian from complex liver affections. For complex phenotypes, the contribution from genetics to overall disease liability is limited (typically less than 50 %). In addition, only a fraction (for primary sclerosing cholangitis [PSC] less than 10 %) of the genetic susceptibility is known. In both Mendelian and complex disease manifestations, the gene findings serve as clues to the underlying pathophysiology. However, only for the case of Mendelian manifestations of liver affections do genetic findings have clinically useful predictive power (Reprinted with permission from Ref. [43])

The HLA Association in PSC

As can be seen in Fig. 8.1, the genetic findings on chromosome 6 in PSC are several orders of magnitude stronger than those found in any other region. Throughout the rest of the genome, a number of weaker and less significant associations can be found. The important point about this overall genetic architecture in PSC is that it resembles the genetic architecture of prototypical autoimmune diseases, e.g. type 1 diabetes, rheumatoid arthritis and multiple sclerosis. Prior to the genetic studies, it had been questioned whether PSC could be an autoimmune disease, particularly given the strong male predominance (two thirds of the patients are male) and the lack of efficacy of immunosuppressive therapy. However, autoimmune diseases with a male predominance do exist (e.g. ankylosing spondylitis), and alongside other observed features (autoantibodies [30] and clonality of T cell receptors [31]), genetics clearly positions PSC as an inherently autoimmune condition, albeit one that, perhaps because of its biliary localisation, does not respond to classical immunosuppression. In many respects, this global observation is one of the major outcomes of the genetic studies. This model contrasts with other models of PSC development (e.g. toxic bile acid injury, gut leakage of bacterial components due to IBD [32]), whilst being compatible with models involving the cross-homing or cross-reactivity of lymphocytes between the bowel and the liver [33, 34].
