Introduction to Genetic Renal Disease





Erum Hartung

Terry Watnick

Gregory G. Germino



GENETIC RENAL DISEASE

The success of the human genome project has resulted in dramatic advances in our understanding of inherited renal diseases. There has been an explosion in the number of disease-causing genes that have been identified and in our understanding of the pathogenic mechanisms that underlie these disorders (Tables 14.1 through 14.8). The spectrum of physiologic and developmental pathways that are disrupted is broad and includes defects in isolated transport mechanisms (e.g., cystinuria, primary hypomagnesemia, pseudohypoaldosteronism), defects in complex developmental pathways (e.g., autosomal dominant polycystic kidney disease [ADPKD] and renal coloboma syndrome), and defects in structural proteins (e.g., Alport syndrome and congenital nephrotic syndrome of the Finnish type). The tremendous progress in the field over the past 5 years has put an exhaustive discussion of genetic renal disease far beyond the scope of this chapter. The interested reader is referred to Online Mendelian Inheritance in Man (OMIM) for a more complete description; this Web-based database provides a comprehensive catalogue of diseases, their clinical features, and their molecular genetics (http://www.ncbi.nlm.nih.gov/Omim/). In this chapter, we instead describe the process of gene discovery and how that process has evolved over the past several years. We then review some of the scientific tools that have been applied in the postcloning stages of gene discovery in order to understand important aspects of renal biology. Finally, we consider the clinical implications of these insights and how they may ultimately be applied in patient care.


GENE IDENTIFICATION

The basis of any inherited disease is an underlying alteration in genomic DNA that is transmitted from parent to offspring. Theoretically, one could compare the entire genomic sequence of an individual affected with a particular disease to that of unaffected individuals in order to identify the pathogenic difference. As simple as this approach sounds, it has, until recently, been a daunting challenge due to the size of the diploid human genome (6 × 10⁹ base pairs [bp]), the limited throughput of traditional DNA sequencing methods (10⁶ to 10⁷ bp per day), and the high degree of normal variation in human populations. Important technical and bioinformatic developments have greatly changed the genetic disease discovery landscape, however. Whole genome sequencing is now a reality, and the sub-$1,000 genome will soon be here. Enigmatic clinical disorders that were once deemed too difficult to study using conventional positional cloning approaches because of their rarity are now giving up their secrets. In this section, we will briefly review the history of renal disease gene discovery, highlighting some of the most illustrative examples, and then discuss the impact of next generation sequencing on this field.
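To give a sense of scale for the throughput figures quoted above, the following minimal sketch estimates how long a single pass over a diploid genome would take at traditional sequencing rates. It uses only the two numbers given in the text; the function name is illustrative, and the estimate deliberately ignores redundancy (coverage), assembly, and parallel instruments.

```python
# Back-of-the-envelope estimate using only the figures quoted in the text.
GENOME_BP = 6e9                      # diploid human genome size in base pairs
THROUGHPUT_BP_PER_DAY = (1e6, 1e7)   # range for traditional sequencing methods


def days_for_single_pass(genome_bp, bp_per_day):
    """Days needed to read every base once (no coverage, no assembly)."""
    return genome_bp / bp_per_day


slow = days_for_single_pass(GENOME_BP, THROUGHPUT_BP_PER_DAY[0])
fast = days_for_single_pass(GENOME_BP, THROUGHPUT_BP_PER_DAY[1])
print(f"{fast:.0f} to {slow:.0f} days for one pass over one genome")
```

Under these simplifying assumptions, one pass would take on the order of years for a single individual, which is why direct comparison of whole genomes was impractical before high-throughput methods.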

Prior to the easy availability of inexpensive DNA sequencing, multiple approaches had been developed to facilitate disease gene discovery. With some disease entities, a broader understanding of underlying pathogenic mechanisms had enabled the identification of potential “candidate genes.” Alport syndrome is an example of the successful application of this approach.1 Biochemical analysis of the Alport glomerular basement membrane (GBM) identified a set of α chains of type IV collagen that were missing.2,3,4,5 Molecular techniques were then used to clone the type IV collagen genes, and mutation analyses revealed that sequence variants in a subset of these genes segregated with the disease. In a similar manner, the recognition that individuals suffering from the infantile form of Bartter syndrome have a clinical presentation similar to that of patients on loop diuretics prompted investigators to evaluate the drug’s target, the Na-K-2Cl cotransporter (SLC12A1), as a probable candidate gene. As predicted for this recessive disease, inactivating mutations were found in both alleles in a subset of families.6

For other disorders, the underlying gene defect was identified by expression cloning of candidate genes. In this approach, one identifies genes responsible for a particular function by transferring genetic material into cells that lack that function and then screening for activity. Typically, multiple pools of genes are used for the initial screening, and the search is then focused on only those pools that demonstrate the desired activity. By using an iterative process of serial dilutions and functional testing, one can ultimately identify the gene or genes responsible for the observed activity. Finally, the cloned candidate genes are scanned for sequence differences (mutations) that segregate with disease. Expression cloning has been used most successfully to identify various transporters. The genes implicated in cystinuria, SLC3A1 and SLC7A9, were identified by this approach.7,8,9,10,11 Likewise, the three subunits that make up the epithelial sodium channel, ENaC, were isolated in this manner.12 Inactivating mutations of each of the subunits have been associated with recessive forms of pseudohypoaldosteronism type I,13,14 whereas activating mutations of either the β or γ subunit have been found in Liddle syndrome, an autosomal dominant form of hypertension.15,16
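The pool-and-subdivide logic of expression cloning can be sketched in a few lines. This is only a toy model under stated assumptions: the clone names, the pool sizes, and the `assay` function (which stands in for the functional test performed on cells receiving a pool) are all hypothetical, and a real screen would be noisy rather than perfectly accurate.

```python
def screen_pools(clones, assay, pool_size=100, split=10):
    """Iteratively narrow a clone library to the clones that confer activity.

    assay(pool) -> bool models the functional test on cells that received
    every clone in the pool (hypothetical stand-in for the wet-lab step).
    """
    candidates = list(clones)
    while True:
        # Divide the remaining candidates into pools of the current size.
        pools = [candidates[i:i + pool_size]
                 for i in range(0, len(candidates), pool_size)]
        # Keep only clones that came from pools scoring positive.
        candidates = [clone for pool in pools if assay(pool) for clone in pool]
        if pool_size == 1:
            return candidates
        pool_size = max(1, pool_size // split)


# Toy library: 1,000 clones, two of which (unknown to the screener) are active.
library = [f"clone_{i:04d}" for i in range(1000)]
active = {"clone_0042", "clone_0777"}


def assay(pool):
    return any(clone in active for clone in pool)


hits = screen_pools(library, assay)
print(hits)
```

In this toy run, three rounds of testing (pools of 100, then 10, then single clones) are enough to isolate the active clones without ever assaying all 1,000 clones individually.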










TABLE 14.1 Inherited Disorders of the Glomerulus

Disease


OMIM #


Mode of Inheritance


Chromosomal Localization


Gene Name(s)


Gene Product(s)


Reference(s)


Alport Syndrome









X-linked


301050


X


Xq22.3


COL4A5


α5(IV) collagen


1



X-linked with leiomyomatosis


308940


X


Xq22.3


COL4A5 & COL4A6


α5(IV) and α6(IV) collagen


244



Autosomal recessive


203780


AR


2q35-q37


COL4A3 & COL4A4


α3(IV) and α4(IV) collagen


245



Autosomal dominant


104200


AD


2q35-q37


COL4A3 & COL4A4


α3(IV) collagen and α4(IV) collagen


246,247


Thin Basement Membrane Nephropathy / Benign Familial Hematuriaᵃ


141200


AD


2q35-q37


COL4A3 & COL4A4


α4(IV) collagen


248,249,250


Hereditary Nephrotic Syndromeᵃ

Congenital nephrotic syndrome of the Finnish type / nephrotic syndrome, type 1


256300


AR


19q13.1


NPHS1


Nephrin


108



Idiopathic nephrotic syndrome, steroid resistant / nephrotic syndrome, type 2


600995


AR


1q25-q31


NPHS2


Podocin


41



Nephrotic syndrome, type 3


610725


AR


10q23


PLCE1


Phospholipase C, epsilon-1


251



Isolated diffuse mesangial sclerosis / nephrotic syndrome, type 4


256370


AD


11p13


WT1


Wilms tumor 1, zinc finger protein


252



Denys-Drash syndrome (diffuse mesangial sclerosis, pseudohermaphroditism, Wilms tumor)


194080


AD


11p13


WT1


Wilms tumor 1, zinc finger protein


253



Frasier syndrome (diffuse mesangial sclerosis, pseudohermaphroditism, gonadoblastoma)


136680


AD


11p13


WT1


Wilms tumor 1, zinc finger protein


254,255



Pierson syndrome (microcoria, congenital nephrotic syndrome)


609049


AR


3p21


LAMB2


Laminin, beta-2


256


Hereditary Focal Segmental Glomerulosclerosis (FSGS)ᵃ









FSGS1


603278


AD


19q13


ACTN4


Alpha-actinin 4


257



FSGS2


603965


AD


11q21-q22


TRPC6


Transient receptor potential cation channel 6


258,259



FSGS3


607832


AR/AD


6p12


CD2AP


CD2-associated protein


47,48



FSGS4


612551


AR


22q12.3


APOL1


Apolipoprotein L-1


243



FSGS5


613237


AD


14q32.33


INF2


Inverted formin 2


260



FSGS6


614131


AR


15q21-q22


MYO1E


Myosin 1E


50,261


Glomerulopathy with fibronectin depositsᵃ


601894


AD


2q34


FN1


Fibronectin


262


IgA Nephropathyᵃ


161950


AD


6q22-q23


?


?


263,264





4q26-q31


?


?






17q12-q22


?


?



Nail-Patella syndrome


161200


AD


9q34.1


LMX1B


LIM-homeodomain transcription factor 1, beta


43,44


Epstein and Fechtner syndromes (progressive nephritis, macrothrombocytopenia, leukocyte inclusions, hearing loss)


153650 and 153640


AD


22q11.2


MYH9


Nonmuscle myosin heavy chain 9


265,266


ᵃ Identifies diseases with probable additional, unmapped loci. OMIM, Online Mendelian Inheritance in Man; AR, autosomal recessive; AD, autosomal dominant; X, X-linked; IgA, immunoglobulin A.














TABLE 14.2 Renal Cystic Disorders

Disease


OMIM #


Mode of Inheritance


Chromosomal Localization


Gene Name(s)


Gene Product(s)


Reference(s)


Autosomal Dominant Polycystic Kidney Disease (ADPKD)


173900








ADPKD1, most common


601313


AD


16p13.3


PKD1


Polycystin-1


29,267



ADPKD2


173910


AD


4q21-q23


PKD2


Polycystin-2


34



ADPKD3


600666


AD


?


?


?


268,269,270,271,272


Polycystic Kidney Disease, Infantile Severe, with Tuberous Sclerosis (PKDTS)


600273


AD, sporadic


16p13.3


PKD1 & TSC2 (CGS)


Polycystin-1, tuberin


273,274


Autosomal Recessive Polycystic Kidney Disease (ARPKD)


263200


AR


6p12


PKHD1


Polyductin, fibrocystin


275,276


Renal Cysts and Diabetes Syndrome (RCAD) / Maturity Onset Diabetes of the Young, Type 5 (MODY5)


137920


AD


17q12


HNF1-β (TCF2)


Hepatocyte nuclear factor 1-β (transcription factor 2)


277,278,279


Medullary Cystic Kidney Disease (MCKD)ᵃ









MCKD1


174000


AD


1q21


?


?


280



MCKD2 (allelic to familial juvenile hyperuricemic nephropathy [FJHN])


603860


AD


16p12.3


UMOD


Uromodulin (Tamm-Horsfall protein)


281


Glomerulocystic Kidney Disease with Hyperuricemia and Isosthenuria (allelic to MCKD2 and FJHN)


609886


AD


16p12.3


UMOD


Uromodulin (Tamm-Horsfall protein)


282


Nephronophthisis (NPHP)ᵃ









NPHP1 (juvenile), most common (allelic to Joubert syndrome 4 [JBTS4] and Senior-Loken syndrome 1 [SLSN1])


256100


AR


2q13


NPHP1


Nephrocystin-1


8,283



NPHP2 (infantile)


602088


AR


9q31


INVS (NPHP2)


Inversin


139



NPHP3 (adolescent) (allelic to SLSN3 and Meckel syndrome 7 [MKS7])


604387


AR


3q22


NPHP3


Nephrocystin-3


145



NPHP4 (allelic to SLSN4)


606966


AR


1p36


NPHP4


Nephrocystin-4 (nephroretinin)


3,146,284



NPHP5


See SLSN5








NPHP6


See SLSN6








NPHP7


611498


AR


16p13.3


GLIS2


GLIS family zinc finger protein 2


285



NPHP8


See JBTS7








NPHP9


609799


AR


17q11.1


NEK8


Never in mitosis gene A-related kinase 8


286



NPHP10


See SLSN7








NPHP11 (nephronophthisis and hepatic fibrosis) (allelic to JBTS6/MKS3)


613550


AR


8q21.13-q22.1


TMEM67 (MKS3)


Transmembrane protein-67 (Meckelin)


287



NPHP12


611150


AR


22q13


ATXN10


Ataxin 10


150


Nephronophthisis-like nephropathy (NPHPL1)


613159


AR


22q13.31-q13.33


XPNPEP3


X-prolyl aminopeptidase 3


288


Joubert syndrome (JBTS)ᵃ (nephronophthisis and cerebellar vermis hypoplasia)









JBTS1


213300


AR


9q34.3


INPP5E


Inositol polyphosphate-5-phosphatase, 72-KD


289,290



JBTS2 (allelic to MKS2)


608091


AR


11q13


TMEM216


Transmembrane protein 216


291,292



JBTS3


608629


AR


6q23.3


AHI1


Abelson helper integration site-1 (Jouberin)


293



JBTS4 (allelic to NPHP1 and SLSN1)


609583


AR


2q13


NPHP1


Nephrocystin-1


294



JBTS5 (allelic to SLSN6, MKS4, and Bardet-Biedl syndrome [BBS14])


610188


AR


12q21.3


CEP290 (NPHP6)


Centrosomal protein, 290-KD (nephrocystin-6)


295



JBTS6 (allelic to MKS3 and COACH syndrome)


610688


AR


8q21.13-q22.1


TMEM67 (MKS3)


Transmembrane protein-67 (Meckelin)


296



JBTS7 (allelic to MKS5 and COACH syndrome)


611560


AR


16q12.2


RPGRIP1L (NPHP8)


Retinitis pigmentosa GTPase regulator-interacting protein 1-like


297



JBTS8


612291


AR


3q11.2


ARL13B


ADP-ribosylation factor-like 13B


298



JBTS9 (allelic to MKS6 and COACH syndrome)


612285


AR


4p15.3


CC2D2A


Coiled-coil and C2 domains-containing protein 2A


299,300



JBTS10 (allelic to orofaciodigital syndrome 1 [OFD1])


300804


X


Xp22.3-p22.2


CXORF5 (OFD1)


Chromosome X open reading frame 5


106



JBTS11 (allelic to MKS8)



AR


12q24.31


TCTN2


Tectonic 2


150


Orofaciodigital Syndrome 1 (OFD1) (Malformations of the Face, Oral Cavity, and Digits, with Polycystic Kidneys)


311200


X


Xp22.3-Xp22.2


CXORF5 (OFD1)


Chromosome X open reading frame 5


301,302


COACH Syndrome (Joubert Syndrome with Congenital Hepatic Fibrosis)


216360


AR


Allelic to JBTS6/MKS3 (most common); also JBTS9/MKS6 and JBTS7/MKS5 (see previous)


300,303,304


Senior-Loken Syndrome (SLSN)ᵃ (Nephronophthisis and Retinitis Pigmentosa)









SLSN1 (allelic to NPHP1 and JBTS4)


266900


AR


2q13


NPHP1


Nephrocystin-1


305



SLSN3 (allelic to NPHP3)


606995


AR


3q22


NPHP3


Nephrocystin-3


306



SLSN4 (allelic to NPHP4)


606996


AR


1p36


NPHP4


Nephrocystin-4 (nephroretinin)


146,284



SLSN5


609254


AR


3q21.1


IQCB1 (NPHP5)


IQ motif-containing protein B1 (nephrocystin-5)


307



SLSN6


610189


AR


12q21.3


CEP290 (NPHP6)


Centrosomal protein, 290-KD (nephrocystin-6)


295



SLSN7 (allelic to BBS16)


613615


AR


1q43-q44


SDCCAG8 (NPHP10)


Serologically defined colon cancer antigen 8


52


Meckel Syndrome / Meckel-Gruber Syndrome (MKS)ᵃ (Cystic Kidney Dysplasia, Hepatic Fibrosis, Occipital Meningoencephalocele, Polydactyly)









MKS1 (allelic to BBS13)


249000


AR


17q23


MKS1 (BBS13)


Meckel syndrome type 1 protein (BBS13)


308,309



MKS2 (allelic to JBTS2)


603194


AR


11q13


TMEM216


Transmembrane protein 216


292,310



MKS3 (allelic to JBTS6 and COACH syndrome)


607361


AR


8q21.13-q22.1


TMEM67 (MKS3)


Transmembrane protein-67 (Meckelin)


311



MKS4 (allelic to JBTS5, SLSN6, and BBS14)


611134


AR


12q21.3


CEP290 (NPHP6)


Centrosomal protein, 290-KD (nephrocystin-6)


312



MKS5 (allelic to JBTS7 and COACH syndrome)


611561


AR


16q12.2


RPGRIP1L (NPHP8)


Retinitis pigmentosa GTPase regulator-interacting protein 1-like


297



MKS6 (allelic to JBTS9 and COACH syndrome)


612284


AR


4p15.3


CC2D2A


Coiled-coil and C2 domains-containing protein 2A


313



MKS7 (allelic to NPHP3 and SLSN3)


267010


AR


3q22


NPHP3


Nephrocystin-3


314



MKS8 (allelic to JBTS11)



AR


12q24.31


TCTN2


Tectonic 2


315


Bardet-Biedl Syndrome (BBS)ᵃ (Retinal Dystrophy, Obesity, Polydactyly, Mental Retardation, Genitourinary Malformations/Hypogonadism, Renal Abnormalities)


209900








BBS1


209901


AR


11q13


BBS1


BBS1


316



BBS2


606151


AR


16q21


BBS2


BBS2


317



BBS3


608845


AR


3p12-p13


ARL6 (BBS3)


ADP-ribosylation factor-like 6


318



BBS4


600374


AR


15q22.3-q23


BBS4


BBS4


319



BBS5


603650


AR


2q31


BBS5


BBS5


35



BBS6 (allelic to McKusick-Kaufman syndrome [MKKS])


604896


AR


20p12


MKKS


MKKS chaperonin protein


320,321



BBS7


607590


AR


4q27


BBS7


BBS7


322



BBS8


608132


AR


14q32.1


TTC8 (BBS8)


Tetratricopeptide repeat domain 8 (BBS8)


323



BBS9


607968


AR


7p14


PTHB1 (BBS9)


Parathyroid hormone-responsive B1(BBS)


324



BBS10


610148


AR


12q21.2


BBS10 (C12ORF58)


BBS10 (chromosome 12 open reading frame 58)


325



BBS11


602290


AR


9q31-q34.1


TRIM32 (BBS11)


Tripartite motif-containing protein 32 (BBS11)


326



BBS12


610683


AR


4q27


BBS12


BBS12


327



BBS13 (allelic to MKS1)


609883


AR


17q23


MKS1 (BBS13)


Meckel syndrome type 1 protein (BBS13)


328



BBS14 (allelic to JBTS5, SLSN6, and MKS4)


610142


AR


12q21.3


CEP290 (NPHP6)


Centrosomal protein, 290-KD (nephrocystin-6)


328



BBS15


613580


AR


2p15


C2ORF86 (BBS15)


Chromosome 2 open reading frame 86 (Fritz/BBS15)


329



BBS16 (allelic to SLSN7)


613524


AR


1q43-q44


SDCCAG8 (NPHP10)


Serologically defined colon cancer antigen 8


52


ᵃ Identifies diseases with probable additional, unmapped loci. OMIM, Online Mendelian Inheritance in Man; AR, autosomal recessive; AD, autosomal dominant; CGS, contiguous gene syndrome; COACH syndrome, cerebellar vermis hypo/aplasia, oligophrenia (mental retardation), ataxia, ocular coloboma, and hepatic fibrosis; ADP, adenosine diphosphate. Reviewed in reference 345.










TABLE 14.3 Genetic Causes of Hypertension

Disease


OMIM #


Mode of Inheritance


Chromosomal Localization


Gene Name(s)


Gene Product(s)


Reference(s)


Apparent Mineralocorticoid Excess (AME)









AME type 1


218030


AR


16q22


HSD11B2


11-β-hydroxysteroid dehydrogenase, type 2 isoform


330



AME type 2 (variant of type 1)


207765


AR


16q22


HSD11B2


11-β-hydroxysteroid dehydrogenase, type 2 isoform


331


Early Onset Hypertension, with Severe Exacerbation in Pregnancy


605115


AD


4q31.1


NR3C2


Nuclear receptor 3C2 (mineralocorticoid receptor)


107


Familial Hyperaldosteronism (FH)

FH type I (glucocorticoid-remediable aldosteronism [GRA])


103900


AD


8q21


CYP11B1-CYP11B2 fusion


Promoter of CYP11B1 (11-β-hydroxylase) controls expression of CYP11B2 (aldosterone synthase)


332








?




FH Type II


605635


AD


7p22


?




333



FH Type III


613677


AD


?


?



334


Pseudohypoaldosteronism Type II (PHA2) (Gordon Syndrome/Familial Hypertensive Hyperkalemia)


145260








PHA2A


145260


AD


1q31-q42


?


?


335



PHA2B


601844


AD


17q21-q22


WNK4


Protein kinase, lysine-deficient 4


336



PHA2C


605232


AD


12q13


WNK1


Protein kinase, lysine-deficient 1


336


Hypertension with Brachydactyly


112410


AD


12p12.2-p11.2


?


?


337,338


Liddle Syndrome (Pseudoaldosteronism)


177200








Type 1


600760


AD


16p13-p12


SCNN1B


β subunit of ENaC, the renal epithelial sodium channel


15








Type 2


600761


AD


16p13-p12


SCNN1G


γ subunit of ENaC


16


Progressive Nephropathy with Hypertension


161900


AD


1q21


?


?


339


OMIM, Online Mendelian Inheritance in Man; AR, autosomal recessive; AD, autosomal dominant.



Each of the prior approaches requires an in-depth understanding of the underlying pathobiology of the disease for its success. Unfortunately, we lack this information for most diseases. This necessitated the use of a strictly molecular genetic approach, termed positional cloning, which seeks to identify a disease gene solely on the basis of its chromosomal location.17

In some cases, important positional clues were provided by a cytogenetic analysis of affected individuals. Several of the loci involved in the origin of renal tumors were identified by this approach (Table 14.4). The gene responsible for the major form of Wilms tumor, WT1, was initially discovered because of its involvement in WAGR syndrome (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation).18,19,20 Individuals affected by this disorder were found to have constitutional deletions of 11p13 involving a zinc finger transcription factor (WT1), a paired box transcription factor (PAX6), and adjacent DNA sequences. Fine mapping proved that WT1 was responsible for the genetic susceptibility to Wilms tumor, whereas PAX6 was responsible for the aniridia phenotype.21 One of the familial forms of nonpapillary renal cell carcinoma (RCC) also was identified on the basis of its underlying chromosomal rearrangement. A translocation between chromosomes 3p and 8q (t[3;8][p14.2;q24.1]) was found to segregate with renal cell carcinoma.22 Nearly 2 decades later, investigators identified the genes disrupted by the translocation. They determined that the chromosomal rearrangement resulted in a novel gene that consisted of 5′ elements of a gene called FHIT (fragile histidine triad gene) fused to the coding sequence of TRC8.23 The protein product of TRC8 has high homology to the basal cell carcinoma/segment polarity gene product, Patched, a signaling receptor, suggesting that it may have a similar function. Since that time, 11 further chromosome 3 translocations have been associated with a susceptibility to RCC, including several that disrupt candidate tumor suppressor genes such as LSAMP and NORE1.24

Perhaps one of the most striking examples of the power of cytogenetic abnormalities to expedite gene discovery is that provided by the search for PKD1, the gene responsible for the most common form of ADPKD. A combination of molecular and genetic techniques had rapidly localized the gene to a 500 kilobase (kb) gene-rich segment, but the lack of known chromosome rearrangements or deletions coupled with the large number of potential candidate genes greatly complicated the search.25,26 Several years of mutation screening had failed to determine which one of the many candidates was in fact PKD1 when an astute clinician identified an unusual family that had individuals with classic ADPKD as well as a child with both tuberous sclerosis and renal cysts. Because it was known that a major form of tuberous sclerosis (TSC2) was located near the PKD1 gene,27 cytogenetic studies of the family were undertaken. This revealed two individuals in the family with balanced translocations between chromosomes 16 and 22 (t[16;22] [p13.3;q11.21]).28 The child with TSC2 had an unbalanced karyotype and was missing a portion of chromosome 22 as well as the telomeric portion of chromosome 16 (45XY/-16-22 + der[16][16qter-16p13.3::22q11.21-22qter]). It was correctly speculated that the TSC2 gene was located in the portion of chromosome 16 that was lost while PKD1 was likely to be the gene bisected by the translocation breakpoint. This was confirmed by additional studies and resulted in the identification of both TSC2 and PKD1.28,29

Although chromosomal rearrangements are extremely helpful when associated with disease, they are uncommon. Therefore, most gene searches began using linkage-based methods. Linkage analysis requires both well-characterized pedigrees and an array of genetic markers. Genetic markers are DNA variants that differ within the normal population in their length or sequence and can be used to trace inheritance of parental chromosomes within families. The principle underlying this approach is as follows: a genetic disease is assumed to be the clinical manifestation of a DNA mutation. Therefore, one can identify the location of the mutant gene by comparing the segregation of the disease phenotype with a battery of genetic markers. Alleles of loci on different chromosomes (a locus is the specific chromosomal address of a DNA segment) will appear to segregate randomly, whereas alleles of loci that are physically close on the same chromosome tend to be inherited together and are said to be linked. Meiotic recombination produces novel haplotypes by exchanging alleles between homologous chromosomes, and the frequency with which this occurs between two loci depends in part on the distance separating them. In other words, if the alleles of two genes are adjacent to each other on the same chromosome segment, recombination between them will be rare. Statistical programs are used to score the probability that an observed association has not happened by chance. The most common approach determines the ratio of the probability of the observed associations assuming linkage to that assuming no linkage. The LOD score is the decimal logarithm of this ratio and is considered significant when greater than 3.
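The LOD score described above can be illustrated with a minimal two-point calculation. This sketch assumes the simplest possible case, in which every meiosis is fully informative and phase-known, so that the likelihood at recombination fraction θ is θ^r(1 − θ)^(n − r) for r recombinants among n meioses. Real linkage programs compute likelihoods over whole pedigrees and do not rely on this shortcut; the function name and example counts are illustrative only.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**r * (1 - theta)**(n - r).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    nonrecombinants = meioses - recombinants
    if theta == 0.0:
        # Any observed recombinant rules out complete linkage.
        if recombinants > 0:
            return float("-inf")
        return meioses * math.log10(2)
    log_linked = (recombinants * math.log10(theta)
                  + nonrecombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# Ten informative meioses with no recombinants, evaluated at theta = 0.
print(round(lod_score(0, 10, 0.0), 2))
```

Under these assumptions, ten informative meioses without a single recombinant are just enough to cross the conventional threshold of 3, which helps explain why small families often cannot establish linkage on their own.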

Until recently, it was necessary to clone the chromosomal interval in question, identify its genes, and then screen them for mutations. These steps often took many years to complete. The human genome project revolutionized this process by providing a virtually complete, annotated sequence of the human genome in public databases and by producing dense genetic maps that included over a million single nucleotide polymorphisms (SNPs). SNP-chips (which query hundreds of thousands of SNPs in a single hybridization) have facilitated rapid genetic localization of disease loci, and the genomic maps have provided a comprehensive set of all potential candidate genes. Despite these advances, the task of identifying a disease-associated gene still remained a time-consuming endeavor prior to the widespread availability of high-throughput sequencing. The reason for this is that the resolution of genetic mapping is typically on the order of 500,000 to 1,000,000 bp. With an average gene density of one gene per 30,000 bp in gene-dense regions, a segment at the upper end of this range could harbor 30 to 40 genes. For diseases that are uncommon, the small number of family members available for testing often limited the resolution of genetic mapping to an area millions of base pairs in length.
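The candidate-gene burden implied by these figures is simple arithmetic, sketched below. The only inputs are the mapping resolution and gene density quoted in the text; the function name is illustrative, and real intervals vary widely in gene density.

```python
BP_PER_GENE = 30_000  # average spacing in gene-dense regions, from the text


def expected_gene_count(interval_bp, bp_per_gene=BP_PER_GENE):
    """Approximate number of genes in a mapped interval of the given length."""
    return round(interval_bp / bp_per_gene)


for interval_bp in (500_000, 1_000_000):
    print(f"{interval_bp:>9,} bp interval: about {expected_gene_count(interval_bp)} genes")
```

At this density, the quoted mapping resolution corresponds to roughly 17 to 33 candidate genes per interval, each of which had to be screened for mutations.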










TABLE 14.4 Renal Tumor Loci

Disease


OMIM #


Mode of Inheritance


Chromosomal Localization


Gene Name(s)


Gene Product(s)


Reference(s)


Wilms Tumorᵃ









Wilms tumor 1 (WT1)


194070


AD


11p13


WT1


Wilms tumor 1, zinc finger protein


18,19



Denys-Drash syndrome (diffuse mesangial sclerosis, pseudohermaphroditism, Wilms tumor)


194080


AD


11p13


WT1


Wilms tumor 1, zinc finger protein


253



WAGR syndrome (Wilms tumor, aniridia, genitourinary anomalies, mental retardation syndrome)


194072


AD


11p13


WT1, PAX6, nearby DNA sequences (CGS)


Wilms tumor 1, zinc finger protein and paired box gene 6, transcription factor


18,19,20



Wilms tumor 2 (WT2) (multiple tumor associated chromosome region 1, MTACR1)


194071


AD


11p15.5


H19 (imprinting defects)


H19 gene (untranslated, expressed exclusively from maternal allele)


340,341



Wilms tumor 3 (WT3)


194090


AD


16q


?


?


342



Wilms tumor 4 (WT4)


601363


AD


17q12-q21


?


?


343



Wilms tumor 5 (WT5)


601583


?


7p14-p13


POU6F2


POU domain, class 6, transcription factor 2


344


Von Hippel-Lindau Syndrome (VHL) (Clear cell renal carcinoma, hemangioblastomas, pheochromocytomas, renal and pancreatic cysts)


193300


AD


3p26-p25


VHL


VHL protein


81,82,83


Renal Cell Carcinoma (RCC), Nonpapillaryᵃ


144700








Familial RCC associated with chromosome 3 translocationsᵇ



AD


t(3;8)(p14.2; q24.1)


FHIT/TRC8 gene fusion


Disruption of TRC8 protein (similar to patched protein), by fusion with fragile histidine triad gene


22,23


Renal Cell Carcinoma, Papillary (RCCP)ᵃ


605074








RCCP1 (papillary renal cell carcinoma [PRCC] translocation-associated gene)


179755


AD


t(X;1)(p11.2; q21.2)


PRCC/TFE3 gene fusion


PRCC/TFE3 helix-loop-helix transcription factor


346,347



Hereditary papillary renal cancer (HPRC)


164860


AD


7q31


MET


MET proto-oncogene


30



Hereditary leiomyomatosis and renal cancer (HLRC)


605839


AD


1q42.1


FH


Fumarate hydratase


348


Birt-Hogg-Dube Syndrome (BHD) (Cutaneous fibrofolliculomas, lung cysts, RCC)


135150


AD


17p11.2


FLCN


Folliculin


349


Tuberous Sclerosis (Renal angiomyolipoma, RCC)









TSC1


191100


AD


9q34


TSC1


Hamartin


350



TSC2


613254


AD


16p13.3


TSC2


Tuberin


28


ᵃ Identifies diseases with probable additional, unmapped loci.

ᵇ Eleven further chromosome 3 translocations have been associated with RCC susceptibility, involving various genes including LSAMP, NORE1, and FBXW7. Reviewed in reference 345.


OMIM, Online Mendelian Inheritance in Man; AR, autosomal recessive; AD, autosomal dominant.



Several shortcuts were used to minimize the level of effort. The first strategy was to use publicly available databases to determine the identity and likely function of genes within one’s target region. One could use a modest understanding of the pathobiology of a disease to narrow the field of candidates to a set whose functions were consistent with the underlying defect. For example, linkage studies revealed that one of the loci responsible for hereditary papillary renal carcinoma (HPRC) mapped to an interval on 7q31 that included the MET proto-oncogene. The well-established relationship between MET and other carcinomas prompted the investigators to focus their search on this gene, leading to the rapid discovery of pathogenic mutations.30 In another example, Dent disease was known to be an X-linked disorder often associated with microdeletions of Xp11.22.31 Fisher et al.32 initiated their search for the gene responsible for Dent disease by screening for expressed sequences that were encoded by the deleted segment. They identified a novel chloride channel family member that was deleted in many patients with Dent disease, that had a restricted pattern of expression, and whose function was consistent with the pathophysiology of the disease. They subsequently showed that mutations of this gene, CLCN5, were responsible for the disease.33 In the case of ADPKD, investigators seeking the identity of PKD2 determined that one of the candidate genes in their genetic interval had homology to PKD1, which encodes polycystin-1. This gene was an obvious candidate for PKD2, and mutation analysis quickly confirmed this suspicion.34

One of the most interesting examples of this approach was its use to identify a novel locus for Bardet-Biedl syndrome (BBS). Investigators had determined that this rare, autosomal recessive disorder, which is characterized by obesity, mental retardation, anosmia, fibrocystic renal disease, congenital hepatic fibrosis, and left/right axis defects, was genetically heterogeneous. A number of loci were known, and investigators had found that their respective gene products were localized to the basal body, the primary cilium, or both. The overlap in clinical features between BBS and other diseases that result from dysfunction of ciliary proteins gave rise to the idea that candidate genes for BBS and other similar diseases could be identified based on this property. Scientists compared the complete genomic sequences of multiple species that have cilia with those of species that do not and identified a set of genes present exclusively in organisms with cilia and basal bodies. Two of these genes mapped to a previously defined interval for BBS5 that contained 230 predicted genes. Molecular testing identified mutations in one of the two genes in BBS5 families.35 The locus encodes a novel protein of unknown function that would otherwise have been a low-priority candidate for further study.
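The comparative genomic filter described above amounts to a set operation: keep gene families found in every ciliated species, discard any that also occur in a non-ciliated species, and then intersect the result with the genes in the linkage interval. The sketch below uses made-up species and gene-family labels purely for illustration; it is not the published pipeline, which also had to establish which genes are orthologous across species.

```python
def cilia_specific_families(ciliated, nonciliated):
    """Gene families present in every ciliated species and in no non-ciliated one."""
    in_all_ciliated = set.intersection(*ciliated.values())
    in_any_nonciliated = set().union(*nonciliated.values())
    return in_all_ciliated - in_any_nonciliated


# Hypothetical toy data: species names and family labels are placeholders.
ciliated = {
    "species_A": {"fam1", "fam2", "fam3", "fam5"},
    "species_B": {"fam1", "fam2", "fam5", "fam6"},
}
nonciliated = {
    "species_C": {"fam2", "fam4"},
}
linkage_interval = {"fam3", "fam5", "fam7"}  # families mapping to the disease interval

candidates = cilia_specific_families(ciliated, nonciliated) & linkage_interval
print(sorted(candidates))
```

The value of the filter lies in the final intersection: an interval containing hundreds of predicted genes collapses to a handful of candidates once the ciliary criterion is applied.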

A second approach that had been used to speed the pace of gene discovery was to search for disease-associated microscopic chromosomal abnormalities that were below the level of resolution of standard cytogenetic analyses. The genes responsible for the most common form of nephronophthisis (NPHP1) and for cystinosis (CTNS) were identified in this manner. In the case of NPHP1, large-scale rearrangements were detected in 80% of the patients belonging to inbred or multiplex NPHP1 families and in 65% of the sporadic cases.36 Most of the time, large homozygous deletions of approximately 250 kb involving a 100-kb inverted duplication were discovered to disrupt the gene. In a small number of individuals, oligo-base pair mutations were identified, proving that NPHP1 was the specific gene responsible for the disorder.37,38 CTNS was identified in a similar manner. Investigators found that one of the genetic markers used in their study was homozygously deleted in 23 out of 70 patients.39 They quickly focused their search on the minimal deleted region and identified a ubiquitously expressed transcript that was disrupted in all patients with deletions involving this segment. They subsequently found single or oligo-base pair mutations in many of the remaining patients, thus proving that this gene and not one of its neighbors was in fact responsible for the disease.

A third strategy had been to determine the expression pattern of the various candidates and see if any were consistent with the clinical features of the disorder. Fuchshuber and colleagues40 had localized a form of steroid-resistant idiopathic nephrotic syndrome (NPHS2) to a 2.5 million-base pair interval on chromosome 1. They had identified multiple putative candidates but focused their search on one whose expression by Northern blot was detectable only in fetal and adult renal tissues. They subsequently discovered recessive, inactivating mutations in NPHS2, and further showed that its expression was restricted to glomerular podocytes.41 In a similar manner, the kidney-restricted pattern of expression of PCLN1 helped to identify it as a probable candidate gene for primary hypomagnesemia.42

In a number of diseases, gene discovery resulted more from good luck than from the pursuit of a particular strategy. The ability to manipulate the murine genome through gene targeting (described in more detail later) has allowed investigators to generate a lengthy list of murine models of human diseases. In most cases, scientists first identified the disease gene and then created a mutant phenotype in the mouse with the intention of modeling the human disease state (see the subsequent text). In some cases, however, the genes targeted for study had not been previously implicated in a genetic disorder; rather, they had been selected for study because the investigators had a fundamental interest in their biologic properties. Careful analysis of the murine phenotypes revealed surprising similarity to human diseases, leading investigators to test for mutations in their human homologues.

It was in this way that LMX1B, which encodes LIM homeobox transcription factor 1β, was found mutated in Nail-Patella syndrome (NPS). Investigators with an interest in basic developmental processes had targeted this gene for inactivation and discovered limb and kidney defects in Lmx1b mutant mice that were remarkably similar to those observed in human NPS.43 They quickly identified three independent NPS patients with de novo heterozygous mutations of LMX1B.44 The identification of CD2AP (CD2-associated protein) as a cause of steroid-resistant nephrotic syndrome is another example.45,46 CD2AP was thought to be an adapter protein critical for stabilizing contacts between T cells and antigen presenting cells. Mice that were null for Cd2ap had compromised immune function but died unexpectedly at 6 to 7 weeks from renal failure. The investigators showed that homozygotes developed proteinuria associated with defects in epithelial cell foot processes and eventual glomerulosclerosis. CD2AP was found expressed in podocytes where it associates with nephrin, the primary component of the slit diaphragm. Subsequent studies in humans discovered CD2AP mutations in patients with focal segmental glomerulosclerosis.47,48,49

It is likely that additional fortuitous relationships will be established as the list of murine genes that are inactivated by gene targeting becomes more complete. The National Institutes of Health (NIH)-sponsored Knockout Mouse Project (http://www.komp.org/), in collaboration with the International Knockout Mouse Consortium (http://www.knockoutmouse.org/), is aiming to mutate all protein-coding genes in the mouse and then perform broad, standardized phenotyping on a large subset (https://commonfund.nih.gov/KOMP2/overview.aspx). These studies will likely identify additional, unsuspected candidate genes for human disorders.

Unfortunately, these approaches could not be successfully used for most genetic diseases. In these cases, one had to resort to the use of sequence-based strategies to identify the disease gene. As the reader can understand from the previous discussion, this was often a tedious, time-consuming, and expensive process. For rare diseases where the genetic interval defined by genetic linkage was on the order of millions of base pairs, gene discovery was stalled.

Breakthroughs in DNA sequencing technology have revolutionized this process. As indicated in the introduction, there has been an explosion of new high-throughput methods for determining DNA sequence. A comprehensive review of the subject is beyond the scope of this chapter and surely would be outdated by the time this volume is published. Common to all of the methods is the use of massively parallel systems that determine 10⁶ to 10⁸ sequence reads of ~30 to 400 bp per read in a single experiment. This is in sharp contrast to standard Sanger sequencing machines that maximally determine up to 384 sequence reads of 600 to 1,000 bp per read. Although the new methods, commonly called next generation sequencing, dramatically increase throughput, they also generally have higher error rates and require much greater levels of redundancy to maximize accuracy. Depths of coverage are routinely greater than 20×, and even with this degree of redundancy, gaps and sequence errors can occur. Most laboratories confirm variants of interest with direct Sanger sequencing because of its greater reliability.
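The throughput figures quoted above can be turned into a rough back-of-the-envelope comparison. The sketch below assumes that mean depth of coverage is simply total sequenced bases divided by target size, which ignores uneven coverage, duplicate reads, and unmappable sequence. The haploid genome size of 3 × 10⁹ bp and the 100-bp read length in the depth example are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def bases_per_run(reads, read_length_bp):
    """Total bases produced in one run (reads x read length)."""
    return reads * read_length_bp

def mean_depth(reads, read_length_bp, target_size_bp):
    """Idealized mean depth of coverage, assuming reads are spread uniformly."""
    return bases_per_run(reads, read_length_bp) / target_size_bp

def reads_needed(depth, read_length_bp, target_size_bp):
    """Number of reads required to reach a given idealized mean depth."""
    return math.ceil(depth * target_size_bp / read_length_bp)

# Upper bounds taken from the figures quoted in the text.
sanger_max = bases_per_run(384, 1_000)    # Sanger: 384 reads x 1,000 bp
ngs_min = bases_per_run(10**6, 30)        # low end of the NGS range
ngs_max = bases_per_run(10**8, 400)       # high end of the NGS range

# Assumed values for illustration: 3e9-bp haploid genome, 100-bp reads.
HAPLOID_GENOME_BP = 3 * 10**9
depth_example = mean_depth(10**8, 100, HAPLOID_GENOME_BP)
reads_for_20x = reads_needed(20, 100, HAPLOID_GENOME_BP)
```

Under these assumptions, even the low end of the next generation range yields far more sequence per run than the Sanger maximum, yet a single run of 10⁸ reads of 100 bp covers a whole genome only a few times over, which is why 20× depth requires several such runs or a smaller target.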

There presently are three different approaches for using next generation sequencing for disease gene discovery. In situations where linkage studies have already localized a gene to a chromosome region, investigators use a variety of methods to “capture” the genomic interval and then subject it to next generation sequencing. When linkage is either not possible or when the investigator prefers to use a generally applicable method, he or she pursues either whole genome or whole exome sequencing. As the respective names imply, whole genome sequencing (WGS) determines the sequence of the entire genome, whereas whole exome sequencing (WES, also known as targeted exome capture) restricts its analysis to the DNA sequence of exonic sequences and flanking intron/exon boundaries for all genes. Although WGS is the most comprehensive approach because it includes all regulatory and intronic sequences, its cost is still limiting and the sheer quantity of data it produces presents significant bioinformatic challenges. WES is currently the preferred option because the exome is less than 5% of the size of the entire genome but is estimated to harbor over 85% of all disease-causing mutations.

WES has been successfully used to identify a rapidly growing list of disease genes. In the renal community, this approach was recently used to identify MYO1E (myosin 1E, a podocyte cytoskeletal protein) and NEIL1 (endonuclease VIII-like 1, a base-excision DNA repair enzyme) as new candidate genes for human autosomal recessive steroid-resistant nephrotic syndrome.50 In another example, this approach helped to identify NPHP1 mutations as the cause of disease in two families where consanguinity mapping localized the gene to an interval of almost 30 × 10⁶ bp, far too large to tackle with conventional Sanger methods.51 Using a variation of this approach, Otto et al.52 identified mutations in SDCCAG8 (serologically defined colon cancer antigen 8) as the cause of a nephronophthisis-related ciliopathy (Fig. 14.1). As the cost drops even further and the technique becomes universally accessible, there is little question that WES is destined to accelerate gene discovery for hundreds of Mendelian disorders.53
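Combining map position with exome data, as in the examples above, can be pictured as a simple filter over the variant list. The sketch below is one plausible way to express it, assuming a recessive disorder in a consanguineous family: keep variants that are homozygous, lie inside a mapped candidate interval, and are rare in unaffected controls. The Variant record, the 1% frequency cutoff, and all example values are hypothetical and are not taken from the cited studies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    gene: str
    homozygous: bool
    control_freq: float  # allele frequency in unaffected controls (hypothetical field)

def in_any_interval(variant, intervals):
    """True if the variant lies inside one of the (chrom, start, end) intervals."""
    return any(
        chrom == variant.chrom and start <= variant.pos <= end
        for chrom, start, end in intervals
    )

def prioritize(variants, intervals, max_control_freq=0.01):
    """Keep homozygous, rare variants inside the mapped candidate intervals.
    The 1% default cutoff is an assumption chosen for illustration."""
    return [
        v
        for v in variants
        if v.homozygous
        and v.control_freq <= max_control_freq
        and in_any_interval(v, intervals)
    ]

# Hypothetical example data.
intervals = [("chr1", 1_000_000, 5_000_000)]
variants = [
    Variant("chr1", 2_000_000, "GENE_X", True, 0.0),   # passes all filters
    Variant("chr1", 3_000_000, "GENE_Y", False, 0.0),  # heterozygous
    Variant("chr1", 4_000_000, "GENE_Z", True, 0.20),  # common in controls
    Variant("chr3", 2_000_000, "GENE_W", True, 0.0),   # outside the interval
]
shortlist = prioritize(variants, intervals)
```

Any variant that survives such a filter would still be confirmed by Sanger sequencing and tested for segregation in the family before being called causative.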

In summary, the combination of multiple avenues of biologic data with genetic map position has proved to be a powerful strategy for finding disease genes. Hence, we have witnessed a remarkable increase in the number of inherited diseases the genetic bases of which have been elucidated.






FIGURE 14.1 Homozygosity mapping, exon capture, and massively parallel sequencing identifies SDCCAG8 mutations as causing nephronophthisis with retinal degeneration. A: Nonparametric log of odds (NPL) scores across the human genome in two siblings with nephronophthisis and retinal degeneration of consanguineous family SS23/A1365. The x-axis shows Affymetrix 250K StyI array single nucleotide polymorphism (SNP) positions on human chromosomes concatenated from p-ter (left) to q-ter (right). Genetic distance is given in centimorgans. Four maximum NPL peaks (red circles) indicate candidate regions of homozygosity by descent. B: Exon capture of 828 ciliopathy candidate genes with consecutive massively parallel sequencing and sequence evaluation within the four mapped homozygous candidate regions (red circles in image A) identifies mutation of SDCCAG8 in SS23/A1365. C: The SDCCAG8 gene extends over 244 kb and contains 18 exons (vertical hatches). D: Exon structure of human SDCCAG8 cDNA. Positions of start codon (ATG) and of stop codon (TGA) are indicated. For mutations detected, arrows indicate positions relative to exons and protein domains. E: Domain structure of the SDCCAG8 protein. NGD, N-terminal globular domain; NLS, nuclear localization domain; CC, coiled-coil domains; Gln_rich, glutamine-rich region. PN and PC denote peptides used for antibody generation. F: Eight homozygous SDCCAG8 mutations detected in eight families with nephronophthisis and retinal degeneration. Family number, mutation, and predicted translational changes are indicated. A homozygous deletion covering exons 5 through 7 is demonstrated by agarose gel electrophoresis (see gray bar in image E). Sequence traces for mutations are shown above those of normal controls. Mutated nucleotides are indicated by arrowheads in traces of normal controls. (From Otto EA, Hurd TW, Airik R, et al. Candidate exome capture identifies mutation of SDCCAG8 as the cause of a retinal-renal ciliopathy. Nat Genet. 2010 Oct;42(10):840-850. Reprinted by permission from Macmillan Publishers Ltd.) (See Color Plate.)



POSTCLONING PHASE OF GENE DISCOVERY

Using Databases

Identification of a gene is often when the work of defining the biology of a disease begins. This is especially true in diseases such as ADPKD or tuberous sclerosis where a complex phenotype exists and a unifying biochemical defect is not apparent. The first step is usually to perform a series of database analyses, looking for sequence similarities, functional motifs, and structural features that might be used to generate a variety of testable hypotheses regarding a gene’s function. In some cases, one finds that a segment of one’s query has a high degree of similarity to a family of proteins of known function. In primary hypomagnesemia, the protein encoded by PCLN1 was found to have sequence and structural similarity to members of the claudin family.42 All other members of this family localize to tight junctions and appear to bridge the intercellular space by homo- or heterotypic interactions, suggesting a similar function for PCLN1. This result was particularly intriguing in that renal magnesium ion (Mg²⁺) resorption occurs predominantly through a paracellular conductance in the thick ascending limb of Henle (TAL).

In other situations, one may identify homologous genes in other organisms, vertebrate or invertebrate, that have already been studied and for which a function may be known. The potential power of this strategy to help expedite the study of disease genes is highlighted by the fact that over 60% of human disease genes have a homologue in the common fruit fly, Drosophila melanogaster, and a surprisingly high fraction is even conserved in yeasts.54,55,56,57
