Fig. 1.1
HBV DNA structure and genome organization. The inner circle represents the virion rcDNA, and the dashes represent the region of the (+) strand DNA that is yet to be synthesized. The 5′ ends of the (−) and (+) strands are indicated. The small, filled circle represents the P protein covalently attached to the 5′ end of the (−) strand, and the short wavy line, the capped RNA oligomer attached to the 5′ end of the (+) strand. The vertical bars on rcDNA denote the direct repeats 1 and 2 (DR1 and DR2). The short terminal redundancy (r) on the (−) strand is denoted by the flap attached to the P protein (for clarity, it is not labeled in the figure). The promoter and enhancer positions are indicated. The middle circle of shaded boxes represents the four open reading frames (ORFs) corresponding to the precore/core, X, polymerase, and surface proteins, with their C-terminal ends denoted by the arrows. The outer circle of wavy lines represents the viral RNAs. Filled squares at one end of the mRNAs denote heterogeneous 5′ ends (PreC/C, PreS2/S, and X mRNAs); the thin vertical line represents the precise 5′ end of the PreS1 mRNA. The arrows at the other end of the lines denote the 3′ ends of the mRNAs with polyA tails (AAA). BCP basal core promoter, Pol polymerase, PolyA polyadenylation
Four distinct classes of viral mRNAs, all 5′ capped and 3′ polyadenylated, are encoded by the viral DNA. The genomic PreC/C mRNA is in fact longer than the DNA template (i.e., overlength), being 3.5 kb in length. The subgenomic PreS1, PreS2/S, and X mRNAs are approximately 2.4 kb, 2.1 kb, and 0.7 kb long, respectively. All viral mRNAs share the same 3′ sequences as represented by the shortest X mRNA, since they all terminate at the single polyadenylation signal. Transcription of these four groups of mRNAs is driven by four different viral promoters, respectively, the core, PreS1, PreS2/S, and X promoters that are further regulated by two viral enhancers, enhancer I upstream and overlapping with the X promoter and enhancer II located upstream of the core promoter (Fig. 1.1). A total of seven viral proteins are produced from these mRNAs using four open reading frames (ORF) (Fig. 1.1). The PreC/C mRNAs encode the viral core or capsid (C) protein and the slightly longer PreC protein using the same ORF, and the P protein using an alternative reading frame. As will be detailed below, the shortest of these genomic RNAs also serves as the template for reverse transcription to reproduce rcDNA during replication and is thus termed pregenomic RNA or pgRNA. The PreS1, PreS2/S, and X mRNAs encode, respectively, the large (L) envelope protein, and the middle (M) and small (S) envelope proteins, and the X protein. All three envelope (or surface) proteins are encoded within a single ORF, which is entirely embedded within the alternative P ORF (Fig. 1.1). The P gene also overlaps with the 3′ end of the C gene and 5′ end of the X gene at its 5′ and 3′ ends respectively. In addition, all transcriptional regulatory elements including the promoters, enhancers, and the polyadenylation signal overlap with the protein-coding sequences. The genomic organization of hepadnaviruses is thus characterized by extreme economy.
Structure and Functions of Viral Proteins
The Envelope Proteins
Of the three HBV envelope proteins, the smallest, S, is 226 residues long and is the most abundant. M contains an N-terminal extension, relative to S, called the PreS2 region, which is 55 residues long (Fig. 1.1). L is the longest and contains yet another N-terminal extension called the PreS1 region, which is 108 (or 119, depending on the strain) residues long. In addition to being major constituents of the virions, the envelope proteins are also secreted in large excess into the bloodstream of infected people as spheres and filaments in the absence of capsids or genome, as mentioned above. Indeed, it was the abundance of these particles that allowed the discovery of HBV as the Australia antigen, i.e., hepatitis B surface antigen (HBsAg). The spheres contain mostly S and some M, and the filaments have in addition some L, which is enriched in virion particles [18]. Both L and S are required for virion secretion but M is dispensable [19]. In particular, the PreS1 region in L contains determinants required both for capsid envelopment during virion formation and for receptor binding during entry (see below) [20]. This dual role of PreS1 is facilitated by its dynamic dual topology in the virions [21, 22]. Immediately following translation in the endoplasmic reticulum (ER) membrane, all PreS1 is located on the cytosolic side, allowing it to interact with the capsids to fulfill its role in virion formation; as the virions traffic through the cellular secretory pathway, ca. 50% of PreS1 is translocated from the interior of the virions to the exterior to allow it to bind the cell surface receptor. How this dramatic gymnastic feat is accomplished remains an enigma.
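The nested arrangement of the three envelope proteins can be made concrete with a little arithmetic. The following is a minimal sketch in Python, assuming that the lengths of M and L follow simply by adding the PreS2 and PreS1 extensions quoted above to the length of S; the function and constant names are hypothetical and serve only as illustration.

```python
# Lengths (in residues) quoted in the text above.
S_LENGTH = 226               # small (S) envelope protein
PRES2_LENGTH = 55            # N-terminal extension that distinguishes M from S
PRES1_LENGTHS = (108, 119)   # strain-dependent extension that distinguishes L from M


def envelope_lengths(pres1_length):
    """Return the lengths of S, M, and L for a given PreS1 length.

    Assumption: M = PreS2 + S and L = PreS1 + PreS2 + S, i.e., the
    extensions are simply added to the shared S sequence.
    """
    m_length = PRES2_LENGTH + S_LENGTH
    l_length = pres1_length + m_length
    return {"S": S_LENGTH, "M": m_length, "L": l_length}


for pres1 in PRES1_LENGTHS:
    print(pres1, envelope_lengths(pres1))
```

Under this assumption, M is 281 residues long and L is 389 or 400 residues long, depending on the strain.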
The C Protein and e Antigen
The C protein is 183 (or 185 depending on the strains) residues long. C can be divided into two structural and functional domains. The N-terminal 140 residues form the assembly domain (NTD) that is sufficient to mediate capsid assembly [23, 24]. The C-terminal domain (CTD) is dispensable for capsid assembly but plays essential roles in packaging of pgRNA into replication-competent nucleocapsids (NCs) and in reverse transcription of pgRNA to rcDNA. The C protein rapidly forms dimers, which are the building blocks for capsid assembly. Two morphological capsid isomers, with either 120 (T = 4, the major isomer) or 90 dimers (T = 3), are formed [25, 26]. The functional significance, if any, of this dichotomy is unknown. The arginine-rich CTD is highly basic and has nonspecific nucleic acid binding activity [27]. It also harbors multiple nuclear localization signals (NLSs) [28–30] that may be important for delivery of NCs to the nucleus (see section “Intracellular Trafficking and Uncoating” below).
Moreover, CTD is heavily phosphorylated when expressed in mammalian cells, with three major sites of phosphorylation all displaying the Ser-Pro motifs [28, 31] plus three to four additional minor sites of phosphorylation [32]. As will be described below, CTD phosphorylation plays critical roles for C functions in viral replication. As HBV does not encode any viral kinase, it has to usurp host protein kinases for C phosphorylation. A number of cellular kinases, including protein kinase C (PKC) [33], cyclin-dependent protein kinase 2 (CDK2) [34], and serine-arginine protein kinase (SRPK) [35], have been reported to phosphorylate C or specifically its CTD. Among these, CDK2 has been shown to associate with and phosphorylate the CTD, in particular, its Ser-Pro sites (consistent with the known substrate specificity of CDK2 as a well-known proline-directed kinase), and is incorporated into the capsids (see below) [34, 36]. As will be detailed below, CTD phosphorylation is highly dynamic and a dramatic dephosphorylation of CTD is shown to accompany viral DNA synthesis in the DHBV NCs. The cellular phosphatase(s) responsible for C dephosphorylation remains to be identified.
The precore (PreC) protein is translated from its own mRNA (PreC mRNA), which differs from the C mRNA (pgRNA) by a 5′ extension some 30 nt long. The sequence of PreC is thus essentially the same as C, except for an additional 29 amino acids at its N-terminus [37, 38]. However, these two proteins are functionally very different; unlike C, PreC is entirely dispensable for viral replication and mutants unable to express this protein are frequently selected late during persistent infection [38]. The first 19 residues of PreC comprise a secretion signal that induces its translocation into the lumen of the ER, where the signal sequence is cleaved off by a host cell signal peptidase. The remainder of PreC undergoes further proteolytic processing (e.g., by furin) in the host cell secretory pathway to remove the highly basic CTD in C, resulting in the secretion of a heterogeneous population of soluble, dimeric proteins [39, 40], defined serologically as the hepatitis B e antigen (HBeAg) (Fig. 1.2, 9c) [41]. While dispensable for viral replication, PreC/HBeAg appears to play an important role in vivo for establishing persistent infection by regulating host immune response against the related and highly immunogenic C protein [42]. Also, serum HBeAg has proven to be a useful marker to monitor viral replication as its presence tends to correlate with high levels of viral replication and its loss usually signifies a decrease in viral replication [38].
Fig. 1.2
HBV life cycle. The replication cycle of HBV is depicted schematically. (1) Virus binding and entry into the host cell (large rectangle). (2) Intracellular trafficking and delivery of rcDNA to the nucleus (large circle). (3) Repair of rcDNA to form cccDNA, or integration of dslDNA into host DNA (3a). (4 and 4a) Transcription to synthesize viral RNAs. (5) Translation to synthesize viral proteins. (6) Assembly of the pgRNA-containing NC, or alternatively, empty capsids (6a). (7) Reverse transcription to make the (−) strand DNA and then rcDNA. (8) Nuclear recycling of progeny rcDNA. (9) Envelopment of the rcDNA-containing NC and secretion of complete virions, or alternatively, secretion of empty virions (9b) or HBsAg spheres and filaments (9a). Processing of the PreC protein and secretion of HBeAg are depicted in (9c). The different viral particles outside the cell are depicted schematically with their approximate titers indicated: the complete or empty virions as large circles (outer envelope) with an inner diamond shell (capsid), with or without rcDNA inside the capsid respectively; HBsAg spheres and filaments as small circles and cylinders. Intracellular capsids are depicted as diamonds, with either SS [(−) strand] DNA (straight line), viral pgRNA (wavy line), or empty, and the letters “P” denoting phosphorylated residues on the immature NCs (containing SS DNA or pgRNA) or empty capsid. The dashed lines of the diamond in the rcDNA-containing mature NCs signify the destabilization of the mature NC, which is also dephosphorylated. The soluble, dimeric HBeAg is depicted as grey double bars. The dashed line and arrow denote the fact that HBeAg is not always secreted during viral replication. The wavy lines denote the viral RNAs: C, mRNA for the C and P protein (and pgRNA); S and LS, mRNAs for the S/M and L envelope proteins, respectively; PreC, mRNA for the PreC protein and, following processing, HBeAg. Boxed letters denote the viral proteins translated from the RNAs. The filled circle on rcDNA denotes the P protein attached to the 5′ end of the (−) strand (outer circle) of rcDNA and the arrow denotes the 3′ end of the (+) strand (inner circle) of rcDNA. ccc cccDNA, dsl double stranded linear DNA, HNF hepatocyte nuclear factor, HSP heat shock protein, PPase phosphatase, rc rcDNA, TF transcription factor. For simplicity, the synthesis of dslDNA (the minor genomic DNA form) in the mature NC, its secretion in virions, and infection of dslDNA-containing virions are not depicted here, nor are the functions of X. See text for details
The Reverse Transcriptase
The HBV RT or P protein is a multifunctional protein that plays a central role in viral replication. P is 832 (or 845, depending on the strain) residues long and can be divided into four separate domains, from the N-terminus: TP, the spacer, the RT domain, and the RNase H domain [43–47]. TP harbors the invariant Tyr residue essential for priming reverse transcription [48–50] (see section “Reverse Transcription and NC Maturation”) and, together with the RT domain, is required for specific binding of pgRNA for its encapsidation into NCs. pgRNA packaging also requires the RNase H domain but none of the known enzymatic activities of P [51, 52]. The spacer region is the least conserved of the four domains and is dispensable for all known functions of P. However, its coding sequences have to be retained to encode the PreS1 region in the overlapping S ORF, which is essential for the virus as discussed above. The RT domain harbors the polymerase active site essential for DNA polymerization [44, 45], particularly the Tyr-Met-Asp-Asp motif conserved across all RT proteins including those in retroviruses and retrotransposons. The RNase H domain is responsible for degrading the pgRNA template during (−) strand DNA synthesis [44, 45, 53].
The HBV DNA polymerase activity was discovered early on, before it was realized that HBV replicates via reverse transcription, via the so-called endogenous polymerase assay [54] whereby a DNA polymerase activity in the virions was shown to carry out DNA synthesis using the endogenous virion DNA as a template. It was only a decade later that Summers and Mason made the landmark discovery that a DNA virus (i.e., DHBV) replicates through reverse transcription of an RNA intermediate [6]. However, biochemical studies on this important enzyme have proven difficult to date, and no high-resolution structures of P are yet available. As will be detailed below (section “NC Assembly”), the discovery that P requires specific host factors for its folding and functions provides at least a partial explanation to this difficulty.
The X Protein
The 154-residue-long hepatitis B X protein (HBx) is the smallest HBV protein but is arguably the least understood. There is general agreement that X is required for viral replication in vivo [55] and perhaps contributes to viral pathogenesis (for a recent review, see ref. [56]). Numerous reports have suggested a large number of functions for X in the regulation of viral and host gene expression [57], DNA damage repair [58], Ca2+ signaling [59], cell cycle [60], apoptosis [61], and autophagy [62, 63]. As there is no evidence that X has DNA binding activity, it is thought to affect gene expression through host protein interactions, which are probably also important for the various other viral or cellular effects attributed to X. It remains to be clarified how the various activities attributed to X are related to each other, and how they in turn relate to viral replication and/or pathogenesis (esp., hepatocarcinogenesis). As these activities have usually been uncovered using different systems and assays, which are often less than physiologically optimal due to experimental limitations, and as the role of X in viral replication or pathogenesis is likely regulatory and indirect, interpreting the results, which can be in apparent conflict, is by no means straightforward. Recent attempts at standardization of experimental systems and assays and the development of more physiologically relevant systems will hopefully help clarify the functions of HBx in viral replication and pathogenesis [64].
Viral Life Cycle
As a para-retrovirus, HBV has a life cycle (Fig. 1.2) that shares a number of similarities with that of conventional retroviruses, including, of course, the central role of reverse transcription. However, HBV and hepadnaviruses in general have a rather unique replication strategy, which differs from that of conventional retroviruses in a number of important aspects, including the initiation of reverse transcription and NC assembly, genome maintenance, and virion morphogenesis.
Entry
The strong species and tissue tropism of hepadnaviruses are in part underpinned by viral entry. Until recently, the only cells in culture that are reported to support HBV infection reproducibly are primary hepatocytes from humans [65] and the small primate tupaia [66], and one human hepatoma cell line HepaRG, which requires differentiation in vitro for even the rather low infection efficiency achieved [67]. Very recently, hepatocyte-like cells differentiated from induced human pluripotent stem cells [68], and a newly established human hepatoma cell line HLCZ01 [69], are reported to support limited HBV infection.
After many false starts, the primary entry receptor for HBV was finally identified in 2012 as a hepatic bile acid transporter, sodium taurocholate cotransporting polypeptide (NTCP) (Fig. 1.2, step 1) [70]. This breakthrough allows the establishment of convenient hepatoma cell lines such as HepG2 and Huh7, which have been the mainstay for studying other aspects of HBV replication and can now support infectious entry via NTCP reconstitution. On the other hand, NTCP is insufficient to render mouse hepatocytes susceptible to HBV infection [71, 72]. This result, though disappointing, is not unexpected given previous observations that another essential, intracellular, stage in the viral life cycle, the formation of the covalently closed circular DNA (cccDNA), is also defective in mouse hepatocytes (see section “Nuclear Recycling of rcDNA and Amplification of cccDNA”). It is clear that additional host factors are required for the early stages of the viral life cycle beyond cell attachment and entry.
The viral requirements for entry are much better defined. Specifically, the N-terminal 48 residues of L as well as its N-terminal myristylation are both required for NTCP binding and infection [20, 70, 73, 74]. In addition, a region in S within the conserved “a” determinant (the major antigenic loop) is also required for infection via binding to the cell surface heparan sulfate proteoglycans and mediating the initial (nonspecific) viral attachment to the cells [75, 76]. This observation helps explain the high conservation of this antigenic determinant among HBV genotypes/strains. A role for glycosylation of the envelope proteins has also been reported recently [77]. Intriguingly, HBV binding to NTCP, in addition to mediating viral infection, may also inhibit the transport function of NTCP and alter cellular gene expression [78], raising the possibility that this initial virus-host interplay may contribute to viral pathogenesis.
Intracellular Trafficking and Uncoating
The next essential step in the HBV life cycle after entry into susceptible host cells is to deliver the virion rcDNA into the nucleus (Fig. 1.2, step 2). Relatively little is understood here due to the lack of convenient and efficient infection systems until very recently. It is thought once the viral envelope and cellular membrane fuse to release the internal NC, the latter traffics towards the nuclear membrane, through interactions with cellular importins mediated by NLSs located on the C CTD [36, 79, 80]. As HBV can efficiently infect nondividing hepatocytes in the liver and NC is too large to pass through the nuclear pore complex (NPC), it has been proposed that NC interacts with components of the NPC leading to the arrest of NC and release of its rcDNA content into the nucleus [81] for cccDNA formation (see next).
cccDNA Formation
cccDNA is the first new viral DNA species detected upon infection [82] and is essential to initiate and sustain viral replication, as it is the only viral transcriptional template that can direct the expression of all viral RNAs and proteins. As with NC trafficking/uncoating, little is currently understood about this critical stage of HBV infection due to the lack of convenient experimental systems, until recently, that can support efficient HBV infection (for a recent review, see ref. [83]). It is clear, however, that the conversion of rcDNA to cccDNA in the nucleus (Fig. 1.2, step 3), as well as the preceding stages of entry and uncoating, must be highly efficient during natural infections since one DNA-containing virion particle is able to establish a productive infection in susceptible hosts [10, 84]. As will be detailed below (section “Nuclear Recycling of rcDNA and Amplification of cccDNA”), cccDNA can also be formed from progeny rcDNA synthesized de novo, via an intracellular amplification pathway. This process, which bypasses the entry process, has been used, with limited success so far, to study cccDNA formation.
Viral RNA Synthesis
Once formed, the nuclear episomal cccDNA functions as the equivalent of a provirus in retroviruses and is used as the template to transcribe, by the host RNA Pol II, all the viral RNAs (Figs. 1.1 and 1.2). Viral transcription is dependent on ubiquitously expressed as well as liver-enriched transcriptional factors (hepatocyte nuclear factors or HNFs) (Fig. 1.2, step 4), which contributes to the liver (hepatocyte) specificity of viral replication [85–94] (for a recent review, see ref. [95]). cccDNA is organized into mini-chromosomes with host cell histones and potentially other host and viral proteins [96, 97]. Transcription from the cccDNA mini-chromosomes, like that from host chromosomes, is subject to epigenetic regulation, which may be further modulated by the viral regulatory protein, X [98–100]. X has been reported to be critical for transcription from cccDNA during infection [57], and apparently functions only on episomes (like cccDNA) but not integrated DNA, in a DNA sequence-independent manner [101]. This sequence independence is consistent with the lack of DNA binding activity of X but if and how X specifically affects viral transcription remains an enigma. It has been suggested that HBx is recruited onto the cccDNA mini-chromosomes [98] but how this is accomplished in a DNA sequence specific fashion also remains unclear.
An interesting feature of HBV transcription is its dimorphic response to sex hormones, being stimulated by androgen [102, 103] and suppressed by estrogen [104]. Why HBV has evolved such a sexual dimorphism is an interesting unresolved question but this phenomenon likely contributes to the well-known male predominance of HBV replication and carcinogenesis.
All HBV mRNAs described above are unspliced and have to be exported from the nucleus to the cytoplasm in order to be translated or, in the case of pgRNA, packaged into NCs. As eukaryotic mRNA export is usually coupled to splicing, HBV has evolved a mechanism of exporting its mRNAs out of the nucleus in a splicing-independent manner, which relies instead on a cis-acting RNA sequence called the post-transcriptional regulatory element (PRE) [105] encoded by viral DNA sequences overlapping enhancer I (Fig. 1.1).
Viral Protein Synthesis
Another interesting feature of HBV gene expression is that all viral promoters, except PreS1, lack a canonical TATA box and, as a result, all viral RNAs, except the PreS1 mRNA, have heterogeneous 5′ ends, a feature that is used to translate distinct proteins from closely related mRNA species. The heterogeneous 5′ ends of the overlength genomic RNAs (the PreC/C mRNAs) bracket the translation initiation codon of the PreC protein. The longer PreC mRNAs containing the PreC initiation codon are translated to produce the PreC protein and ultimately the secreted HBeAg (Fig. 1.2, step 5), as described above. The shortest, the C mRNA, which lacks the PreC initiation codon, is translated to produce both the core and RT proteins, the latter from an internal AUG in a different reading frame from C (Fig. 1.1). As mentioned above, the C mRNA is also pgRNA, serving as the template for reverse transcription to produce progeny viral DNA (section “Reverse Transcription and NC Maturation”). The M and S envelope proteins are similarly translated from the PreS2/S mRNAs, which have heterogeneous 5′ ends bracketing the PreS2 initiation codon. Thus, the longer RNAs containing this initiation codon are translated to produce M and the shorter ones lacking it are translated to produce S. This gene expression strategy effectively increases further the coding capacity of the highly compact HBV genome.
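The way heterogeneous 5′ ends select between two in-frame initiation codons can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: translation is taken to begin at the first listed initiation codon at or downstream of the mRNA 5′ end; the coordinates are hypothetical (only the spacing between the paired codons, 29 codons for PreC/C and 55 codons for PreS2/S, is taken from the protein lengths given in the text); and translation of P from its internal AUG is not modeled.

```python
def first_product(five_prime_end, start_codons):
    """Return the protein made from the first start codon at or
    downstream of the mRNA 5' end, or None if no listed start codon
    is present on the transcript.

    start_codons: list of (position, product) pairs in the same reading frame.
    """
    for position, product in sorted(start_codons):
        if position >= five_prime_end:
            return product
    return None


# Hypothetical coordinates; only the spacing between paired codons is
# derived from the text (29 extra residues in PreC, 55 extra residues in M).
PREC_AUG = 100
C_AUG = PREC_AUG + 29 * 3        # 187
PRES2_AUG = 100
S_AUG = PRES2_AUG + 55 * 3       # 265

PREC_C_STARTS = [(PREC_AUG, "PreC"), (C_AUG, "C")]
PRES2_S_STARTS = [(PRES2_AUG, "M"), (S_AUG, "S")]

# A 5' end upstream of the first codon yields the longer protein;
# a 5' end between the two codons yields the shorter one.
print(first_product(90, PREC_C_STARTS))    # longer transcript
print(first_product(120, PREC_C_STARTS))   # shorter transcript (pgRNA)
print(first_product(90, PRES2_S_STARTS))
print(first_product(120, PRES2_S_STARTS))
```

In this simplified picture, a shift of the transcription start site by a few tens of nucleotides is enough to switch the product from PreC to C, or from M to S, without any change in the coding sequence.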
NC Assembly
The next stage in the viral life cycle ensues once C and P are translated from their shared mRNA, which doubles further as the template for reverse transcription (pgRNA) as alluded to above. These three components, the C and P protein and their shared mRNA (pgRNA) are all the viral factors needed for intracellular HBV replication (i.e., in the absence of virus secretion or infection). Assembly of the replication-competent NC requires the incorporation, into the same capsid, of both pgRNA—the template for reverse transcription, and P—the catalyst for DNA synthesis, by assembling C protein dimers. HBV has evolved to satisfy this dual (the P protein and pgRNA) packaging requirement by initiating NC assembly via the formation of a specific pgRNA-P ribonucleoprotein (RNP) complex, which then serves to trigger NC assembly (Fig. 1.2, step 6). A short structured RNA signal, called ε, located at the 5′ end of pgRNA (Fig. 1.3, step 1) was identified as the RNA packaging signal that mediates the packaging of pgRNA into NCs [106, 107]. ε was later found to be recognized specifically by the P protein (Fig. 1.3), not C, and P and pgRNA packaging are mutually dependent [51, 108, 109].
Fig. 1.3
HBV reverse transcription pathway. The pathway is depicted schematically from bottom to top on the left so as to match the Southern blot image of replicative viral DNAs extracted from intracellular NCs shown on the right. (1) The pgRNA (dashed line) harbors a large terminal repeat (R) that bears the RNA packaging signal, ε, and DR1. (2) P protein binding to ε triggers protein-primed initiation of (−) strand DNA synthesis at ε and packaging of pgRNA into NCs (not depicted). (3) (−) strand template switch to the 3′ DR1 and continuation of (−) strand DNA synthesis. (4) Degradation of pgRNA as (−) strand DNA synthesis proceeds, leaving a capped RNA oligomer containing DR1. (5) Translocation of the RNA oligomer to DR2 to prime (+) strand DNA synthesis. (6) (+) strand template switch from the 5′ to the 3′ end of the (−) strand DNA facilitated, in part, by the short terminal repeat (r) on the (−) strand DNA, circularizing the DNA, and continuation of (+) strand DNA synthesis for a variable length generating rcDNA. The boxed numerals 1 and 2 denote DR1 and DR2. A n polyA tail, nasc nascent
ε forms a conserved stem-loop structure with an apical loop and two short stems separated by an internal bulge (Fig. 1.3). To form the RNP complex, the internal bulge but not the apical loop is required [50, 110, 111]. However, for ε to serve its RNA packaging function, both the internal bulge and apical loop are required [112–114]. Furthermore, a closely spaced 5′ cap next to ε is also critical for pgRNA packaging in HBV [115] but dispensable for RT-ε interaction [50, 110, 111]. This requirement for a closely spaced 5′ cap in pgRNA packaging provides a satisfying explanation for the failure of the other copy of ε, which is located at the 3′ end of all viral RNAs (Figs. 1.1 and 1.3), to serve as a functional RNA packaging signal. Similarly, the requirements from P for pgRNA packaging go beyond those required for ε binding. Whereas only parts of the TP and RT domains are required for ε binding [110, 111, 116–118], pgRNA packaging additionally requires P sequences extending into most of the RNase H domain [51, 52, 116, 119]. Interestingly, four conserved Cys, three within the C-terminal portion of the initially defined spacer region and one in the RT domain, were found to be required for both ε binding and pgRNA packaging [116, 120].
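The distinct RNA requirements for RNP formation and for pgRNA packaging described above can be summarized as a small truth table. The following Python sketch is a hypothetical encoding of those stated requirements only (internal bulge for P binding; internal bulge, apical loop, and a closely spaced 5′ cap for packaging); the class and function names are illustrative and do not correspond to any existing software, and the requirements on the P protein itself are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpsilonCopy:
    """Features of one copy of the epsilon stem-loop on a viral RNA."""
    has_internal_bulge: bool
    has_apical_loop: bool
    near_5prime_cap: bool


def supports_rnp_formation(eps):
    # P binding requires the internal bulge but not the apical loop or the cap.
    return eps.has_internal_bulge


def supports_packaging(eps):
    # Packaging requires the bulge, the apical loop, and a closely spaced 5' cap.
    return eps.has_internal_bulge and eps.has_apical_loop and eps.near_5prime_cap


# The 5' and 3' copies of epsilon share the same structure and differ
# only in their distance from the cap.
five_prime_copy = EpsilonCopy(True, True, near_5prime_cap=True)
three_prime_copy = EpsilonCopy(True, True, near_5prime_cap=False)

for name, eps in [("5' copy", five_prime_copy), ("3' copy", three_prime_copy)]:
    print(name, supports_rnp_formation(eps), supports_packaging(eps))
```

Encoded this way, the 3′ copy of ε satisfies the requirement for P binding but fails the packaging test solely because it lacks a nearby cap, which mirrors the explanation given in the text.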
In addition to the viral P protein and ε RNA, host cell factors play an important role in RNP formation, and thus in pgRNA packaging (and protein-primed initiation of viral reverse transcription or protein priming, see section “Reverse Transcription and NC Maturation” below). In particular, the chaperone proteins heat shock protein 90 (Hsp90) and Hsp70, and other co-chaperones, are required to establish and maintain a P conformation active in ε binding (Fig. 1.2, step 6) [121–125]. One structural effect on P elicited by chaperone action is the exposure of a site in TP that may directly bind ε RNA [126]. Deletion of the RNase H domain and the C-terminal portion of the RT domain in the DHBV P protein leads to a “mini” P that is active in ε binding (and protein priming) independent of the cellular chaperones. This suggests a model for conformational activation of P whereby the chaperones act, during P-ε interaction, to relieve the auto-inhibitory effects on ε binding exerted by these C-terminal sequences of P [127], which are nevertheless essential for the later stages of viral replication (see section “Reverse Transcription and NC Maturation”).
Another essential viral protein for NC assembly, of course, is the C protein itself, which forms the capsid shell enclosing the P-pgRNA complex. As discussed above, the NTD of C alone is able to form empty capsids and when the CTD is present, those capsids can incorporate nonspecific RNAs when assembled in bacteria or in vitro [23, 24, 26, 128]. On the other hand, for the capsids to package the P-pgRNA RNP complex, the CTD is required and furthermore, has to be phosphorylated in HBV (Fig. 1.2) [129, 130]. Precisely how the assembling C dimers recognize the P-pgRNA complex remains unknown. A recent cryo-EM study revealed that P is located at a unique position inside NCs [131], consistent with the suggestion that the initial site of interactions between the P-pgRNA complex and first C dimers may mediate the nucleation of NC assembly. The kinetics of capsid assembly also influences P-pgRNA packaging as either C mutants or small molecules binding to C that perturb assembly kinetics inhibit P-pgRNA packaging [132–134].
As in bacteria, HBV capsids also assemble in authentic human host cells without incorporating the RT-pgRNA complex. These empty capsids are produced in large excess relative to the replication competent NCs both in cell cultures and in the liver (Fig. 1.2, step 6a) [11, 135] and are indeed secreted as empty virions upon envelopment (i.e., enveloped capsids without DNA or RNA) (section “NC Envelopment and Virion Secretion”). In contrast to assembly in vitro or in bacteria, the nonproductive assembly in mammalian cells leads to the formation of truly empty capsids, packaging little to no RNA. One reason to account for this difference may be the fact that CTD is heavily phosphorylated in mammalian cells (Fig. 1.2, steps 6 and 6a) (but not in bacteria), which leads to the inhibition of its nonspecific RNA binding activity as described above.
It has long been known that HBV capsids package a cellular protein kinase(s), the so-called endogenous kinase [136]. Kinase packaging into capsids is evidently independent of either the P protein or pgRNA [34]. While early work suggested that the endogenous kinase was possibly PKC [33], recent work indicates that the cellular CDK2 may be the major endogenous kinase and it can phosphorylate specifically the CTD in capsids (Fig. 1.2, steps 6 and 6a) [34].
Reverse Transcription and NC Maturation
Once both the P protein and pgRNA are packaged into NCs, viral DNA synthesis occurs converting the RNA pregenome to the characteristic rcDNA genome (Fig. 1.2, step 7). This process of reverse transcription in hepadnaviruses is also defined as NC maturation. NCs, as initially assembled and containing pgRNA, as well as those containing the SS DNA intermediates are considered to be immature as they are not secreted in virions, in contrast to those containing the rcDNA that are secreted in virions and are therefore considered mature.
Protein-Primed Initiation of Reverse Transcription
Initiation of viral reverse transcription is triggered by the same P-pgRNA interaction described above that triggers NC assembly. Indeed, initiation of viral DNA synthesis could occur before, during, or after NC assembly, since C is dispensable for initiation of DNA synthesis [48–50, 137]. In an unusual reaction, very different from conventional retroviruses and so far, unique to hepadnaviruses, initiation of (−) strand DNA synthesis occurs via protein priming and the catalyst of DNA synthesis, i.e., the P protein itself, is also used as a specific protein primer. In addition to being essential for pgRNA packaging, the ε RNA element, specifically its internal bulge, also serves as the template for protein-primed DNA synthesis [50, 138–142]. The result of protein priming is a short (3–4 nt long) DNA oligomer covalently attached to the P protein (Fig. 1.3, step 2), specifically, a Tyr residue within its TP domain.
Protein priming is a highly complex reaction that requires multiple determinants of both the ε RNA and P protein and, additionally, host factors. The ε RNA, in addition to serving as the template for DNA synthesis, also functions as an allosteric activator of the P enzymatic activity during protein priming [143, 144]. Similarly, upon RNP formation, the ε RNA also undergoes significant structural rearrangement thought to be critical for protein priming [145, 146]. As discussed above, HBV protein priming, like pgRNA packaging, also requires a 5′ cap near the ε RNA signal, and thus only the 5′, but not the 3′, copy of ε on pgRNA can support protein priming (Fig. 1.3, step 2) [50, 147]. Therefore, host factors binding the cap structure are potentially needed for protein priming, in addition to the host chaperones discussed above that play a critical role in facilitating RNP formation. The apical loop of ε, like the cap structure, is dispensable for P binding (above) and is not part of the template sequence, yet is essential for protein priming [50]. While the exact role of these RNA elements in protein priming (or pgRNA packaging, see above) remains to be defined, the similar requirements for these two related reactions, both being critically dependent on P-pgRNA interaction, may suggest that these two processes are mechanistically linked to ensure that viral DNA synthesis will only occur using an RNA template that can also be packaged into NCs.
The Tyr residue used to prime (−) strand DNA synthesis resides in the N-terminal TP domain of P (Y63 in HBV and Y96 in DHBV) [50, 148–150]. The structural basis for selecting the particular Tyr residue as the primer remains poorly understood. “Cryptic” sites, i.e., other Tyr (or even Ser/Thr) residues in both the TP and RT domains, can also serve to initiate protein-primed synthesis of short DNA strands, albeit inefficiently [151, 152]. However, they cannot support viral replication. In addition to the priming Tyr residue, other sequences in TP are also critical for protein priming by participating in ε RNA binding and possibly helping to present the primer to the RT active site [50, 116, 126, 153].
Protein priming in HBV also requires the RT domain and most of the RNase domain. The RT domain bears the Tyr-Met-Asp-Asp polymerase catalytic center, which is required to form the initial phosphotyrosyl linkage between the 5′ dGMP residue and the priming Tyr residue in TP during priming as well as for all subsequent DNA polymerization [48, 50, 150]. Additional RT domain sequences, beyond the polymerase active site, are required for ε binding and DNA synthesis during protein priming [116, 153]. Also, although both protein priming and pgRNA packaging (see section “NC Assembly”) require additional sequences and structures from the ε RNA and P protein beyond RNP formation, distinct requirements also exist for these highly related reactions [116]. In particular, the very C-terminal portion of the RNase H domain is required for pgRNA packaging but not for protein priming, suggesting that these RNase H sequences may interact with the viral C protein, which is similarly required for NC assembly but not protein priming. In contrast, other sequences in the TP and RT domains are required for protein priming but not pgRNA packaging, suggesting that these may play a role in positioning Y63 in the RT active site to allow priming or in some aspect of catalysis per se.
DNA Synthesis Following Protein Priming
Following protein priming, which occurs at the 5′ end of pgRNA, the short (−) strand DNA oligomer attached to P is translocated to a site (acceptor) overlapping the short (ca. 12 nt) sequence motif DR1 (Fig. 1.1) near the 3′ end of pgRNA before DNA synthesis continues (Fig. 1.3, step 3) [14, 138, 147, 154]. In addition to the short (3–4 nt) sequence complementarity between the nascent (−) strand DNA and the pgRNA sequence at the acceptor site, this (−) strand template switch reaction is facilitated by other cis-acting elements on pgRNA, which help bring together spatially the 5′ donor site at ε and the 3′ DR1 acceptor site on pgRNA via base-pairing [155, 156]. As (−) strand DNA synthesis continues, the RNase H activity of P degrades pgRNA except its extreme 5′ end (Fig. 1.3, step 4) [157]. The preserved 18-nt long capped RNA oligomer is subsequently used as a primer to initiate (+) strand DNA synthesis, but in most cases, only after the RNA primer is first translocated from the 3′ end to near the 5′ end of the (−) strand DNA (Fig. 1.3, step 5) [14, 15, 158]. This (+) strand primer translocation is facilitated by sequence complementarity between the other DR1 motif at the 5′ end of the terminally redundant pgRNA (part of the RNA primer) (Fig. 1.3, step 1) and an identical sequence motif called DR2 near the 5′ end of the (−) strand DNA. (+) strand DNA synthesis soon reaches the 5′ end of the template (−) strand DNA, when another template switch occurs resulting in the translocation of the elongating 3′ end of the (+) strand DNA from the 5′ to the 3′ end of the (−) strand DNA and the circularization of the DNA product (Fig. 1.3, step 6). This (+) strand template switch is facilitated by the short (ca. 9 nt) terminal repeat (r) at both ends of the (−) strand DNA, as well as multiple other cis-acting sequences on the template (−) strand DNA, which function, as in (−) strand template switch, to bring the donor and acceptor sites together spatially [159–162]. 
(+) strand DNA synthesis then continues to at least half completion before the maturing NC is enveloped and secreted or recycles its rcDNA content to the nucleus for amplifying the cccDNA pool (see below).
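The role of short sequence complementarity in the (−) strand template switch can be illustrated with a minimal sketch. The oligomer and the RNA template below are hypothetical toy sequences, not actual HBV sequences, and the function names are introduced here for illustration only. The code simply lists every position on an RNA template where a short DNA oligomer could fully base-pair.

```python
# Toy illustration: where could a short nascent (-) strand DNA oligomer anneal
# on an RNA template? Sequences used here are hypothetical, not HBV sequences.

COMPLEMENT_DNA_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def annealing_target(dna_oligo):
    """Return the RNA sequence (5'->3') that pairs with a DNA oligomer (5'->3')."""
    return "".join(COMPLEMENT_DNA_TO_RNA[base] for base in reversed(dna_oligo))


def find_annealing_sites(dna_oligo, rna):
    """Return 0-based start positions in rna where the oligomer fully base-pairs."""
    target = annealing_target(dna_oligo)
    return [
        i
        for i in range(len(rna) - len(target) + 1)
        if rna[i:i + len(target)] == target
    ]


if __name__ == "__main__":
    toy_rna = "AAUUCGGUUCAA"   # hypothetical template
    toy_oligo = "GAA"          # hypothetical 3-nt DNA oligomer
    print(find_annealing_sites(toy_oligo, toy_rna))
```

Even this short toy template contains two perfect matches for a 3-nt oligomer. This may help explain why complementarity alone is unlikely to specify the acceptor site and why additional cis-acting elements that juxtapose the donor and acceptor sites are needed.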
Failure to translocate the RNA primer during (+) strand DNA synthesis leads to the production of the double stranded linear DNA (dslDNA) via an alternative, minor pathway of DNA synthesis when the primer is elongated in situ [163]. dslDNA is the predominant viral DNA substrate for integration into host chromosomes via nonhomologous recombination (Fig. 1.2, step 3a), which occurs early during acute infection and accumulates over time during the chronic phase of infection [164–166]. Integrated dslDNA is unable to support viral replication as it cannot direct the expression of the genomic RNA species. However, it can drive the expression of the viral envelope proteins (Fig. 1.2, step 4a) and has diagnostic implications (see section “NC Envelopment and Virion Secretion” for more detail).
Whereas C appears to be dispensable for the protein priming stage of viral reverse transcription, it clearly functions as a critical trans-acting factor, in addition to the enzyme P itself, in all subsequent stages of viral DNA synthesis. In particular, the phosphorylation state of its CTD plays an integral role in facilitating reverse transcription (Fig. 1.2, steps 6 and 7), possibly via regulating the charge state of the maturing NC or the CTD function as a nucleic acid chaperone [167–173]. Recent structural studies suggest indeed that the phosphorylation state of CTD can influence pgRNA organization in the NCs [174]. In DHBV, CTD is heavily phosphorylated in immature NCs, which is required to facilitate (−) strand DNA synthesis, but it is completely dephosphorylated in mature NCs, which is required to facilitate (+) strand DNA synthesis and to stabilize mature NCs once rcDNA is synthesized (Fig. 1.2, step 7) [36, 169, 175, 176]. Also, the NTD, in addition to forming the capsid shell, may play an active role in facilitating reverse transcription [133, 134, 177].
It is likely that additional host factors regulate viral reverse transcription, other than those required for protein priming and the host kinase(s) (e.g., CDK2) and phosphatase(s) that regulate the state of CTD phosphorylation, as described above. For example, induction of the early stage of autophagy was reported to facilitate HBV DNA replication [62]. On the other hand, the antiviral deaminase proteins, the Apobec3 proteins, can be incorporated into NCs and block the early stage of (−) strand DNA synthesis when overexpressed ectopically [178–180], although whether levels of Apobec3 proteins under physiological conditions in vivo ever reach those needed for viral inhibition remains uncertain.
Like the RT enzymes encoded by retroviruses (e.g., the human immunodeficiency virus or HIV) and RNA-dependent RNA polymerases of RNA viruses (e.g., the hepatitis C virus or HCV), the HBV P protein lacks proofreading activity. As a result, HBV DNA replication is associated with a much higher (by ca. 10⁴-fold) error rate as compared to host cell DNA replication, resulting in viral genetic variations. However, its compact genetic organization means that HBV variations that are viable (thus observable) are not nearly as great as those of HIV or HCV, since many of the variations will be lethal due to their detrimental effects on multiple overlapping coding sequences and/or cis-acting sequences important for viral gene expression or replication (Fig. 1.1). Still, HBV can be classified worldwide into eight to ten genotypes, with inter-genotype differences being 8% or higher [181]. Furthermore, the sequence variations provide opportunities for selecting drug-resistant or vaccine escape mutants under drug treatment or immune pressure (see section “NC Envelopment and Virion Secretion” also).
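As a rough back-of-the-envelope illustration of what such a fold difference implies, the sketch below estimates the expected number of misincorporations per newly synthesized genome copy. Only the ca. 10⁴-fold ratio comes from the text; the host error rate and the genome length are assumed values chosen for illustration and should be replaced with measured figures.

```python
# Illustrative arithmetic only. HOST_ERROR_RATE and GENOME_LENGTH_NT are
# assumptions made for this sketch; only FOLD_INCREASE is taken from the text.

HOST_ERROR_RATE = 1e-9     # assumed errors per nucleotide per replication (host DNA)
FOLD_INCREASE = 1e4        # ca. 10^4-fold higher error rate for HBV
GENOME_LENGTH_NT = 3200    # assumed approximate genome length in nucleotides


def expected_errors_per_genome(host_rate, fold, genome_length):
    """Expected misincorporations per genome copy for a polymerase without proofreading."""
    per_nucleotide_rate = host_rate * fold
    return per_nucleotide_rate * genome_length


if __name__ == "__main__":
    errors = expected_errors_per_genome(HOST_ERROR_RATE, FOLD_INCREASE, GENOME_LENGTH_NT)
    print(f"{errors:.3f} expected errors per genome copy")
```

Under these assumed inputs, roughly one genome copy in thirty would carry a new change. Given the large number of genomes produced during an infection, this would be ample raw material for the selection described above.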
Nuclear Recycling of rcDNA and Amplification of cccDNA
As alluded to above, cccDNA can be derived from de novo synthesized rcDNA in intracellular mature NCs, in addition to rcDNA in the incoming virion. In this process, mature NCs deliver (recycle) their rcDNA content into the nucleus (instead of secretion extracellularly as virions, see section “NC Envelopment and Virion Secretion”) to make more cccDNA (Fig. 1.2, step 8), which amplifies the cccDNA reservoir for production of more pgRNA and other viral RNAs. Discovered initially in DHBV infected primary duck hepatocytes [182, 183], this process also occurs in hepatoma cells replicating DHBV and HBV [184–188]. Through this intracellular amplification pathway (and possibly super-infection as well), the steady state level of cccDNA is maintained at ca. 1–17 copies per cell as shown using DHBV-infected duck livers [189].
The viral envelope protein, L, directly regulates cccDNA amplification through an apparent negative feedback mechanism. Thus, when cccDNA levels are low, e.g., during the early stage of infection, L protein levels are low and rcDNA in the mature NCs is recycled to the nucleus to form more cccDNA. Later during infection, when the cccDNA levels are raised, more L proteins are produced that block this recycling pathway and instead direct the mature NCs for envelopment and secretion extracellularly [184, 185, 190]. The C protein is also involved in this recycling process, and the recently revealed destabilization of mature NCs, relative to immature ones [191], likely facilitates the uncoating of mature NCs and release of rcDNA for cccDNA formation. The much lower efficiency of cccDNA amplification by HBV compared to DHBV in the same cells [187, 192] also indicates that viral specific factors can affect the efficiency of cccDNA formation. The differential binding of host factors by CTD in a phosphorylation state-dependent manner [193] may play a role in the nuclear recycling of mature NCs or virion formation (see section “NC Envelopment and Virion Secretion”), as mature NCs, as opposed to immature ones, are dephosphorylated.
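The negative feedback described above can be sketched as a toy discrete-time model. Every parameter name and value here is hypothetical. The model assumes that L expression scales with the cccDNA level and that the fraction of mature NCs recycled to the nucleus falls linearly to zero as L rises; neither assumption is a measured relationship.

```python
def simulate_cccdna(steps=50, ccc0=1.0, nc_per_ccc=0.5, l_block=10.0):
    """Toy model of cccDNA amplification limited by L-mediated feedback.

    All parameters are hypothetical:
      ccc0        initial cccDNA copies per cell
      nc_per_ccc  mature NCs produced per cccDNA copy per time step
      l_block     L level (in cccDNA-equivalent units) that fully blocks recycling
    """
    ccc = ccc0
    history = [ccc]
    for _ in range(steps):
        mature_ncs = nc_per_ccc * ccc
        l_level = ccc  # assumption: L expression is proportional to cccDNA
        # Fraction of mature NCs recycled to the nucleus; the rest are enveloped.
        recycle_fraction = max(0.0, 1.0 - l_level / l_block)
        ccc += mature_ncs * recycle_fraction
        history.append(ccc)
    return history


if __name__ == "__main__":
    trajectory = simulate_cccdna()
    for step in (0, 5, 10, 20, 50):
        print(step, round(trajectory[step], 3))
```

In this sketch the cccDNA level rises quickly while L is scarce and then levels off as recycling is shut down. That is the qualitative behavior expected of a negative feedback loop; the plateau value is set entirely by the hypothetical `l_block` parameter.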
The involvement of host factors in cccDNA formation is also suggested by the age-related difference in DHBV cccDNA formation kinetics (being more rapid in young ducklings) [194]. Moreover, as mentioned above, normal mouse hepatocytes fail to accumulate cccDNA [195] and the elimination of the liver specific transcription factor, HNF1α, increases (albeit only weakly) levels of cccDNA in these cells [196], again indicating that host cell factors, related to the physiological or differentiation state, can influence cccDNA levels. In particular, host DNA damage repair factors probably play a direct role in the conversion of rcDNA to cccDNA [187, 197], which involves a number of distinct biochemical reactions. These include the completion of (+) strand DNA synthesis, removal of the capped RNA primer from the 5′ end of the (+) strand DNA, removal of the covalently attached P protein from the 5′ end of the (−) strand DNA, removal of precisely one copy of the terminal repeat (r) from the (−) strand DNA, and the ligation of both DNA strands. To date, a role for the P protein, the only viral protein with any enzymatic activities, in cccDNA formation has not been conclusively demonstrated although it could play a role conceivably in completing the (+) strand DNA synthesis. This and all the other reactions required for cccDNA formation are instead likely carried out by host DNA repair enzymes.
HBV cccDNA levels can reach five to ten copies per cell in hepatoma cell cultures, exclusively via intracellular amplification as those cells are not susceptible to HBV infection; however, over-amplification in the absence of the viral envelope proteins is limited, reaching only a few-fold higher than the wild type (WT) virus [187, 188, 192, 198]. This is in contrast to DHBV, which can reach hundreds of copies of cccDNA per cell in either avian or human cells [36, 185]. It remains unclear if HBV cccDNA can be amplified to the same extent as DHBV in the liver or in culture systems that would more closely mimic human hepatocytes in vivo. Even for DHBV, cccDNA amplification can be saturated, suggesting the need for rate-limiting host factors [36]. Whatever these host factors may be, they are unlikely to be strictly species- or hepatocyte-specific. DHBV forms cccDNA efficiently in duck, chicken, and human cells [187, 192], and the human embryonic kidney cell line HEK293 supports cccDNA formation by both HBV and DHBV [36, 187].
To date, no clear intermediate in the conversion of rcDNA to cccDNA has been identified conclusively, which would facilitate studies on the mechanism of cccDNA formation. However, an rcDNA species, called protein-free (PF) or deproteinated (dp) rcDNA, accumulates in established cell lines that support HBV replication [187, 192, 198] but not in normal human hepatocytes in vivo [199] or in primary culture [65], nor in the chimpanzee liver [200]. PF-rcDNA resembles grossly the rcDNA in mature NCs except that the P protein is removed [187, 198]. However, the precise structure of PF-rcDNA, particularly the structure of the 5′ end of the (−) strand DNA, from which the P protein (or at least the bulk of it) has been removed, remains to be more clearly defined. PF-rcDNA also accumulates in mouse hepatocytes when HNF1α is eliminated [196], suggesting that the accumulation of PF-rcDNA, like cccDNA, is subject to regulation by the host cell physiology. Although PF-rcDNA has been suggested to be a precursor to cccDNA (and thus a true intermediate during rcDNA to cccDNA conversion) [79, 187, 198], the possibility exists that it could instead represent a dead-end processing product from rcDNA that cannot be converted further to cccDNA. How the P protein is removed during the formation of PF-rcDNA (or cccDNA) remains unknown. However, the host DNA repair factor, tyrosyl-5′ DNA phosphodiesterase 2 (Tdp2), has been shown to cleave precisely at the P protein-(−) strand DNA junction [50, 201, 202], which is not entirely surprising given that an important cellular function of Tdp2 is to remove covalently trapped topoisomerase II (Topo II) from Topo II-DNA adducts [203] with exactly the same phosphotyrosyl-DNA bond at their protein–DNA junction as that found at the 5′ end of the viral (−) strand DNA.
Whether Tdp2 plays a role in HBV cccDNA formation remains to be clarified although a recent report suggests that Tdp2 may play a modest role in facilitating DHBV cccDNA formation in human hepatoma cells [202].
NC Envelopment and Virion Secretion
To complete the viral life cycle, the mature NCs containing rcDNA acquire the host-derived lipid bilayer studded with the viral envelope proteins via budding into the lumen of an intracellular vesicle thought to represent the late endosome or multi-vesicular body (MVB) [204, 205] and are secreted outside of the cells via the cellular secretory pathway. As with many other enveloped viruses, the cellular ESCRT proteins critical for host vesicular trafficking may play a role in HBV virion secretion, although pleiotropic effects of these factors on NC maturation make it difficult to assign a specific role to ESCRT proteins in virion formation [204–206].
A particularly interesting aspect of HBV virion morphogenesis is the selection of the “correct” NCs, i.e., only the mature ones containing rcDNA (or dslDNA) but not the immature ones containing SS DNA or pgRNA, for envelopment and virion formation (Fig. 1.2, step 9) [6, 11, 207–210]. A putative maturation signal, which emerges on the mature NCs following rcDNA synthesis, was hypothesized long ago to direct the viral envelope proteins in the selection of the mature NCs for envelopment [6]. The nature of this signal, or the exact timing of its emergence during NC maturation, remains poorly understood. As mentioned above, the (+) strands of HBV rcDNA found in virions in the blood of infected patients are heterogeneous; they can be up to half incomplete but mostly are several hundred nt from completion [11, 13]. Elongation of the (+) strands may stop when the nucleotides that are trapped in the enveloped virion particles are exhausted and no additional nucleotides can get into the virions. Indeed, dissolution of the virion membrane and provision of nucleotides allow further (+) strand DNA elongation of the virion rcDNA during the endogenous polymerase reaction [54], and blocking envelopment can increase the length of the (+) strands in rcDNA [188].
The C protein likely plays an integral role in the selection of mature NCs for virion formation as it forms the NC shell and is thus situated appropriately to transmit the nature of the nucleic acid inside NC (rcDNA vs. SS DNA or pgRNA) to its exterior so as to allow the viral envelope proteins to differentiate NCs with different maturity. In other words, the envelope proteins have to sense, indirectly, the interior content of the maturing NC through maturation-associated structural changes on the capsid surface, which could constitute the elusive maturation signal. Indeed, NTD mutants have been identified that remain competent for rcDNA synthesis but are defective in virion formation [211–213]. Furthermore, other NTD mutations lead to the secretion of SS DNA in virions (“immature secretion”) [214, 215]. Interestingly, the snow goose hepatitis B virus (SGHBV) [216] naturally secretes SS DNA-containing virions, in contrast to all other hepadnaviruses identified to date, and two specific residues [74, 107] in the NTD of the SGHBV C protein have been identified as responsible for this immature secretion phenotype [217]. These results thus all suggest that NTD may contain the maturation signal or is at least involved in generating the signal during NC maturation.
On the envelope side, it has been long known that the L and S, but not M, proteins are required for virion formation [19]. More recently, the C-terminal portion of the PreS1 region of L was identified as a “matrix” domain (MD) that is thought to recognize the mature NCs [218–221]. As discussed above, the N-terminal portion of the PreS1 region also mediates receptor (NTCP) binding during virus entry, with this dual role in NC and receptor recognition being accommodated by a dramatic shift of the PreS1 topology following NC envelopment. The host chaperone, heat shock cognate protein 70, may play an essential role in virion formation by retaining the PreS1 region on the cytosolic side of the ER membrane during L protein synthesis to allow it to serve its MD function [222].
Secretion of HBsAg Spheres and Filaments
As described above, HBV virion secretion is characterized by the release of a large excess of defective subviral particles containing only the envelope proteins (HBsAg particles, including the spheres and filaments) (Fig. 1.2, step 9a). The cellular pathway for secreting these particles appears to be distinct from that used for virion secretion [204], which is also suggested by the fact that the virions and HBsAg particles contain a different complement of viral envelope proteins (see section “The Envelope Proteins” above). The functions of the HBsAg particles remain to be better defined although they probably act as a decoy for the virions to protect the latter from host neutralizing antibodies that target the envelope proteins.
Secretion of Empty Virions
Given the above discussion on selective HBV virion formation, it was indeed surprising to find that HBV also secretes a large excess (typically >100-fold above the DNA-containing or complete virions) of empty virions in vivo and in vitro, which contain the envelope and the capsid but no genome (Fig. 1.2, step 9b) [11, 12]. In sharp contrast to complete virions, the secretion of these empty virions is completely independent of pgRNA packaging or DNA synthesis [11, 223]. In retrospect, these empty virions were probably detected decades ago, even before the discovery of reverse transcription in hepadnaviruses, as “light” Dane particles [224, 225] but received little attention, perhaps deemed to be an artifact of virion isolation. To reconcile the apparent stringency in selecting mature (but not immature) NCs for complete virion formation with the secretion of empty virions containing no genome at all, it was proposed that an SS DNA (or pgRNA)-dependent “blocking signal” is induced in immature NCs that actively prevents their envelopment [11]. The empty capsids, devoid of any nucleic acid, lack such a negative signal and can thus be enveloped and secreted as empty virions. However, the requirements from either the capsid or the envelope for secretion of these empty virions need to be characterized, and it remains possible that the secretion of the complete and empty virions may involve distinct signals and pathways. Similarly, the functions of empty virions remain to be determined.
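The proposed blocking-signal model amounts to a simple decision rule, sketched below. The class and function names are hypothetical and introduced only for illustration. The rule encodes the proposal as stated in the text and is not an established mechanism.

```python
from enum import Enum


class CapsidContent(Enum):
    EMPTY = "no nucleic acid"
    PGRNA = "pgRNA"
    SS_DNA = "SS DNA"
    RC_DNA = "rcDNA"
    DSL_DNA = "dslDNA"


# Contents proposed to induce the negative ("blocking") signal in immature NCs.
BLOCKING_CONTENTS = {CapsidContent.PGRNA, CapsidContent.SS_DNA}


def is_enveloped(content):
    """Under the blocking-signal model, a capsid is enveloped unless it is blocked."""
    return content not in BLOCKING_CONTENTS


def virion_type(content):
    """Return the kind of virion secreted, or None if envelopment is blocked."""
    if not is_enveloped(content):
        return None
    if content is CapsidContent.EMPTY:
        return "empty virion"
    return "complete virion"


if __name__ == "__main__":
    for content in CapsidContent:
        print(f"{content.value}: {virion_type(content)}")
```

Framing the selection as the absence of a negative signal, rather than the presence of a positive one, is what lets a single rule account for both complete and empty virions in this sketch. As noted above, the two could still turn out to use distinct signals and pathways.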
Like HBsAg particles (Australian antigen), which greatly facilitated the discovery of HBV and the development of both the diagnostics for HBV infection and the first generation HBV vaccine that was derived from these particles in the human serum [1], the empty virions may also prove to be valuable as a diagnostic marker and perhaps a new vaccine candidate. On the diagnostic side, a recent pilot study found that the ratios of empty to complete virions in the sera of HBV infected patients vary greatly (50–100,000:1) [12]. Among other factors, this ratio may reflect the efficiency of intrahepatic assembly of empty vs. pgRNA-containing capsids, which, together with the efficiency of reverse transcription and virion assembly, ultimately determines the ratio of empty vs. complete virions in the blood (Fig. 1.2). Furthermore, the empty virions, which can be readily monitored as serum hepatitis B core antigen (HBcAg) (due to the large excess of empty virions relative to complete ones, the contribution of the complete virions to serum HBcAg is negligible), may be useful as an easily accessible biomarker to monitor antiviral responses, in particular, the levels and transcriptional activity of cccDNA in the liver during treatment with inhibitors of viral DNA synthesis. Treatment with a nucleoside analog drug that inhibits the DNA polymerase activity of the P protein effectively blocks the secretion of complete virions in virtually all cases, but the secretion of empty virions is not decreased in most cases [12]. This is exactly as predicted given that DNA synthesis is required for secretion of complete virions but dispensable for empty virions. On the other hand, if the hepatic cccDNA level (or its transcriptional activity) is decreased or eliminated, the production and secretion of empty virions will be reduced or eliminated, as both C and envelope proteins are required for empty virion production [12].
Although serum HBsAg particles have been suggested as a marker for hepatic cccDNA, they can also be produced from integrated viral DNA (Fig. 1.2, step 4a) [226, 227], which accumulates to high levels during chronic infections [166] and is not decreased by viral polymerase inhibitors [166], and therefore, are not reliable for monitoring cccDNA especially during the later stage of chronic infection [228, 229]. Secretion of empty virions, on the other hand, requires also the viral C protein, which is unlikely to be produced from the integrated DNA due to the disruption of the C gene in the dslDNA, the precursor to the integrated HBV DNA (Fig. 1.2, step 3a) [164], and thus should be a more reliable marker for hepatic cccDNA. Under antiviral therapy, significant decrease of serum empty virions (and thus HBcAg), without HBsAg decrease in parallel, could reflect a reduction of intrahepatic cccDNA level (or its transcriptional activity) leading to a decrease in HBcAg (hence empty virion) production but not serum HBsAg, whose expression may be driven exclusively from integrated HBV DNA [12]. Another potential marker for hepatic cccDNA is the secreted HBeAg, which like empty virions probably can only be produced from cccDNA but not integrated viral DNA. However, HBV frequently mutates to reduce or eliminate HBeAg expression under immune pressure (Fig. 1.2, step 9c) [228], rendering HBeAg less useful or no use at all as a biomarker for cccDNA.
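The reasoning behind these candidate biomarkers can be condensed into a small qualitative sketch. The function name and the three yes/no inputs are hypothetical simplifications. The model treats each marker as present or absent, assumes that C protein and HBeAg are expressed only from cccDNA, and ignores HBeAg-negative mutants.

```python
def predicted_serum_markers(ccc_active, integrated_dna, polymerase_inhibitor):
    """Qualitative presence or absence of serum markers under simplified assumptions.

    ccc_active            transcriptionally active cccDNA is present in the liver
    integrated_dna        integrated viral DNA able to express envelope proteins
    polymerase_inhibitor  treatment that blocks viral DNA synthesis
    """
    c_protein = ccc_active                     # C gene is disrupted in integrated DNA
    envelope = ccc_active or integrated_dna    # envelope proteins from either template
    dna_synthesis = ccc_active and not polymerase_inhibitor
    return {
        "HBsAg particles": envelope,
        "empty virions (HBcAg)": c_protein and envelope,
        "complete virions": c_protein and envelope and dna_synthesis,
        "HBeAg": ccc_active,                   # ignores HBeAg-negative mutants
    }


if __name__ == "__main__":
    scenarios = {
        "treated, cccDNA active": (True, True, True),
        "treated, cccDNA lost": (False, True, True),
    }
    for label, args in scenarios.items():
        print(label, predicted_serum_markers(*args))
```

In this sketch, loss of active cccDNA under therapy removes empty virions while HBsAg persists through integrated DNA. This is the pattern the text proposes as a readout of intrahepatic cccDNA.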
The empty virions could also form the basis for a new generation of HBV vaccine. The current recombinant (second generation) HBV vaccine contains the S envelope protein only. Though it is very safe and effective in most cases, it does not induce a sufficient response in some vaccinees. Also, as the vaccine elicits predominantly neutralizing antibodies against a single epitope (the “a” determinant as discussed above) in S, HBV can evolve mutations in this determinant to escape the vaccine-induced antibodies [230, 231]. To potentially exacerbate the vaccine escape problem, inhibitors of the P protein can also select drug-resistant mutants that are, additionally, vaccine escapees. Due to the overlap of the P and S coding sequences (Fig. 1.1), certain drug-resistant mutations in the P gene also encode vaccine-escape S proteins in the overlapping S gene [232, 233]. A potential (third generation) HBV vaccine could be based on empty virions and would contain all the viral structural proteins but no genome. Such a vaccine should be as safe as the current vaccine but may help overcome the limitations of the current vaccine by providing additional antigenic determinants for both humoral and cellular immunity, the latter of which targets mostly the internal C protein and may render an empty virion-based vaccine effective for therapeutic as well as prophylactic purposes.
Perspectives
HBV research has been experiencing a renaissance in recent years, along with the advent of effective antiviral therapies that can dramatically suppress viral replication and potentially improve the prognosis for hundreds of millions of chronically HBV-infected patients worldwide [234]. With the exception of type I interferon, which is thought to derive its efficacy from immune-regulatory functions as well as direct antiviral activities, both of which are complex and still poorly understood, all other currently approved treatments target the HBV P protein, specifically its DNA polymerase activity in the RT domain, and belong to the same structural class, nucleoside analogs. As the viral life cycle becomes understood in greater detail, it is anticipated that more antiviral therapies targeting different stages of the life cycle and different viral proteins will be forthcoming. Along with better strategies to manipulate the host immune response, these antivirals may be able to bring about the complete elimination of the nuclear cccDNA reservoir and thus cure chronic HBV infection. Viral and host targets that can be potentially exploited include entry, by using the PreS1 peptide responsible for NTCP binding as well as small molecule NTCP ligands to disrupt virus-cell binding [235, 236]; NC assembly, using small molecules binding to the viral C protein [132, 237, 238]; additional functions of P such as ε binding and RNase H activity and novel ways of inhibiting its polymerase activity [47, 53, 239, 240]; inhibition of cccDNA formation [36, 241]; and possibly suppression of cccDNA transcriptional activity [99, 100] or even degradation of cccDNA [242, 243]. These and other antiviral strategies as well as immune modulation approaches will be detailed elsewhere in this volume.
Similarly, the large amounts of classical HBsAg particles and HBeAg released into the blood stream, though nonessential for viral replication, have been extremely valuable for monitoring viral infection and as the basis for prophylactic vaccines, and the recent discovery of empty virions may yet spur the development of new diagnostics and vaccine candidates.