Fig. 3.1
(a) Four overlapping open reading frames of hepatitis B virus (HBV) were analyzed. Position: “1p/3preS2” means the nucleotide position is codon 1 in the polymerase gene and codon 3 in the preS2 region. (b) dN and dS estimates for all HBV genes. Cited from Torres et al. [8] with publisher’s permission
Classification of HBV Genotypes
In 1988, HBV was classified into four genotypes on the basis of a sequence divergence exceeding 8 % across the entire genome; these genotypes were designated by capital letters A to D (Fig. 3.2) [9]. In 1994, Norder et al. [10] found two additional HBV genotypes using the same criteria, and named these E and F. Genotype G was reported in 2000 [11], and genotype H, which is phylogenetically closely related to genotype F, was proposed in 2002 [12]. In 2008, sequence analysis of the complete genome of a single isolate (AB231908) obtained from a Vietnamese male revealed a ninth genotype, I, which was closely related to genotypes A, C, and G [13]. Thereafter, an HBV strain was isolated from a Japanese patient who had resided in Borneo during World War II [14]; phylogenetic analysis of this isolate showed that it was closely related to gibbon HBV, with mean divergences of 10.9 and 10.7 %, and it was provisionally named genotype J [14].
Fig. 3.2
Phylogenetic tree of the full genome of hepatitis B virus (HBV). HBV isolates are represented by GenBank accession number. HBV strains are classified into ten genotypes, designated A to J. Each genotype is further subdivided into 2–16 subgenotypes, except for genotypes E, G, H, and J
Subgenotypes of HBV
Most HBV genotypes are subdivided into several subgenotypes, defined by a divergence of more than 4 % across the full genome (Fig. 3.3). At least 35 subgenotypes have been reported to date [15], but none have been reported for genotypes E, G, H, or J.
Fig. 3.3
Frequency distribution of pairwise genetic distances across the full genome of hepatitis B virus (HBV). In total, 27,261 base pairs of the full genome sequence were analyzed. A genetic distance of <4 % was defined as intra-subgenotype, 4–8 % as inter-subgenotype, and >8 % as inter-genotype
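The thresholds stated in the text (more than 8 % divergence between genotypes, more than 4 % between subgenotypes) can be illustrated with a minimal sketch. It assumes two full-genome sequences that have already been aligned to the same length and uses a simple uncorrected p-distance; published analyses typically rely on model-based evolutionary distances and phylogenetic trees, so this is an illustration only. The function names `p_distance` and `classify_divergence` are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Return the percentage of differing sites between two aligned sequences.

    Assumption: both sequences come from the same alignment. Sites where
    either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps and ambiguity codes
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


def classify_divergence(percent: float) -> str:
    """Apply the 4 % and 8 % full-genome thresholds described in the text."""
    if percent > 8.0:
        return "different genotypes"
    if percent > 4.0:
        return "same genotype, different subgenotypes"
    return "same subgenotype"
```

For example, `classify_divergence(p_distance(genome_1, genome_2))` labels a pair of aligned genomes; a pair that differs at 5 % of sites would fall in the subgenotype range.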
Some points should be considered when classifying HBV strains into subgenotypes [16]. (1) Analysis of the full length of the genome, including entire ORFs, at the nucleotide level is a prerequisite for determining subgenotypes accurately. (2) The range of intra-genotypic nucleotide divergence (more than 4.5 % and less than 7.5 %) that defines distinct subgenotypes should be observed. (3) Bootstrap values greater than 75 % are required to support the monophyletic tree for introducing a cluster as an independent subgenotype. (4) Recombinant strains should be excluded from any subgenotyping analysis as far as possible, as these can disrupt the topology of a phylogenetic tree and can falsely increase nucleotide divergence. (5) To introduce novel subgenotypes, strains harboring specific nucleotide and amino acid motifs should be identified. (6) To avoid sampling bias, a minimum of three purported novel strains, together with all available subgenotype strains from the same genotype, should be subjected to evolutionary and phylogenetic analysis. Using random reference sequences, as opposed to selecting some particular reference sequence, is highly recommended for subgenotyping by phylogenetic analysis.
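The six points above can be read as a checklist. The sketch below encodes them for a candidate cluster, using the numerical limits given in the text (divergence between 4.5 % and 7.5 %, bootstrap support above 75 %, at least three novel strains). The `CandidateCluster` fields and the `unmet_criteria` function are hypothetical names introduced for illustration; the values themselves would have to come from a separate phylogenetic analysis.

```python
from dataclasses import dataclass


@dataclass
class CandidateCluster:
    """Summary of a proposed subgenotype (all field names are illustrative)."""
    uses_full_genome: bool          # point 1: full-length genome analyzed
    mean_divergence_percent: float  # point 2: divergence from other subgenotypes
    bootstrap_percent: float        # point 3: support for the monophyletic cluster
    contains_recombinants: bool     # point 4: recombinant strains still included
    has_specific_motifs: bool       # point 5: characteristic nt/aa motifs found
    n_novel_strains: int            # point 6: number of purported novel strains


def unmet_criteria(cluster: CandidateCluster) -> list[str]:
    """Return a description of each criterion the cluster fails to meet."""
    problems = []
    if not cluster.uses_full_genome:
        problems.append("full-length genome analysis required")
    if not (4.5 < cluster.mean_divergence_percent < 7.5):
        problems.append("divergence outside the 4.5-7.5 % range")
    if cluster.bootstrap_percent <= 75.0:
        problems.append("bootstrap support not above 75 %")
    if cluster.contains_recombinants:
        problems.append("recombinant strains should be excluded")
    if not cluster.has_specific_motifs:
        problems.append("no specific nucleotide or amino acid motifs")
    if cluster.n_novel_strains < 3:
        problems.append("fewer than three novel strains")
    return problems
```

An empty list means that none of the listed criteria is violated; it does not replace the recommended comparison against all available strains of the same genotype.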
Distribution of HBV Genotypes
HBV genotypes have a distinct geographic distribution (Fig. 3.4). Genotypes A and D are seen frequently in the USA and Europe, while genotypes B and C are the most common in Asia. Genotype E has been reported exclusively from West Africa, and genotype F is reported to cluster in Central America [17].
Fig. 3.4
Distribution of hepatitis B virus (HBV) genotypes. HBV genotypes have a distinct geographic distribution. Genotypes A and D are seen frequently in the USA and Europe, while genotypes B and C are the most common in Asia. Genotype E is reported exclusively from West Africa, and genotype F clusters in Central America [17]. Cited from Miyakawa and Mizokami [17] with publisher’s permission
S Region Mutation and HBV Vaccination
The mutation rate of the HBV genome is 10,000 times higher than that of the human genome, and HBV infects humans as a quasispecies. The first-generation HBV vaccine that contained polyclonal HBV antibodies could prevent HBV infection; however, it is possible that the second-generation HBV vaccine, which is made using genetic engineering and contains an antigen from a monoclonal sequence, might fail to prevent HBV infection. The existence of a vaccine-induced escape mutant (VEM) was first reported by Carman et al. in 1990 [18, 19]. The sequence from amino acid (a.a.) 111 to a.a. 156 of the S region of HBV (the so-called α-loop, Fig. 3.5), which is consistently exposed to humoral and cellular immunity, was shown to be frequently substituted [20–23]. However, it is controversial whether VEM increases the rate of infection in individuals vaccinated against HBV or whether HBV vaccination could prevent HBV infection even if VEM is detected in the source of infection. The significance of VEM in HBV vaccination has not been fully elucidated, and further studies are required for the production of a safe and reliable HBV vaccine [24, 25].
Fig. 3.5
α-Loop structure of the S region of HBV and escape mutations. The amino acid mutations Thr126 to Asn126, Gln129 to His129, Pro142 to Leu142 or Ser142, Asp144 to Ala144, and Gly145 to Arg145 are shown above, either solitary or in combination with the induced escape mutant
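As an illustration of how the substitutions listed in Fig. 3.5 might be screened for, the following sketch checks an S-protein amino acid sequence for the mutant residues at positions 126, 129, 142, 144, and 145. It assumes the input is an ungapped sequence numbered from residue 1 of the S region; the table `ESCAPE_MUTATIONS` and the function `find_escape_mutations` are hypothetical names, and the list covers only the substitutions named in the figure legend.

```python
# Mutant residues from the Fig. 3.5 legend, keyed by S-region position.
# Each entry maps the mutant amino acid to a conventional label.
ESCAPE_MUTATIONS = {
    126: {"N": "T126N"},
    129: {"H": "Q129H"},
    142: {"L": "P142L", "S": "P142S"},
    144: {"A": "D144A"},
    145: {"R": "G145R"},
}


def find_escape_mutations(s_protein: str) -> list[str]:
    """Return labels of listed escape substitutions present in the sequence.

    Assumption: s_protein is ungapped and its first character is residue 1.
    Positions beyond the end of the sequence are silently skipped.
    """
    found = []
    for position in sorted(ESCAPE_MUTATIONS):
        if position > len(s_protein):
            continue
        residue = s_protein[position - 1].upper()
        label = ESCAPE_MUTATIONS[position].get(residue)
        if label is not None:
            found.append(label)
    return found
```

A sequence carrying arginine at position 145 would be reported as `["G145R"]`; a sequence with none of the listed residues yields an empty list.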
Stramer et al. [6] reported six individuals who became infected with HBV despite having previously received HBV vaccination. Moreover, five of the six individuals were infected with a non-A2 HBV genotype, although HBV genotype A2 is the primary strain used for producing the HBV vaccine in the USA. Furthermore, HBV was also found in the partners of four of the six individuals. In Taiwan, Lai et al. [26] reported a follow-up survey of children who had received the HBV vaccine immediately after birth, in which HBsAg, anti-HBc, and HBV DNA were investigated. The data showed that HBsAg, anti-HBc, and HBV DNA were more frequently present in the group aged 18 years and older than in the younger groups. Thus, the present HBV vaccine cannot completely prevent HBV infection, especially when the virus is sexually transmitted. These findings strongly suggest that the HBV genotype should be studied in detail, especially because it differs geographically.
Characteristic Mutations in HBV Genotypes
The HBV genotypes are determined by molecular phylogenetic analysis that takes into account mutations in the entire genome and includes an examination of the various sequences that reveal characteristic mutations in each genotype. In particular, mutations in a sequence known as the core promoter impact the effectiveness of viral replication. In this section, we provide an overview of the mutations observed in this region.
The subgenotype A1 virus has characteristic mutations in the Kozak (1809T/12T) and epsilon sequences (1862T/88A) that immediately precede the open reading frame (ORF) for the HBe antigen [27]. In a basic study, it was found that the presence of the 1809T/12T mutation decreased antigen production and suppressed replication in one case. The presence of the 1862T/88A mutation also reportedly inhibits viral replication and decreases core protein production [28]. These characteristics were especially observed in subgenotype A1.
Genotypes B and C have similar characteristics, including a well-known core promoter mutation (1762T/64A) [29]. Mutations in this region are known to result in long-term infections, and new infections with a virus carrying such mutations are also reported to carry the risk of fulminant hepatitis [29]. Studies have also shown that such mutations are associated with liver cancer [30]. Basic studies have shown that the 1762T/64A core promoter mutation enhances viral replication.
In genotype D, the mutation pattern around the core promoter varies among subgenotypes. HBV/D1 carries a different core promoter mutation (1764T/66G) from that in genotypes B and C [31]. In terms of virological properties, the mutated sequence is thought to bind transcription factors different from those that bind to 1762T/64A. Although viral replication is enhanced, at present there is no evidence of an association with liver cancer, unlike HBV/B and C. In subgenotypes such as HBV/D2, the key 1762T/64A mutation is observed frequently, similar to HBV/B and C.
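The mutations in this section are written in a compact form such as 1762T/64A. A small sketch of how this shorthand could be expanded into explicit nucleotide positions is given below. It assumes that the digits after the slash replace the trailing digits of the first position (so that 1762T/64A denotes 1762T together with 1764A); this reading of the notation, and the function name `expand_mutation_shorthand`, are assumptions made for illustration.

```python
import re


def expand_mutation_shorthand(shorthand: str) -> list[tuple[int, str]]:
    """Expand e.g. '1762T/64A' into [(1762, 'T'), (1764, 'A')].

    Assumption: each abbreviated number after a slash replaces the same
    number of trailing digits of the first, fully written position.
    """
    parts = shorthand.split("/")
    first = re.fullmatch(r"(\d+)([ACGT])", parts[0])
    if first is None:
        raise ValueError(f"cannot parse {parts[0]!r}")
    base_position = first.group(1)
    result = [(int(base_position), first.group(2))]
    for part in parts[1:]:
        match = re.fullmatch(r"(\d+)([ACGT])", part)
        if match is None:
            raise ValueError(f"cannot parse {part!r}")
        digits = match.group(1)
        if len(digits) >= len(base_position):
            full = digits  # already written in full
        else:
            full = base_position[: -len(digits)] + digits
        result.append((int(full), match.group(2)))
    return result
```

Under this assumption, the four patterns mentioned in the text expand to positions 1809 and 1812, 1862 and 1888, 1762 and 1764, and 1764 and 1766, respectively.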
Characteristics of HBV Genotypes
The characteristics of the HBV genotypes are given in Table 3.1.
Table 3.1
Comparison of virological characteristics of HBV genotypes [14]
Genotype | Length (bp) | Differentiating features | Subgenotypes | Serotype |
---|---|---|---|---|
A | 3,221 | 6-nucleotide insert at carboxy end of core region | A1, A2, A3 (A3, A4, A5), A4 (A6) | adw2/ayw2, adw2, ayw1, ayw1, adw4 |
B | 3,215 | B1 (Bj), the subgenotype without recombination with genotype C in the precore/core region, is distributed in Japan | B1, B2, B3 (B3, B5, B7-9), B6, B4, B5 (B6) | adw2, adw2, adw2, ayw1/adw2, adw2, adr |
C | 3,215 | Presumed to be the oldest HBV genotype [32] | C1, C2, C3, C4, C5, C6-C12, C13-C15, C16 | adr, adr, adr, ayw2/ayw3, adw2, adr, adr, ayr |
D | 3,182 | 33-nucleotide deletion at the amino terminus of the preS1 region | D1, D2, D3, D4, D5, D6 | ayw2, ayw3, ayw2/ayw3, ayw2, ayw3/ayw2, ayw2, ayw4/adw3 |
E | 3,212 | 3-nucleotide deletion at the amino terminus of the preS1 region | | ayw4 |
F | 3,215 | Intra-genotypic diversity is the highest | F1, F2, F3, F4 | adw4, adw4, adw4, adw4 |
G | 3,248 | 36-nucleotide insert in the core region; 3-nucleotide deletion at the amino terminus of the preS1 region; two stop codons at positions 2 and 28 of the precore region | | adw2 |
H | 3,245 | Closely related to genotype F | | adw4 |
I | 3,215 | Genotype A, C, G recombination | I1, I2 | adw2, ayw2 |
J | 3,182 | 33-nucleotide deletion at the amino terminus of the preS1 region | | ayw3 |
Genotype A
Genotype A is characterized by an insertion of six nucleotides at the carboxyl end of the core gene. Genotype A is dominant in Northwest Europe and North America. Additionally, some strains of genotype A have been found in the Philippines, Hong Kong, and in some parts of Africa and Asia. Subgenotype A2 is dominant in Europe; A1 is prevalent in Asia and most of Africa; A3 is found in Cameroon and Gambia; and A6 (currently named A4) and quasi-subgenotype A3 (which includes the strains previously named A4 from Mali, A5 from Nigeria, and A7 from Cameroon) have been isolated from other regions [32].
Genotype B
Genotype B is distributed throughout Asia and has been classified into nine subgenotypes to date. B1(Bj), the subgenotype without recombination with genotype C in the precore/core region, is distributed in Japan [33, 34]. Genotype B is mainly prevalent in Southeast Asia, but can also be found in the Pacific islands. Subgenotype B5, obtained from a Canadian Inuit population [35], represents genotype B without recombination with genotype C in the precore/core region, as opposed to the other subgenotypes of B, which do show this recombination [33]. Subgenotype B1 is the most likely ancestor of B5, which was possibly carried by the indigenous peoples during migration from Siberia and Alaska to North America and Greenland [36, 37].
Genotype C
Genotype C is mainly prevalent in Southeast Asia, but can also be found in the Pacific islands [38]. According to Paraskevis et al. [39], genotype C is the oldest HBV genotype. It has the highest number of subgenotypes, C1–C16 [40, 41], reflecting the long duration of being endemic in humans. A large number of these subgenotypes circulate in Indonesia [40]. Subgenotype C4 is exclusively found in the indigenous people of northern Australia [42], who are descended from a founder group that emigrated from Africa at least 50,000 years ago [43].
Genotype D
Genotype D is the genotype most widely distributed globally. It is found in northeastern Europe, the eastern and central Mediterranean, northern Africa, and the Middle East. Furthermore, it is highly prevalent in the Indian subcontinent and in a group of islands in the Indian Ocean with high endemic levels of HBV (Nicobar and Andaman) [44], and has also been identified in Oceania [43]. Nine HBV/D subgenotypes (D1–D9) have been described to date [45]. D1 is the most prevalent subgenotype in Greece, Turkey, and North Africa [46, 47]; D2 in northeastern Europe (Russia, Belarus, and Estonia) and Albania [48, 49]; and D3 in Italy and Serbia [50, 51]. D4 is the dominant subgenotype in Oceania [40], D5 in primitive tribes living in India, where a number of different D subgenotypes are also found [52], D6 in Papua New Guinea and Indonesia [53], and D7 in Tunisia and Morocco [54, 55]. Finally, the recently described D8 and D9 subgenotypes are found in Nigeria and India.
Genotype E
Genotype E is characterized by a three-nucleotide deletion in the preS1 region. Genotype E is mainly dominant in West Africa [56]. Genotype E is rarely found outside of Africa, except in individuals of African descent. Although it is found over a large geographical area, it is interesting to note that it has a very low degree of genetic diversity: the isolates studied by means of phylogenetic analysis do not segregate into distinct subgenotypes, but are included in a single monophyletic group [57]. This observation suggests that it has a relatively recent evolutionary history among humans and, despite the forced immigration of West African slaves [57], the absence of any significant spread among Afro-Americans indicates that it was probably rare in West Africa at the time of the slave trade and before the nineteenth century. The only documented finding of its presence in America is a report by Alvarado et al., who identified nine HBV-infected individuals carrying genotype E in 2010 in the relatively isolated Afro-American community of Quibdó, Colombia [58]. All of these strains were identified by means of their two-nucleotide synapomorphy in the S region, thus forming a highly significant monophyletic group.
Genotype F
Genotype F is indigenous to America, and is the most prevalent HBV genotype in Central and South America, and among the Amerindians of the Amazon basin [59, 60]. Genotype F is classified into four subgenotypes (F1–F4), which are further subdivided into different clades [43]. F1 is highly prevalent in Central America, Alaska, and southeast America [61, 62]; F2 is highly prevalent in Venezuela, and is also present in Brazil [61]; F3 is present in central (Panama) and northern Latin America (Colombia and Venezuela); and F4 is present in Bolivia and Argentina [63]. The presence of HBV/F among the Amerindian population suggests the long evolution of this strain. In the study by Alvarado et al. on the molecular epidemiology and evolutionary dynamics of HBV/F in Colombia [64], it was found that HBV/F3 was the most prevalent subgenotype in Colombia, and its origin was suggested to be in Venezuela. This is probably the oldest F subgenotype, as it is closely related to genotype H [61, 64].
Genotype G
In 2000, genotype G was defined as the seventh HBV genotype from a strain isolated from a French patient [11]. Genotype G harbors a 36-nucleotide insertion in the core region and has a genome length of 3,248 base pairs. The HBe antigen was detected in the sera of individuals infected with genotype G, despite the presence of two stop codons in the precore region, which should not allow production of HBe antigen [65]. Stuyver et al. proposed that genotype G might have a unique mechanism that allowed production of the HBe antigen. However, we revealed that the sera of all the individuals infected with genotype G showed coinfection with genotype A [66]. Thus, the HBe antigen in the sera of individuals infected with genotype G was produced by the coinfecting genotype A HBV, which does not have a stop codon in its precore region. Much evidence has since accumulated showing that the association of genotype G with coinfection by HBV of other genotypes is not exceptional [67, 68]. Genotype G has been identified frequently in homosexual men, and demonstrates very low genome diversity.
Genotype H
Phylogenetic analysis has demonstrated that genotype H is closely related to genotype F. Genotype H is prevalent in Mexico in both the indigenous populations and the mestizos (individuals of mixed descent), suggesting that this genotype has a long history among the descendants of the Aztecs, preceding the arrival of Europeans [12, 69]. Considering that genotype F demonstrates a wide range of diversity, it has been proposed that genotype H should be classified as a subgenotype of genotype F [70].
Genotype I
In 2008, sequence analysis of the complete genome of a single isolate from a Vietnamese male showed that it was closely related to three previously described “aberrant” Vietnamese strains [15, 71], and a ninth genotype, I, was proposed [13]. This proposal was not accepted initially, because the mean genetic divergence of these four strains from genotype C was 7 % and the recombination analysis was not robust [72]. Subsequently, sequences derived from isolates obtained from Laos [73], the Idu Mishmi tribe in northeast India [74], a Canadian of Vietnamese descent [75], and China [76] have expanded the number of these sequences. The nucleotide divergence of most of these sequences relative to genotype C was at least 7.5 %, with good bootstrap support for the group, thus meeting the criteria for genotype assignment [77]. Two subgenotypes, I1 and I2, with serological subtypes adw2 and ayw2, respectively, were described [73]. Genotype I is a recombinant of genotypes A/C/G and an indeterminate genotype [73–76], which clusters close to genotype C when the complete genome is analyzed, and with genotype A in the polymerase gene region [76]. The genotype A and C regions are closely related to subgenotypes A3 and C3, respectively [73–76].