The Variability of Amino Acid Sequences in Hepatitis B Virus
Hepatitis B virus (HBV) is an important human pathogen belonging to the genus Orthohepadnavirus of the family Hepadnaviridae. Over 240 million people are infected with HBV worldwide. Reverse transcription during genome replication results in low-fidelity DNA synthesis, which is the source of variability in the viral proteins. To investigate this variability quantitatively, we retrieved the amino acid sequences from 5,167 records covering all available HBV genotypes (A–J) in the GenBank database. The amino acid sequences encoded by the open reading frames (ORFs) S, C, P and X of the HBV genome were extracted and aligned. We analyzed the variability of protein lengths and sequences as well as the frequencies of amino acids. This analysis comprehensively characterized the variability and conservation of HBV proteins at the amino acid level. In particular, the structural proteins, the hepatitis B surface antigens (HBsAg), contain potential sites critical for virus assembly and immune recognition. Interestingly, the preS1 domain of HBsAg was variable at some amino acid positions, which provides a potential mechanism of immune escape for HBV, while the preS2 and S domains were conserved in sequence length. In the S domain, the cysteine residues and the alpha-helix and beta-sheet secondary structures were likely critical for the stable folding of all HBsAg components. The preC domain and the C-terminal domain of the core protein were also highly conserved. In contrast, the polymerase (HBpol) and HBx were highly variable at the amino acid level. Our research provides a basis for understanding the conserved and important domains of HBV proteins, which could be potential targets for antiviral therapy.
Keywords: Hepatitis B virus (HBV); Amino acid; Sequence characterization; Variability and conservation
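The abstract does not state which measure of variability was applied to the aligned sequences. As an illustrative sketch only, the following Python code scores each alignment column by Shannon entropy, a commonly used per-position measure, and tallies overall amino acid frequencies. The function names, the gap symbol `-`, and the toy alignment are assumptions introduced here for illustration and are not taken from the study.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    # sum of p * log2(1/p) over the residues observed in this column
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_variability(aligned):
    """Per-position entropy for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*aligned)]


def residue_frequencies(aligned):
    """Relative frequency of each amino acid over the whole alignment."""
    counts = Counter(aa for seq in aligned for aa in seq if aa != "-")
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


# Hypothetical toy alignment (not real HBV data); '-' marks a gap.
toy = ["MGQN", "MGTN", "MG-N", "MGQN"]
scores = alignment_variability(toy)
freqs = residue_frequencies(toy)
```

Under this measure, a fully conserved column scores 0 bits and the score rises as more residue types share a position, so conserved and variable positions can be ranked directly from the alignment.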
The authors would like to thank Prof. Ping Zhu (Institute of Biophysics, Chinese Academy of Sciences) and Prof. Jingqiang Zhang (Sun Yat-sen University) for their help with this research. This work was partially supported by the National Natural Science Foundation of China (Nos. U1611265, 81773271 and 31672536) and the Key Projects of Department of Education of Guangdong Province (No. 2017KZDXM088). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
JC, SL and YX designed the study. JC conducted the computational work. JC and YX performed the data analysis. JC, SL and YX wrote the manuscript draft.
Compliance with Ethical Standards
Conflict of interest
The authors declare that they have no conflict of interest.
Animal and Human Rights Statement
This article does not contain any studies with human or animal subjects performed by any of the authors.