Abstract
AIDS is caused by HIV, which is divided into two types: HIV-1 and HIV-2. HIV-1 is distributed around the world and is the major cause of global infections, whereas HIV-2 is less infectious and less transmissible and is therefore largely confined to West Africa. This research aims to account for that difference by analyzing the genome sequences of HIV-1 and HIV-2 with three methods: the Apriori algorithm, a decision tree, and a support vector machine (SVM). The Apriori algorithm indicates that lysine, arginine, and serine are the typical amino acids of HIV-1, while glycine, lysine, leucine, and arginine are typical of HIV-2. The decision tree identifies the amino acid positions that best distinguish the two viruses: position 5 in the window of size 9, position 13 in the window of size 13, and position 16 in the window of size 19. The SVM results indicate that the two viruses appear similar but are in fact different. Together, the results provide a biologically verifiable basis for developing effective HIV vaccines, especially for HIV-2.
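As a rough illustration of the kind of windowed analysis the abstract describes, the sketch below cuts a protein sequence into fixed-size windows and keeps the amino acids that occur in most of them. This corresponds only to the first pass of Apriori (frequent single items), not to the authors' full pipeline. The function names, the toy sequence, the support threshold, and the reading of "window" as a sliding window of residues are all assumptions made for illustration.

```python
from collections import Counter


def windows(seq, size):
    """Return every contiguous window of `size` residues in `seq`."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def frequent_residues(seq, size, min_support):
    """First Apriori pass: residues found in at least `min_support` of the windows.

    Returns a dict mapping each frequent residue to its support
    (the fraction of windows that contain it at least once).
    """
    wins = windows(seq, size)
    counts = Counter()
    for w in wins:
        counts.update(set(w))  # count each residue once per window
    return {
        aa: n / len(wins)
        for aa, n in counts.items()
        if n / len(wins) >= min_support
    }


# Hypothetical toy sequence, not real HIV data.
toy = "MKRSKKRSLG"
support = frequent_residues(toy, size=9, min_support=0.9)
print(sorted(support))
```

A full Apriori run would go on to combine the frequent residues into larger candidate itemsets and prune those that fall below the support threshold. The sketch stops at single residues to keep the idea visible.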
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Roh, Y., Yoon, S., Lee, M.Y., Jang, S., Yoon, T. (2016). Analysis and Comparison of Genomes of HIV-1 and HIV-2 Using Apriori Algorithm, Decision Tree, and Support Vector Machine. In: Huang, DS., Bevilacqua, V., Premaratne, P. (eds) Intelligent Computing Theories and Application. ICIC 2016. Lecture Notes in Computer Science, vol 9771. Springer, Cham. https://doi.org/10.1007/978-3-319-42291-6_39
DOI: https://doi.org/10.1007/978-3-319-42291-6_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42290-9
Online ISBN: 978-3-319-42291-6
eBook Packages: Computer Science, Computer Science (R0)