Abstract
Viral sequence classification has widespread applications in structural and functional categorization and in clinical and epidemiological studies. Most approaches to subtyping and classification depend on an initial alignment step to assess a similarity score, followed by distance-based phylogenetic or statistical algorithms. We observe that the interval distributions of nucleotide(s) over a sequence have potential for sequence comparison, and we devise an algorithm that determines a similarity/dissimilarity score between pairs of sequences. Subtype classification of HIV sequences by this method agrees exactly with their biological taxonomy.
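The abstract does not spell out how the interval distributions are built or compared, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that an "interval" is the gap between successive occurrences of a single nucleotide, and that two distributions are compared with the Bhattacharyya distance; both choices, and the function names `interval_distribution`, `bhattacharyya_distance`, and `dissimilarity`, are illustrative assumptions.

```python
from collections import Counter
from math import log, sqrt


def interval_distribution(seq, symbol):
    """Relative frequency of gaps between successive occurrences of `symbol`.

    Assumption: an interval is the index difference between two
    consecutive occurrences of the same nucleotide.
    """
    positions = [i for i, ch in enumerate(seq.upper()) if ch == symbol]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    counts = Counter(gaps)
    total = sum(counts.values())
    return {gap: n / total for gap, n in counts.items()} if total else {}


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions (dicts).

    Assumption: this measure stands in for whatever score the paper uses.
    Returns infinity when the distributions share no interval length.
    """
    bc = sum(sqrt(p[k] * q[k]) for k in p.keys() & q.keys())
    bc = min(1.0, bc)  # guard against floating-point overshoot
    return max(0.0, -log(bc)) if bc > 0 else float("inf")


def dissimilarity(seq_a, seq_b, alphabet="ACGT"):
    """Average per-nucleotide distance between two unaligned sequences."""
    distances = [
        bhattacharyya_distance(
            interval_distribution(seq_a, s), interval_distribution(seq_b, s)
        )
        for s in alphabet
    ]
    return sum(distances) / len(distances)
```

No alignment is involved: each sequence is reduced to one gap histogram per nucleotide, so sequences of different lengths can be compared directly. A pairwise matrix of such scores could then be passed to any standard clustering or tree-building method to group sequences into subtypes.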
© 2018 Springer Nature Singapore Pte Ltd.
Mitra, U., Bhattacharyya, B. (2018). Alignment-Independent Sequence Analysis Based on Interval Distribution: Application to Subtyping and Classification of Viral Sequences. In: Bhattacharyya, S., Sen, S., Dutta, M., Biswas, P., Chattopadhyay, H. (eds) Industry Interactive Innovations in Science, Engineering and Technology. Lecture Notes in Networks and Systems, vol 11. Springer, Singapore. https://doi.org/10.1007/978-981-10-3953-9_48
DOI: https://doi.org/10.1007/978-981-10-3953-9_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3952-2
Online ISBN: 978-981-10-3953-9
eBook Packages: Engineering, Engineering (R0)