Abstract
In this paper, we propose two four-base related 2D curves of DNA primary sequences (termed F-B curves) and their corresponding single-base related 2D curves (termed A-related, G-related, T-related and C-related curves). The curves are constructed by assigning each base to one of four different sinusoidal (or tangent) functions. Connecting all the points on these four functions yields the F-B curves; connecting the points on each function separately yields the single-base related 2D curves. The proposed 2D curves are all strictly non-degenerate. An 8-component characteristic vector, based on the normalized geometric centers of the proposed curves, is then constructed to compare the similarity of DNA sequences from different species. As examples, we examine the similarity of the coding sequences of the first exon of the beta-globin gene from eleven species, the similarity of cDNA sequences of the beta-globin gene from eight species, and the similarity of the whole mitochondrial genomes of 18 eutherian mammals. The experimental results demonstrate the effectiveness of the proposed method.
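The construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual functions, so everything specific here is an assumption: the four sine functions differ only by a vertical offset (`BASE_OFFSETS`, hypothetical values), the x-coordinate is the base position, and the 8 components are taken to be one normalized (x, y) center per single-base curve. The names `fb_curve`, `single_base_curve`, `characteristic_vector` and `distance` are illustrative and do not come from the paper.

```python
import math

# Hypothetical assignment: each base gets its own sine function.
# The vertical offsets are illustrative only and are not taken from the paper.
BASE_OFFSETS = {"A": 0.0, "G": 3.0, "T": 6.0, "C": 9.0}


def fb_curve(seq):
    """Four-base curve: point i is (i, sin(i) + offset of the i-th base).

    Because x increases with every base, the curve never retraces itself.
    Only the letters A, G, T, C are handled; anything else raises KeyError.
    """
    return [(i, math.sin(i) + BASE_OFFSETS[b])
            for i, b in enumerate(seq.upper(), start=1)]


def single_base_curve(seq, base):
    """Keep only the points of the four-base curve that belong to one base."""
    return [p for p, b in zip(fb_curve(seq), seq.upper()) if b == base]


def characteristic_vector(seq):
    """Assumed layout: a normalized (x, y) center for each of A, G, T, C."""
    n = len(seq)
    vec = []
    for base in "AGTC":
        pts = single_base_curve(seq, base)
        if not pts:
            vec.extend([0.0, 0.0])  # base absent from the sequence
            continue
        cx = sum(x for x, _ in pts) / len(pts) / n  # divide by n so lengths are comparable
        cy = sum(y for _, y in pts) / len(pts)
        vec.extend([cx, cy])
    return vec


def distance(seq1, seq2):
    """Euclidean distance between the two characteristic vectors."""
    return math.dist(characteristic_vector(seq1), characteristic_vector(seq2))
```

Under these assumptions, a smaller `distance` value would indicate more similar sequences; the paper's own normalization and similarity measure may differ.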
Change history
05 June 2018
In the original publication of the article, the y axis labels present in Figs. 1a and 2a are incorrect. The correct Figs. 1a and 2a are provided here.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grants 61103138 and 61702163); the Fundamental Research Funds for the Henan Provincial Colleges and Universities in Henan University of Technology (2016RCJH06); the National Key Research and Development Program (2016YFD0400104-5); and the Henan International Cooperation Project (Grant 152102410036).
Cite this article
Xie, GS., Jin, XB., Yang, C. et al. Graphical Representation and Similarity Analysis of DNA Sequences Based on Trigonometric Functions. Acta Biotheor 66, 113–133 (2018). https://doi.org/10.1007/s10441-018-9324-0
DOI: https://doi.org/10.1007/s10441-018-9324-0