Evaluation of Chaos Game Representation for Comparison of DNA Sequences

  • André R. S. Marcal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11255)

Abstract

Chaos Game Representation (CGR) of DNA sequences has been used both for visual representation and for alignment-free comparison. CGR is considered to be of great value because the images obtained from parts of a genome present the same structure as those obtained for the whole genome. However, the robustness of the CGR method for comparing DNA sequences obtained in a variety of scenarios has not yet been fully demonstrated. This paper addresses this issue by presenting a method to evaluate the potential of CGR to distinguish the various classes in a DNA dataset. Two indices are proposed for this purpose: a rejection rate (\(\alpha \)) and an overlapping rate (\(\beta \)). The method was applied to 4 datasets, each with between 31 and 400 classes. Nearly 430 million pairs of DNA sequences were compared using the CGR.
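
For readers unfamiliar with CGR, the construction can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper. The corner assignment, the handling of ambiguous symbols, the grid resolution `k`, and the function names `cgr_points` and `cgr_image` are all assumptions made for illustration. Each nucleotide moves the current point halfway towards its assigned corner of the unit square, and the resulting points are then counted on a grid to form an image.

```python
# Corner assignment is an assumption (a commonly used convention);
# the paper may use a different one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the CGR points of a DNA sequence as a list of (x, y) tuples."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # assumption: skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_image(sequence, k=3):
    """Count CGR points on a 2**k x 2**k grid (rows index y, columns index x)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(sequence):
        # min() guards against a coordinate rounding up to exactly 1.0.
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in cgr_image("ACGTACGTTTGACA", k=2):
        print(row)
```

Two sequences can then be compared through their grids, for example with a distance between the count matrices. The choice of distance and the definitions of the two indices are not given in the abstract, so they are left out of this sketch.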

Keywords

Chaos game representation, Discrete Fourier Transform, Fractals

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Departamento de Matemática, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
