Evaluation of Chaos Game Representation for Comparison of DNA Sequences
Chaos Game Representation (CGR) of DNA sequences has been used both for visual representation and for alignment-free comparison. CGR is considered valuable because the images obtained from parts of a genome show the same structure as those obtained from the whole genome. However, the robustness of CGR for comparing DNA sequences obtained in a variety of scenarios has not yet been fully demonstrated. This paper addresses this issue by presenting a method to evaluate the potential of CGR to distinguish the classes in a DNA dataset. Two indices are proposed for this purpose: a rejection rate (\(\alpha \)) and an overlapping rate (\(\beta \)). The method was applied to 4 datasets, each containing between 31 and 400 classes. Nearly 430 million pairs of DNA sequences were compared using CGR.
Keywords: Chaos game representation, Discrete Fourier Transform, Fractals
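For readers unfamiliar with CGR, the following is a minimal sketch of the usual construction, not the implementation evaluated in this paper. Each nucleotide is assigned a corner of the unit square, and each successive point is placed halfway between the previous point and the corner of the current nucleotide. The corner assignment, the starting point (0.5, 0.5), the handling of ambiguous symbols, and the function names `cgr_points` and `cgr_grid` are all assumptions made for illustration; conventions vary between publications.

```python
# Assumed corner layout; other papers use different assignments.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the list of CGR coordinates for a DNA sequence."""
    x, y = 0.5, 0.5  # assumed starting point: centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # assumption: skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(seq, k=3):
    """Count CGR points in a 2^k x 2^k grid (a simple CGR 'image')."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        # Clamp the index in case floating-point rounding yields exactly 1.0.
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid
```

Calling `cgr_grid(sequence, k)` gives a fixed-size count matrix regardless of sequence length, which is what makes this kind of representation convenient for alignment-free comparison. Note that this sketch counts every point, including the first few, whose positions still depend on the starting point.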