Using PQ Trees for Comparative Genomics
Permutations on strings representing gene clusters on genomes have been studied earlier in [3, 12, 14, 17, 18], and the idea of a maximal permutation pattern has also been introduced previously. In this paper, we present a new tool for representing and detecting gene clusters in multiple genomes, using PQ trees: a PQ tree describes the inner structure of clusters and the relations between them succinctly, aids in filtering meaningful clusters from apparently meaningless ones, and gives a natural and meaningful way of visualizing complex clusters. We identify a minimal consensus PQ tree and prove that it is equivalent to a maximal πpattern and that each subgraph of the PQ tree corresponds to a non-maximal permutation pattern. We present a general scheme to handle multiplicity in permutations and also give a linear time algorithm to construct the minimal consensus PQ tree. Further, we demonstrate the results on whole genome data sets. In our analysis of the whole genomes of human and rat, we found about 1.5 million common gene clusters but only about 500 minimal consensus PQ trees; with the E. coli K-12 and B. subtilis genomes, we found only about 450 minimal consensus PQ trees out of about 15,000 gene clusters. Further, we show specific instances of functionally related genes in the two cases.
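To make the notion of a common gene cluster concrete, the following is a minimal sketch, assuming each genome is a permutation of the same genes with no repeats (the paper's scheme for multiplicity is not covered). It lists every gene set that occupies a contiguous block in both genomes. The function name `common_intervals` and the quadratic scan are illustrative assumptions, not the linear time construction described in the abstract.

```python
def common_intervals(a, b):
    """Return gene sets (size >= 2) that are contiguous in both a and b.

    Assumes a and b are permutations of the same distinct genes.
    Naive O(n^2) scan for illustration only.
    """
    pos = {gene: i for i, gene in enumerate(b)}  # position of each gene in b
    n = len(a)
    found = []
    for i in range(n):
        lo = hi = pos[a[i]]
        for j in range(i + 1, n):
            p = pos[a[j]]
            lo = min(lo, p)
            hi = max(hi, p)
            # a[i..j] is contiguous in b exactly when its positions in b
            # span a window of the same length.
            if hi - lo == j - i:
                found.append(frozenset(a[i:j + 1]))
    return found


if __name__ == "__main__":
    genome_1 = list("abcde")
    genome_2 = list("cbaed")
    for cluster in common_intervals(genome_1, genome_2):
        print(sorted(cluster))
```

Even this small pair of genomes yields several overlapping and nested clusters, which suggests why a single tree that summarizes them all can be far more compact than the raw list of clusters.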
Keywords: Pattern discovery, data mining, clusters, patterns, motifs, permutation patterns, PQ trees, comparative genomics, whole genome analysis, evolutionary analysis
- 9. Bryan, S.K., Hagensee, M.E., Moses, R.E.: DNA Polymerase III Requirement for Repair of DNA Damage Caused by Methyl Methanesulfonate and Hydrogen Peroxide. Journal of Bacteriology 169(10), 4608–4613 (1987)
- 13. He, X., Goldwasser, M.H.: Identifying conserved gene clusters in the presence of orthologous groups. In: Proceedings of the Eighth Annual International Conference on Research in Computational Molecular Biology (RECOMB), pp. 272–280 (2004)
- 15. McConnell, R.M.: A certifying algorithm for the consecutive-ones property. In: Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), vol. 15, pp. 761–770 (2004)