Using PQ Trees for Comparative Genomics

Landau, Gad M.; Parida, Laxmi; Weimann, Oren

doi:10.1007/11496656_12

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3537)

Abstract

Permutations on strings representing gene clusters on genomes have been studied earlier in [3, 12, 14, 17, 18], and the idea of a maximal permutation pattern was introduced in [12]. In this paper, we present a new tool for representing and detecting gene clusters in multiple genomes, using PQ trees [6]. The tool describes the inner structure of clusters and the relations between them succinctly, helps separate meaningful clusters from apparently meaningless ones, and gives a natural way of visualizing complex clusters. We identify a minimal consensus PQ tree and prove that it is equivalent to a maximal π pattern [12] and that each subgraph of the PQ tree corresponds to a non-maximal permutation pattern. We present a general scheme to handle multiplicity in permutations and also give a linear time algorithm to construct the minimal consensus PQ tree. Further, we demonstrate the results on whole genome data sets. In our analysis of the whole genomes of human and rat we found about 1.5 million common gene clusters but only about 500 minimal consensus PQ trees, and with the E. coli K-12 and B. subtilis genomes we found only about 450 minimal consensus PQ trees out of about 15,000 gene clusters. Finally, we show specific instances of functionally related genes in the two cases.
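
To make the idea of a PQ tree as a compact description of gene orders concrete, the following Python sketch enumerates every leaf order (frontier) that a small PQ tree permits: children of a P node may appear in any order, while children of a Q node may appear only in the given order or its reverse. The tuple encoding of the tree, the function names, and the gene labels `a` to `d` are assumptions made for illustration. The enumeration is brute force and exponential in the worst case; it is not the linear time construction of the minimal consensus PQ tree that the abstract refers to.

```python
from itertools import permutations, product

# Hypothetical encoding: a leaf is a gene name (str); an internal node is a
# tuple (kind, children) with kind "P" (any child order) or "Q" (given order
# or its reverse only).

def frontiers(node):
    """Return the set of all leaf orders (tuples) that the PQ tree allows."""
    if isinstance(node, str):
        return {(node,)}
    kind, children = node
    indices = tuple(range(len(children)))
    if kind == "P":
        orders = list(permutations(indices))
    else:  # "Q"
        orders = [indices, indices[::-1]]
    child_fronts = [list(frontiers(child)) for child in children]
    result = set()
    for order in orders:
        # Combine one frontier per child, with children arranged as in `order`.
        for combo in product(*(child_fronts[i] for i in order)):
            result.add(tuple(gene for part in combo for gene in part))
    return result

def leaves(node):
    """Return the set of genes below a node (a candidate gene cluster)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node[1]))

def is_contiguous(genes, order):
    """Check that the given genes occupy consecutive positions in `order`."""
    positions = sorted(order.index(gene) for gene in genes)
    return positions[-1] - positions[0] + 1 == len(positions)

# Example tree: a Q node over a, a P node over {b, c}, and d.
inner = ("P", ["b", "c"])
tree = ("Q", ["a", inner, "d"])

all_orders = frontiers(tree)
cluster = leaves(inner)
cluster_always_together = all(is_contiguous(cluster, o) for o in all_orders)
```

For this example tree, `all_orders` contains exactly the four orders `abcd`, `acbd`, `dbca`, and `dcba`, and the cluster `{b, c}` stays contiguous in each of them, which is the sense in which an internal node of the tree stands for a gene cluster.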

References

[1] Alexandersson, M., Cawley, S., Pachter, L.: SLAM: Cross-species gene finding and alignment with a generalized pair hidden Markov model. Genome Research 13(3), 496–502 (2003)
[2] Bergeron, A., Blanchette, M., Chateau, A., Chauve, C.: Reconstructing ancestral gene orders using conserved intervals. In: Jonassen, I., Kim, J. (eds.) WABI 2004. LNCS (LNBI), vol. 3240, pp. 14–25. Springer, Heidelberg (2004)
[3] Bergeron, A., Corteel, S., Raffinot, M.: The algorithmic of gene teams. In: Guigó, R., Gusfield, D. (eds.) WABI 2002. LNCS, vol. 2452, pp. 464–476. Springer, Heidelberg (2002)
[4] Bergeron, A., Mixtacki, J., Stoye, J.: Reversal Distance without Hurdles and Fortresses. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 388–399. Springer, Heidelberg (2004)
[5] Bergeron, A., Stoye, J.: On the similarity of sets of permutations and its applications to genome comparison. In: Warnow, T.J., Zhu, B. (eds.) COCOON 2003. LNCS, vol. 2697, pp. 68–79. Springer, Heidelberg (2003)
[6] Booth, K., Lueker, G.: Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms. Journal of Computer and System Sciences 13, 335–379 (1976)
[7] Bray, N., Couronne, O., Dubchak, I., Ishkhanov, T., Pachter, L., Poliakov, A., Rubin, E., Ryaboy, D.: Strategies and Tools for Whole-Genome Alignments. Genome Research 13(1), 73–80 (2003)
[8] Bray, N., Dubchak, I., Pachter, L.: AVID: A Global Alignment Program. Genome Research 13(1), 97–102 (2003)
[9] Bryan, S.K., Hagensee, M.E., Moses, R.E.: DNA Polymerase III Requirement for Repair of DNA Damage Caused by Methyl Methanesulfonate and Hydrogen Peroxide. Journal of Bacteriology 169(10), 4608–4613 (1987)
[10] Burns, K.H., Matzuk, M.M., Roy, A., Yan, W.: Tektin3 encodes an evolutionarily conserved putative testicular microtubules-related protein expressed preferentially in male germ cells. Molecular Reproduction and Development 67, 295–302 (2004)
[11] Didier, G.: Common intervals of two sequences. In: Benson, G., Page, R.D.M. (eds.) WABI 2003. LNCS (LNBI), vol. 2812, pp. 17–24. Springer, Heidelberg (2003)
[12] Eres, R., Parida, L., Landau, G.M.: A combinatorial approach to automatic discovery of cluster-patterns. In: Benson, G., Page, R.D.M. (eds.) WABI 2003. LNCS (LNBI), vol. 2812, pp. 139–150. Springer, Heidelberg (2003)
[13] He, X., Goldwasser, M.H.: Identifying conserved gene clusters in the presence of orthologous groups. In: Proceedings of the Eighth Annual International Conference on Research in Computational Molecular Biology (RECOMB), pp. 272–280 (2004)
[14] Heber, S., Stoye, J.: Finding all common intervals of k permutations. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 207–218. Springer, Heidelberg (2001)
[15] McConnell, R.M.: A certifying algorithm for the consecutive-ones property. In: Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), vol. 15, pp. 761–770 (2004)
[16] Mulley, J., Holland, P.: Small genome, big insights. Nature 431, 916–917 (2004)
[17] Schmidt, T., Stoye, J.: Quadratic time algorithms for finding common intervals in two and more sequences. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 347–358. Springer, Heidelberg (2004)
[18] Uno, T., Yagiura, M.: Fast algorithms to enumerate all common intervals of two permutations. Algorithmica 26(2), 290–309 (2000)

Author information

Authors and Affiliations

Department of Computer Science, University of Haifa, Mount Carmel, Haifa, 31905, Israel
Gad M. Landau & Oren Weimann
Department of Computer and Information Science, Polytechnic University, Six MetroTech Center, Brooklyn, NY, 11201-3840, USA
Gad M. Landau
Computational Biology Center, IBM TJ Watson Research Center, Yorktown Heights, New York, 10598, USA
Laxmi Parida

Editor information

Editors and Affiliations

Georgia Institute of Technology and Università di Padova
Alberto Apostolico
Université Paris-Est, France
Maxime Crochemore
School of Computer Science and Engineering, Seoul National University, 151-742, Seoul, Korea
Kunsoo Park

Cite this paper

Landau, G.M., Parida, L., Weimann, O. (2005). Using PQ Trees for Comparative Genomics. In: Apostolico, A., Crochemore, M., Park, K. (eds) Combinatorial Pattern Matching. CPM 2005. Lecture Notes in Computer Science, vol 3537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11496656_12

DOI: https://doi.org/10.1007/11496656_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26201-5
Online ISBN: 978-3-540-31562-9
eBook Packages: Computer Science, Computer Science (R0)