Abstract
Recently, an alignment approach for the comparison of two genomes, based on an evolutionary model restricted to duplications and losses, has been presented. An exact linear programming algorithm has been developed and successfully applied to the transfer RNA (tRNA) repertoire in Bacteria, leading to interesting observations on tRNA shifts of identity. Here, we explore a direct dynamic programming approach for the Duplication-Loss Alignment of two genomes, which proceeds in two steps: (1) a dynamic programming step, which outputs a best candidate alignment between the two genomes, and (2) the Minimum Label Alignment problem, which finds an evolutionary scenario of minimum duplication-loss cost that is in agreement with the alignment. We show that the Minimum Label Alignment problem is APX-hard, even if the number of occurrences of a gene inside a genome is bounded by 5. We then develop a heuristic which is thousands of times faster than the linear programming algorithm and exhibits a high degree of accuracy on simulated datasets. The heuristic has been implemented in Java and is available on request.
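As a rough illustration of what the dynamic programming step (1) could look like, the sketch below aligns two gene orders using only matches and gaps. It is not the authors' algorithm or cost model: the function name `align_gene_orders`, the unit `gap_cost`, and the rule that aligned genes must be identical are all assumptions made for this example. The gap columns it produces are the unaligned genes that a labeling step such as (2) would then have to explain as losses or duplications.

```python
def align_gene_orders(x, y, gap_cost=1):
    """Globally align two gene orders with matches and gaps only.

    Hypothetical sketch: a unit gap cost stands in for the real
    duplication-loss cost, which the abstract does not specify.
    Returns (total_cost, columns), where each column is a pair
    (gene_in_x, gene_in_y) and None marks a gap.
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimum cost of aligning x[:i] with y[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_cost
    for j in range(1, m + 1):
        cost[0][j] = j * gap_cost

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Leave x[i-1] or y[j-1] unaligned (a gap column).
            best = min(cost[i - 1][j], cost[i][j - 1]) + gap_cost
            # Aligning two genes is allowed only if they are identical.
            if x[i - 1] == y[j - 1]:
                best = min(best, cost[i - 1][j - 1])
            cost[i][j] = best

    # Trace back from the bottom-right cell to recover one optimal alignment.
    columns = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and x[i - 1] == y[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            columns.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + gap_cost:
            columns.append((x[i - 1], None))
            i -= 1
        else:
            columns.append((None, y[j - 1]))
            j -= 1
    columns.reverse()
    return cost[n][m], columns


total, columns = align_gene_orders("abcb", "abb")
```

Here each character stands for one gene, so `"abcb"` and `"abb"` are two toy gene orders. Under the assumed unit gap cost, the alignment leaves the single gene `c` unaligned; whether such a gene is best explained by a loss or by a duplication is exactly the question the labeling step addresses.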
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Benzaid, B., Dondi, R., El-Mabrouk, N. (2013). Duplication-Loss Genome Alignment: Complexity and Algorithm. In: Dediu, AH., Martín-Vide, C., Truthe, B. (eds) Language and Automata Theory and Applications. LATA 2013. Lecture Notes in Computer Science, vol 7810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37064-9_12
DOI: https://doi.org/10.1007/978-3-642-37064-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37063-2
Online ISBN: 978-3-642-37064-9
eBook Packages: Computer Science (R0)