Abstract
Sequence alignment is a central problem in bioinformatics. The classical dynamic programming algorithm aligns two sequences by optimizing over possible insertions, deletions and substitutions. However, other evolutionary events can be observed, such as inversions, tandem duplications or moves (transpositions). It has been established that the extension of the problem to move operations is NP-complete. Previous work has shown that an extension restricted to non-overlapping inversions can be solved in O(n^3) time with a restricted scoring scheme. In this paper, we show that the alignment problem extended to non-overlapping moves can be solved in O(n^5) time for general scoring schemes, O(n^4 log n) time for concave scoring schemes and O(n^4) time for restricted scoring schemes. Furthermore, we show that the alignment problem extended to non-overlapping moves, inversions and tandem duplications can be solved with the same time complexities. Finally, an example of an alignment with non-overlapping moves is provided.
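For orientation, the classical dynamic programming algorithm mentioned in the abstract (Needleman and Wunsch) can be sketched as follows. This is a minimal illustration of the baseline only, not the algorithm for moves, inversions or tandem duplications developed in the paper. The function name and the scoring values (match, mismatch, linear gap penalty) are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of an optimal global alignment of a and b.

    Classical O(len(a) * len(b)) dynamic program with a linear gap
    penalty. The scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # substitution or match
                score[i - 1][j] + gap,      # deletion: a[i-1] against a gap
                score[i][j - 1] + gap,      # insertion: b[j-1] against a gap
            )
    return score[n][m]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Only insertions, deletions and substitutions are modelled here; a block that has moved to another position in one of the sequences can only be scored as a deletion plus an insertion, which is the limitation the extended alignment problem addresses.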
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ledergerber, C., Dessimoz, C. (2007). Alignments with Non-overlapping Moves, Inversions and Tandem Duplications in O(n^4) Time. In: Lin, G. (eds) Computing and Combinatorics. COCOON 2007. Lecture Notes in Computer Science, vol 4598. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73545-8_17
DOI: https://doi.org/10.1007/978-3-540-73545-8_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73544-1
Online ISBN: 978-3-540-73545-8
eBook Packages: Computer Science, Computer Science (R0)