Multiple Genome Alignment: Chaining Algorithms Revisited

  • Mohamed Ibrahim Abouelhoda
  • Enno Ohlebusch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2676)

Abstract

Given $n$ fragments from $k > 2$ genomes, we will show how to find an optimal chain of colinear non-overlapping fragments in time $O(n \log^{k-2} n \log\log n)$ and space $O(n \log^{k-2} n)$. Our result solves an open problem posed by Myers and Miller because it reduces the time complexity of their algorithm by a factor $\log^2 n / \log\log n$ and the space complexity by a factor $\log n$. For $k = 2$ genomes, our algorithm takes $O(n \log n)$ time and $O(n)$ space.
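To make the $k = 2$ case concrete, the following is a minimal sketch of fragment chaining, not the authors' algorithm. It assumes each fragment is a hypothetical tuple `(x_beg, x_end, y_beg, y_end, weight)`, that a fragment may follow another only if it starts strictly after the other ends in both genomes, and that the chain score is the plain sum of weights with no gap costs. A sweep over the first genome plus a Fenwick tree holding prefix maxima over end positions in the second genome gives $O(n \log n)$ time and $O(n)$ space under these assumptions. The function name `chain_fragments` is illustrative only.

```python
from bisect import bisect_left


def chain_fragments(fragments):
    """Return (best_score, chain) for 2D fragments.

    fragments: list of (x_beg, x_end, y_beg, y_end, weight) tuples
    (a hypothetical layout chosen for this sketch).
    chain: indices into `fragments`, in chain order.
    """
    n = len(fragments)
    if n == 0:
        return 0, []

    # Coordinate-compress the end positions in the second genome.
    ys = sorted({f[3] for f in fragments})
    # Fenwick tree storing (score, fragment index) prefix maxima.
    tree = [(0, -1)] * (len(ys) + 1)

    def update(pos, item):  # pos is 1-based
        while pos <= len(ys):
            if item > tree[pos]:
                tree[pos] = item
            pos += pos & -pos

    def query(pos):  # maximum over positions 1..pos
        best = (0, -1)
        while pos > 0:
            if tree[pos] > best:
                best = tree[pos]
            pos -= pos & -pos
        return best

    # Sweep events: kind 0 = fragment start, kind 1 = fragment end.
    # At equal x, starts sort before ends, which enforces strict precedence.
    events = []
    for i, (xb, xe, yb, ye, w) in enumerate(fragments):
        events.append((xb, 0, i))
        events.append((xe, 1, i))
    events.sort()

    score = [0] * n
    prev = [-1] * n
    for _, kind, i in events:
        xb, xe, yb, ye, w = fragments[i]
        if kind == 0:
            # Best chain among finished fragments with y_end < y_beg.
            best, j = query(bisect_left(ys, yb))
            score[i] = w + best
            prev[i] = j
        else:
            # The fragment is finished in x; make it available as a predecessor.
            update(bisect_left(ys, ye) + 1, (score[i], i))

    # Trace back from the highest-scoring chain end.
    end = max(range(n), key=lambda i: score[i])
    best_score = score[end]
    chain = []
    while end != -1:
        chain.append(end)
        end = prev[end]
    chain.reverse()
    return best_score, chain
```

For example, with the fragments `(1, 3, 1, 3, 3)`, `(4, 6, 5, 7, 3)`, `(2, 8, 2, 4, 5)`, and `(7, 9, 8, 10, 2)`, the sketch chains the first, second, and fourth fragments for a total score of 8. The third fragment overlaps the others in the first genome and cannot join that chain.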

Keywords

Range Query; Priority Queue; Longest Common Subsequence; Range Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. B.S. Baker and R. Giancarlo. Longest common subsequence from fragments via sparse dynamic programming. In Proc. 6th European Symposium on Algorithms, LNCS 1461, pp. 79–90, 1998.
  2. B. Chazelle. A functional approach to data structures and its use in multidimensional searching. SIAM Journal on Computing, 17(3):427–462, 1988.
  3. A.L. Delcher, A. Phillippy, J. Carlton, and S.L. Salzberg. Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Res., 30(11):2478–2483, 2002.
  4.
  5. D. Eppstein, R. Giancarlo, Z. Galil, and G.F. Italiano. Sparse dynamic programming. I: Linear cost functions; II: Convex and concave cost functions. Journal of the ACM, 39:519–567, 1992.
  6. M.S. Gelfand, A.A. Mironov, and P.A. Pevzner. Gene recognition via spliced sequence alignment. Proc. Nat. Acad. Sci., 93:9061–9066, 1996.
  7. L.J. Guibas and J. Stolfi. On computing all north-east nearest neighbors in the L_1 metric. Information Processing Letters, 17(4):219–223, 1983.
  8. D.B. Johnson. A priority queue in which initialization and queue operations take O(log log D) time. Mathematical Systems Theory, 15:295–309, 1982.
  9. D. Joseph, J. Meidanis, and P. Tiwari. Determining DNA sequence similarity using maximum independent set algorithms for interval graphs. Proc. 3rd Scandinavian Workshop on Algorithm Theory, LNCS 621, pp. 326–337, 1992.
  10. M. Höhl, S. Kurtz, and E. Ohlebusch. Efficient multiple genome alignment. Bioinformatics, 18:S312–S320, 2002.
  11. M.-Y. Leung, B.E. Blaisdell, C. Burge, and S. Karlin. An efficient algorithm for identifying matches with errors in multiple long molecular sequences. Journal of Molecular Biology, 221:1367–1378, 1991.
  12. W. Miller. Comparison of genomic DNA sequences: Solved and unsolved problems. Bioinformatics, 17:391–397, 2001.
  13. B. Morgenstern. A space-efficient algorithm for aligning large genomic sequences. Bioinformatics, 16:948–949, 2000.
  14. E.W. Myers and X. Huang. An O(n^2 log n) restriction map comparison and search algorithm. Bulletin of Mathematical Biology, 54(4):599–618, 1992.
  15. E.W. Myers and W. Miller. Chaining multiple-alignment fragments in sub-quadratic time. Proc. 6th ACM-SIAM Symposium on Discrete Algorithms, pp. 38–47, 1995.
  16. F.P. Preparata and M.I. Shamos. Computational geometry: An introduction. Springer-Verlag, New York, 1985.
  17. S. Schwartz, Z. Zhang, K.A. Frazer, A. Smit, C. Riemer, J. Bouck, R. Gibbs, R. Hardison, and W. Miller. PipMaker—A web server for aligning two genomic DNA sequences. Genome Research, 10(4):577–586, 2000.
  18. P. van Emde Boas. Preserving order in a forest in less than logarithmic time and linear space. Information Processing Letters, 6(3):80–82, 1977.
  19. D.E. Willard. New data structures for orthogonal range queries. SIAM Journal on Computing, 14:232–253, 1985.
  20. Z. Zhang, B. Raghavachari, R. Hardison, and W. Miller. Chaining multiple-alignment blocks. Journal of Computational Biology, 1:51–64, 1994.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Mohamed Ibrahim Abouelhoda (1)
  • Enno Ohlebusch (2)
  1. Faculty of Technology, University of Bielefeld, Bielefeld, Germany
  2. Faculty of Computer Science, University of Ulm, Ulm, Germany
