New Algorithms for Multiple DNA Sequence Alignment

  • Conference paper
Algorithms in Bioinformatics (WABI 2004)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3240)

Abstract

We present a mathematical framework for anchoring in global multiple alignment. Our framework uses anchors that are hits to spaced seeds and identifies anchors progressively, using a phylogenetic tree. We compute anchors in the tree starting at the root and going to the leaves, and from the leaves going up. In both cases, we compute thresholds for anchors to minimize errors. One innovative aspect of our approach is the approximate inference of ancestral sequences with accommodation for ambiguity. This, combined with proper scoring techniques and seeding, lets us pick many anchors in homologous positions as we align up a phylogenetic tree, minimizing total work. In simulations, our algorithm is reasonably successful: it is comparable to existing software in accuracy and substantially more efficient.
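
The two ingredients named in the abstract, spaced-seed hits and ambiguity-tolerant ancestral inference, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the seed pattern `1101`, the function names, and the use of a Fitch-style intersection/union rule as a stand-in for the paper's approximate ancestral inference are all assumptions made for illustration, and the sketch omits the anchor thresholds and scoring the abstract mentions.

```python
from collections import defaultdict

# Hypothetical spaced seed: '1' = position must match, '0' = don't care.
SEED = "1101"


def seed_key(seq, start, seed=SEED):
    """Return the characters of seq that fall under the seed's '1' positions."""
    return "".join(seq[start + i] for i, c in enumerate(seed) if c == "1")


def seed_hits(a, b, seed=SEED):
    """Return sorted (i, j) pairs where a[i:] and b[j:] agree at every '1' position."""
    span = len(seed)
    index = defaultdict(list)
    # Index every seed-length window of a by its sampled characters.
    for i in range(len(a) - span + 1):
        index[seed_key(a, i, seed)].append(i)
    hits = []
    # Look up each window of b; every shared key is a candidate anchor.
    for j in range(len(b) - span + 1):
        for i in index.get(seed_key(b, j, seed), []):
            hits.append((i, j))
    return sorted(hits)


def fitch_ancestor(left, right):
    """Assumed Fitch-style step: per column, intersect the child sets, else take the union."""
    ancestor = []
    for x, y in zip(left, right):
        common = x & y
        # A non-empty intersection is unambiguous; otherwise keep both options.
        ancestor.append(common if common else x | y)
    return ancestor


if __name__ == "__main__":
    print(seed_hits("ACGTAC", "ACTTAC"))
    print(fitch_ancestor([{"A"}, {"C"}], [{"A"}, {"T"}]))
```

In this toy input the two sequences differ only at their third base, which falls under the seed's don't-care position for the windows starting at 0, so that pair is still reported as a hit. The ancestral column built from `C` and `T` keeps both bases as an ambiguity set rather than committing to one.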

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brown, D.G., Hudek, A.K. (2004). New Algorithms for Multiple DNA Sequence Alignment. In: Jonassen, I., Kim, J. (eds) Algorithms in Bioinformatics. WABI 2004. Lecture Notes in Computer Science, vol 3240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30219-3_27

  • DOI: https://doi.org/10.1007/978-3-540-30219-3_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23018-2

  • Online ISBN: 978-3-540-30219-3

  • eBook Packages: Springer Book Archive
