GESTALT: Genomic Steiner Alignments

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1645)

Abstract

We describe GESTALT (GEnomic sequences STeiner ALignmenT), a public-domain suite of programs for generating multiple alignments of a set of biosequences. We allow the use of either of the two popular objectives, Tree Alignment or Sum-of-Pairs. The main distinguishing feature of our method is that the alignment is obtained via a tree in which the internal nodes (ancestors) are labeled by Steiner sequences for triples of the input sequences. Given lists of candidate labels for the ancestral sequences, we use dynamic programming to choose an optimal labeling under either objective function. Finally, the fully labeled tree of sequences is turned into a multiple alignment. Enhancements in our implementation include the traditional space-saving ideas of Hirschberg as well as new data-packing techniques. The running-time bottleneck of computing exact Steiner sequences is handled by a much faster yet highly effective heuristic alternative. In addition, other modules in the suite allow automatic generation of linear-program input files that can be used to compute new lower bounds on the optimal values. We also report on some preliminary computational experiments with GESTALT.
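The labeling step described in the abstract can be illustrated with a small sketch. This is not GESTALT's code: it assumes a rooted tree, the Tree Alignment objective (sum of edge costs), a unit-cost edit distance as the edge cost, and hand-written candidate lists in place of Steiner sequences. The names edit_distance, best_labeling, children, and candidates are hypothetical and chosen only for this illustration.

```python
def edit_distance(a, b):
    """Unit-cost edit distance (assumed cost model), keeping only two DP rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def best_labeling(children, candidates, root):
    """Pick one candidate per node so the sum of edge costs is minimal.

    children:   node -> list of child nodes (leaves may be missing)
    candidates: node -> list of candidate sequences (a leaf has exactly one)
    Returns (total cost, {node: chosen sequence}).
    """
    table = {}  # node -> one (subtree cost, chosen child indices) per candidate

    def solve(v):
        kids = children.get(v, [])
        for u in kids:
            solve(u)
        rows = []
        for c in candidates[v]:
            total, picks = 0, []
            for u in kids:
                # Cost of each child candidate: edge cost plus its subtree cost.
                options = [edit_distance(c, cu) + table[u][k][0]
                           for k, cu in enumerate(candidates[u])]
                k = min(range(len(options)), key=options.__getitem__)
                total += options[k]
                picks.append(k)
            rows.append((total, picks))
        table[v] = rows

    def trace(v, k, labels):
        labels[v] = candidates[v][k]
        for u, ku in zip(children.get(v, []), table[v][k][1]):
            trace(u, ku, labels)

    solve(root)
    k_root = min(range(len(table[root])), key=lambda k: table[root][k][0])
    labels = {}
    trace(root, k_root, labels)
    return table[root][k_root][0], labels


# Toy instance: four leaves under two ancestors x and y, joined at root r.
children = {"r": ["x", "y"], "x": ["a", "b"], "y": ["c", "d"]}
candidates = {
    "a": ["ACGT"], "b": ["ACGA"], "c": ["TCGT"], "d": ["TCGA"],
    "x": ["ACGT", "ACGA"], "y": ["TCGT", "TCGA"], "r": ["ACGT", "TCGT"],
}
cost, labels = best_labeling(children, candidates, "r")
print(cost, labels)
```

Each node stores, for every one of its candidates, the cheapest cost of its subtree, so the work grows with the number of tree edges times the square of the candidate-list length rather than with the number of possible labelings.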

Most of this work was done when this author was visiting CMU during Summer ’98, under a grant from the CMU Faculty Development Fund.

Supported in part by an NSF CAREER grant CCR-9625297.

References

  1. S. Altschul and D. Lipman, Trees, Stars and Multiple Sequence Alignment, SIAM J. Appl. Math. 49 (1989) 197–209

  2. S. Altschul, D. Lipman and J.D. Kececioglu, A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. USA 86 (1989) 4412–4415

  3. V. Bafna, E.L. Lawler and P. Pevzner. Approximation Algorithms for Multiple Sequence Alignment. Proceedings of the 5th Combinatorial Pattern Matching conference LNCS 807 (1994) 43–53

  4. H. Carrillo and D. Lipman. The multiple sequence alignment problem in biology. SIAM J. Appl. Math. 48 (1988) 1073–1082

  5. S.C. Chan, A.K.C. Wong and D.K.Y. Chiu, “A survey of multiple sequence comparison methods,” Bull. Math. Biol. 54 (1992) 563–598

  6. D. Feng and R. Doolittle. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Molec. Evol. 25 (1987) 351–360

  7. O. Gotoh, Optimal alignment between groups of sequences and its application to multiple sequence alignment, CABIOS 9:3 (1993) 361–370

  8. S.K. Gupta, J. Kececioglu, and A.A. Schaffer, Making the Shortest-Paths Approach to Sum-of-Pairs Multiple Sequence Alignment More Space Efficient in Practice, (extended abstract) Proceedings of the 6th Combinatorial Pattern Matching conference (1995)

  9. D. Gusfield, Efficient methods for multiple sequence alignment with guaranteed error bounds, Bulletin of Mathematical Biology 55 (1993) 141–154

  10. D. Gusfield and L. Wang, New Uses for Uniform Lifted Alignments, Submitted for publication (1996)

  11. D.G. Higgins, A.J. Bleasby and R. Fuchs, Clustal V: Improved software for multiple sequence alignment, CABIOS 8 (1992) 189–191

  12. D. Hirschberg, A linear space algorithm for computing maximal common subsequences, Communications of the ACM 18 (1975) 341–343

  13. T. Jiang and F. Liu, Tree Alignment And Reconstruction application software, Version 1.0, February 1998. Available from http://www.dcss.mcmaster.ca/~fliu.

  14. D. Lipman, S. Altschul and J.D. Kececioglu, A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. USA 86 (1989) 4412–4415

  15. S.B. Needleman and C.D. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48 (1970) 443–453

  16. M.A. McClure, T.K. Vasi and W.M. Fitch. Comparative analysis of multiple protein sequence alignment methods, Mol. Biol. Evol. 11 (1994) 571–592

  17. R. Ravi and J. Kececioglu. Approximation algorithms for multiple sequence alignment under a fixed evolutionary tree, Proceedings of the 6th Combinatorial Pattern Matching conference (1995) 330–339

  18. D. Sankoff, Minimal mutation trees of sequences, SIAM J. Applied Math. 28(1) (1975) 35–42

  19. D. Sankoff and R. Cedergren, Simultaneous comparison of three or more sequences related by a tree, in D. Sankoff and J. Kruskal, editors, Time warps, string edits and macromolecules: the theory and practice of sequence comparison, Addison Wesley (1983) 253–264

  20. D. Sankoff, R. Cedergren and G. Lapalme, Frequency of insertion-deletion, transversion, and transition in the evolution of the 5S ribosomal RNA, J. Mol. Evol. 7 (1976) 133–149

  21. D. Sankoff, Analytical approaches to genomic evolution, Biochimie 75 (1993) 409–413

  22. T.F. Smith and M.S. Waterman. Comparison of Biosequences. Adv. Appl. Math. (1981) 482–489

  23. W.R. Taylor and D.T. Jones. Deriving an Amino Acid Distance Matrix, J. Theor. Biol. 164 (1993) 65–83

  24. M. Vingron and P. Argos. A fast and sensitive multiple sequence alignment algorithm. Comput. Appl. Biosci. 5 (1989) 115–121

  25. L. Wang and D. Gusfield. Improved Approximation Algorithms for Tree Alignment, Proceedings of the 7th Combinatorial Pattern Matching conference (1996) 220–233

  26. L. Wang and T. Jiang. On the complexity of multiple sequence alignment, J. Comp. Biol. 1 (1994) 337–348

  27. L. Wang, T. Jiang and E.L. Lawler. Aligning sequences via an evolutionary tree: complexity and approximation, Algorithmica, to appear. Also presented at the 26th ACM Symp. on Theory of Computing (1994)

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lancia, G., Ravi, R. (1999). GESTALT: Genomic Steiner Alignments. In: Crochemore, M., Paterson, M. (eds) Combinatorial Pattern Matching. CPM 1999. Lecture Notes in Computer Science, vol 1645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48452-3_8

  • DOI: https://doi.org/10.1007/3-540-48452-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66278-5

  • Online ISBN: 978-3-540-48452-3

  • eBook Packages: Springer Book Archive
