
Inverse Sequence Alignment from Partial Examples

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4645)

Abstract

When aligning biological sequences, the choice of parameter values for the alignment scoring function is critical. Small changes in gap penalties, for example, can yield radically different alignments. A rigorous way to compute parameter values that are appropriate for biological sequences is inverse parametric sequence alignment. Given a collection of examples of biologically correct alignments, this is the problem of finding parameter values that make the example alignments score close to optimal. We extend prior work on inverse alignment to partial examples and to an improved model based on minimizing the average error of the examples. Experiments on benchmark biological alignments show we can find parameters that generalize across protein families and that boost the recovery rate for multiple sequence alignment by up to 25%.
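
The inverse alignment problem described above can be illustrated with a small Python sketch. It is not the method of the paper: it assumes a toy cost model with only two parameters (a mismatch cost and a per-column gap cost), defines an example's error as its excess cost over the optimal alignment divided by the example's length, and replaces the paper's optimization with a search over a few candidate gap costs. All function names are hypothetical, and partial examples are not modeled.

```python
def alignment_cost(row_a, row_b, mismatch, gap):
    """Cost of a given gapped alignment (two equal-length rows, '-' marks a gap)."""
    cost = 0.0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            cost += gap
        elif x != y:
            cost += mismatch
    return cost


def optimal_cost(s, t, mismatch, gap):
    """Minimum global alignment cost of s and t by dynamic programming."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            sub = prev[j - 1] + (0.0 if s[i - 1] == t[j - 1] else mismatch)
            curr.append(min(sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def average_error(examples, mismatch, gap):
    """Mean excess cost of the examples over optimal, per alignment column.

    The error definition here is an assumption made for illustration.
    """
    total = 0.0
    for row_a, row_b in examples:
        s, t = row_a.replace("-", ""), row_b.replace("-", "")
        excess = (alignment_cost(row_a, row_b, mismatch, gap)
                  - optimal_cost(s, t, mismatch, gap))
        total += excess / len(row_a)
    return total / len(examples)


def fit_gap_cost(examples, candidates, mismatch=1.0):
    """Pick the candidate gap cost with the lowest average error.

    The mismatch cost is held fixed to rule out the trivial all-zero solution.
    """
    return min(candidates, key=lambda g: average_error(examples, mismatch, g))


# One example alignment with a single mismatch and no gaps.
examples = [("ACGT", "AGGT")]
best_gap = fit_gap_cost(examples, [0.25, 0.5, 1.0])
```

With a gap cost of 0.25, two gaps are cheaper than the mismatch in the example, so the example is not optimal and its error is positive. From a gap cost of 0.5 upward the example scores optimally, so the search settles on 0.5, the first candidate with zero error.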




Editor information

Raffaele Giancarlo Sridhar Hannenhalli


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, E., Kececioglu, J. (2007). Inverse Sequence Alignment from Partial Examples. In: Giancarlo, R., Hannenhalli, S. (eds) Algorithms in Bioinformatics. WABI 2007. Lecture Notes in Computer Science, vol 4645. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74126-8_33


  • DOI: https://doi.org/10.1007/978-3-540-74126-8_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74125-1

  • Online ISBN: 978-3-540-74126-8

  • eBook Packages: Computer Science, Computer Science (R0)
