RNA Structural Alignments, Part I: Sankoff-Based Approaches for Structural Alignments

  • Protocol

Part of the book series: Methods in Molecular Biology (MIMB, volume 1097)

Abstract

Simultaneous alignment and secondary structure prediction of RNA sequences is often referred to as “RNA structural alignment.” One class of structural alignment methods is based on the principles proposed by Sankoff more than 25 years ago. The Sankoff algorithm simultaneously folds and aligns two or more sequences. Its advantage over approaches that separate the folding and alignment steps is that it makes better predictions; its disadvantage is that it is slower and requires more memory. The computational resources needed are so high that more than a decade passed before the first implementation of a Sankoff-style algorithm was published. However, with the faster computers available today and the improved heuristics used in current implementations, Sankoff-based methods have become practical. This chapter describes the methods based on the Sankoff algorithm. All practical implementations use heuristics to run in reasonable time and memory, and these heuristics are also described in this chapter.

Acknowledgements

This work is supported by the Danish Council for Independent Research (Technology and Production Sciences), the Danish Council for Strategic Research (Programme Commission on Strategic Growth Technologies), as well as the Danish Center for Scientific Computing.

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Havgaard, J.H., Gorodkin, J. (2014). RNA Structural Alignments, Part I: Sankoff-Based Approaches for Structural Alignments. In: Gorodkin, J., Ruzzo, W. (eds) RNA Sequence, Structure, and Function: Computational and Bioinformatic Methods. Methods in Molecular Biology, vol 1097. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-709-9_13

  • DOI: https://doi.org/10.1007/978-1-62703-709-9_13

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-708-2

  • Online ISBN: 978-1-62703-709-9

  • eBook Packages: Springer Protocols
