A Worst-Case and Practical Speedup for the RNA Co-folding Problem Using the Four-Russians Idea

  • Yelena Frid
  • Dan Gusfield
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6293)

Abstract

The computational formulation for finding the optimal simultaneous alignment and fold (the optimal co-fold) of RNA sequences was first introduced by Sankoff in 1985. Since then, the importance of co-folding has grown as conservation of structure and its relationship to function have been widely observed in RNA. For two sequences of length N, the running time of Sankoff’s algorithm is Θ(N^6). Existing literature on co-folding attempts to improve efficiency by simplifying the original problem formulation.

We present here a practical and worst-case speedup using the Four-Russians method, without placing any added constraints on the types of alignments or folds allowed. Our algorithm, Fast Cofold, finds the optimal co-fold in O(N^6/log(N^2)) time, a speedup that is also observed in practice.

Because the solution matrix produced by our algorithm is identical to the one produced by the Sankoff algorithm, the contribution of the algorithm lies not only in its standalone practicality but also in the ability to combine it with heuristic speedups, leading to even greater reductions in time.
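
To give a feel for where the cost of such dynamic programs comes from, the following is a minimal sketch of the much simpler single-sequence analogue, Nussinov-style base-pair maximization (Nussinov et al. in the references). It is not Sankoff's algorithm and not Fast Cofold. The function name `max_base_pairs`, the Watson-Crick-only pairing rule, and the absence of a minimum loop length are assumptions made for illustration. The innermost loop over the split point `k` is the branching step; in co-folding the table is indexed by two endpoints per sequence and the analogous step ranges over a pair of split points, one per sequence, which is what yields the sixth power. Branching steps of this kind are what Four-Russians style speedups typically target.

```python
def max_base_pairs(seq: str) -> int:
    """Maximum number of nested complementary pairs in seq (illustrative only)."""
    # Assumed pairing rule: Watson-Crick pairs only, no minimum loop length.
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # S[i][j] holds the best score for the substring seq[i..j] (inclusive).
    S = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(S[i + 1][j], S[i][j - 1])  # leave i or j unpaired
            if (seq[i], seq[j]) in pairs:
                inner = S[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)  # pair i with j
            # Branching step: try every split point k. This loop makes the
            # single-sequence program cubic in the sequence length.
            for k in range(i + 1, j):
                best = max(best, S[i][k] + S[k + 1][j])
            S[i][j] = best
    return S[0][n - 1]
```

For example, `max_base_pairs("GGGAAACCC")` pairs the three leading G's with the three trailing C's.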

References

  1. Backofen, R., Landau, G.M., Möhl, M., Tsur, D., Weimann, O.: Fast RNA structure alignment for crossing input structures. In: Kucherov, G., Ukkonen, E. (eds.) CPM 2009. LNCS, vol. 5577, pp. 236–248. Springer, Heidelberg (2009)
  2. Backofen, R., Tsur, D., Zakov, S., Ziv-Ukelson, M.: Sparse RNA folding: Time and space efficient algorithms. In: Kucherov, G., Ukkonen, E. (eds.) CPM 2009. LNCS, vol. 5577, pp. 249–262. Springer, Heidelberg (2009)
  3. Dowell, R., Eddy, S.: Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints. BMC Bioinformatics 7(1), 400 (2006)
  4. Eddy, S.R.: Computational genomics of noncoding RNA genes. Cell 109(2), 137–140 (2002)
  5. Eddy, S.R., Durbin, R.: RNA sequence analysis using covariance models. Nucl. Acids Res. 22(11), 2079–2088 (1994)
  6. Frid, Y., Gusfield, D.: A simple, practical and complete O(n^3/log(n))-time algorithm for RNA folding using the four russians speedup. In: Salzberg, S.L., Warnow, T. (eds.) WABI 2009. LNCS, vol. 5724, pp. 97–107. Springer, Heidelberg (2009)
  7. Gorodkin, J., Heyer, L.J., Stormo, G.D.: Finding common sequence and structure motifs in a set of RNA sequences. In: ISMB, pp. 120–123 (1997)
  8. Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, S.L., Tacker, M., Schuster, P.: Fast folding and comparison of RNA secondary structures. Chemical Monthly 125, 167–188 (1994)
  9. Mathews, D.H., Turner, D.H.: Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. Journal of Molecular Biology 317(2), 191–203 (2002)
  10. Nussinov, R., Pieczenik, G., Griggs, J.R., Kleitman, D.J.: Algorithms for loop matchings. SIAM Journal on Applied Mathematics 35(1), 68–82 (1978)
  11. Pedersen, J.S., Bejerano, G., Siepel, A., Rosenbloom, K., Lindblad-Toh, K., Lander, E.S., Kent, J., Miller, W., Haussler, D.: Identification and classification of conserved RNA secondary structures in the human genome. PLoS Comput Biol. 2(4), e33 (2006)
  12. Rivas, E., Eddy, S.: Noncoding RNA gene detection using comparative sequence analysis. BMC Bioinformatics 2(1), 8 (2001)
  13. Rose, D., Hackermuller, J., Washietl, S., Reiche, K., Hertel, J., Findeiß, S., Stadler, P., Prohaska, S.: Computational rnomics of drosophilids. BMC Genomics 8(1), 406 (2007)
  14. Sankoff, D.: Simultaneous solution of the RNA folding, alignment and protosequence problems. SIAM Journal on Applied Mathematics 45(5), 810–825 (1985)
  15. Seemann, S.E., Gorodkin, J., Backofen, R.: Unifying evolutionary and thermodynamic information for RNA folding of multiple alignments. In: NAR (2008)
  16. Torarinsson, E., Yao, Z., Wiklund, E.D., Bramsen, J.B., Hansen, C., Kjems, J., Tommerup, N., Ruzzo, W.L., Gorodkin, J.: Comparative genomics beyond sequence-based alignments: RNA structures in the encode regions. Genome Res. 18(2), 242–251 (2008)
  17. Torarinsson, E., Havgaard, J.H., Gorodkin, J.: Multiple structural alignment and clustering of RNA sequences. Bioinformatics 23(8), 926–932 (2007)
  18. Washietl, S., Hofacker, I.L.: Consensus folding of aligned sequences as a new measure for the detection of functional RNAs by comparative genomics. Journal of Molecular Biology 342(1), 19–30 (2004)
  19. Ziv-Ukelson, M., Gat-Viks, I., Wexler, Y., Shamir, R.: A faster algorithm for RNA co-folding. In: Crandall, K.A., Lagergren, J. (eds.) WABI 2008. LNCS (LNBI), vol. 5251, pp. 174–185. Springer, Heidelberg (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yelena Frid (1)
  • Dan Gusfield (1)
  1. Department of Computer Science, U.C. Davis, USA
