A Faster and More Space-Efficient Algorithm for Inferring Arc-Annotations of RNA Sequences Through Alignment

  • Jesper Jansson
  • See-Kiong Ng
  • Wing-Kin Sung
  • Hugo Willy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3240)


This paper considers the problem of inferring the optimal nested arc-annotation of a sequence given another nested arc-annotated sequence by maximizing the weighted alignment between the bases and arcs in the two sequences. The problem has a direct application in predicting the secondary structure of an RNA sequence given a closely related sequence whose secondary structure is already known. The currently most efficient algorithm for this problem requires $O(nm^3)$ time and $O(nm^2)$ space, where $n$ is the length of the sequence with known arc-annotation and $m$ is the length of the sequence whose arc-annotation is to be inferred. We present an improved algorithm which runs in $\min\{O(nm^2 \log n),\ O(nm^3)\}$ time and $\min\{O(m^2 + mn),\ O(m^2 \log n)\}$ space. The time improvement is achieved by applying sparsification to the dynamic programming algorithm, while the space is reduced to a more practical quadratic complexity by using a Hirschberg-like traceback technique together with a simple compression.
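For readers unfamiliar with the Hirschberg-style traceback mentioned in the abstract, the following is a minimal sketch of the classic technique applied to the plain longest common subsequence problem on strings. It is not the paper's algorithm: it ignores arcs, weights, sparsification, and compression, and the function names `lcs_last_row` and `hirschberg_lcs` are chosen here for illustration only. It shows the core idea of keeping only one row of the dynamic programming table and recovering the solution by divide and conquer.

```python
def lcs_last_row(a, b):
    """Return the last row of the LCS-length table for a vs. b.

    Only two rows are alive at any time, so working space is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev


def hirschberg_lcs(a, b):
    """Recover one longest common subsequence of strings a and b.

    The full table is never stored; the traceback is rebuilt recursively.
    """
    if not a or not b:
        return ""
    if len(a) == 1:
        # Base case: a single character either occurs in b or it does not.
        return a if a in b else ""
    mid = len(a) // 2
    # Scores of the first half of a against every prefix of b.
    left = lcs_last_row(a[:mid], b)
    # Scores of the second half of a against every suffix of b,
    # computed on reversed strings.
    right = lcs_last_row(a[mid:][::-1], b[::-1])
    # Choose the split point of b that maximizes the combined score.
    split = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    return hirschberg_lcs(a[:mid], b[:split]) + hirschberg_lcs(a[mid:], b[split:])
```

Calling `hirschberg_lcs("GGCAUUCGCC", "GCAUCGGC")` returns one common subsequence whose length equals `lcs_last_row("GGCAUUCGCC", "GCAUCGGC")[-1]`. The recursion roughly doubles the running time compared with a single table fill, which is the usual price for dropping the stored table.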





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Jesper Jansson (1)
  • See-Kiong Ng (2)
  • Wing-Kin Sung (1)
  • Hugo Willy (1)
  1. Department of Computer Science, National University of Singapore, Singapore
  2. Institute for Infocomm Research, Singapore
