Pattern Matching in RNA Structures

  • Kejie Li
  • Reazur Rahman
  • Aditi Gupta
  • Prasad Siddavatam
  • Michael Gribskov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4983)


RNA plays key roles in many biological processes, and its function depends largely on its three-dimensional structure. We describe a comparative approach to learning biologically important RNA structures, including those that are not the predicted minimum free energy (MFE) structure. Our approach identifies the largest conserved structure(s) in a set of RNA sequences, even when some of the sequences have no conserved features. We convert RNA structures to a graph representation (the XIOS RNA graph) that includes pseudoknots and mutually exclusive structures, and can therefore represent an ensemble of RNA structures in a single graph. By modifying existing algorithms for maximal subgraph isomorphism, we identify the similar portions of the graphs and integrate this with MFE structure prediction tools to identify biologically relevant, near-MFE conserved structures.
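To make the graph representation concrete, here is a minimal sketch of how a set of stems could be turned into an edge-labeled graph of the kind the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the edge labels, so the sketch assumes that XIOS stands for eXclusive, Included, Overlapping (pseudoknotted) and Serial relationships between stems. The names `Stem`, `relation` and `stem_graph` are hypothetical, and the direction of inclusion is deliberately ignored.

```python
from itertools import combinations
from typing import NamedTuple


class Stem(NamedTuple):
    """A stem as two base ranges (inclusive): left half and right half."""
    l_start: int
    l_end: int
    r_start: int
    r_end: int


def shares_bases(a: Stem, b: Stem) -> bool:
    """True if any half of `a` overlaps any half of `b` in sequence position."""
    a_halves = [(a.l_start, a.l_end), (a.r_start, a.r_end)]
    b_halves = [(b.l_start, b.l_end), (b.r_start, b.r_end)]
    return any(s1 <= e2 and s2 <= e1
               for s1, e1 in a_halves
               for s2, e2 in b_halves)


def relation(a: Stem, b: Stem) -> str:
    """Classify a pair of stems with an assumed X/I/O/S labeling (undirected)."""
    if shares_bases(a, b):
        return "X"          # mutually exclusive: both cannot form at once
    if b.l_start < a.l_start:
        a, b = b, a         # order so that `a` starts first
    if a.r_end < b.l_start:
        return "S"          # serial: `b` begins after `a` ends
    if b.r_end < a.r_start:
        return "I"          # included: `b` is nested between the halves of `a`
    return "O"              # overlapping: the stems interleave (pseudoknot)


def stem_graph(stems: list[Stem]) -> dict[tuple[int, int], str]:
    """Vertices are stem indices; each pair of stems gets one labeled edge."""
    return {(i, j): relation(stems[i], stems[j])
            for i, j in combinations(range(len(stems)), 2)}


if __name__ == "__main__":
    stems = [Stem(1, 3, 10, 12), Stem(5, 6, 8, 9), Stem(14, 16, 20, 22)]
    print(stem_graph(stems))
```

Because mutually exclusive stems simply receive their own edge label, alternative foldings of the same sequence can coexist in one graph, which is what allows such a graph to stand for an ensemble of structures. Graphs of this kind would then be the input to a common-subgraph search across sequences.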


Minimum Free Energy; Depth First Search; Isomorphous Subgraph; Relevance Ranking; Stochastic Context Free Grammar
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Kejie Li (1)
  • Reazur Rahman (1)
  • Aditi Gupta (1)
  • Prasad Siddavatam (1)
  • Michael Gribskov (1)
  1. Department of Biological Sciences, Purdue University, Lilly Hall of Life Sciences, USA
