Predicting RNA Secondary Structure Using Profile Stochastic Context-Free Grammars and Phylogenic Analysis

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Stochastic context-free grammars (SCFGs) have been applied to predicting RNA secondary structure. Prediction can be improved by incorporating comparative sequence analysis. However, most existing SCFG-based methods lack explicit phylogenic analysis of homologous RNA sequences, which is probably why they fall short in practical applications. Hence, we present a new SCFG-based method that integrates phylogenic analysis with a newly defined profile SCFG. The method can be summarized as follows: 1) we define a new profile SCFG, M, to depict the consensus secondary structure of a multiple RNA sequence alignment; 2) we introduce two distinct hidden Markov models, λ and λ′, to perform phylogenic analysis of homologous RNA sequences, where λ models non-structural regions of the sequence and λ′ models structural regions; 3) we merge λ and λ′ into M to obtain a combined model for predicting RNA secondary structure. We tested our method on data sets constructed from the Rfam database. Its sensitivity and specificity are higher than those of the predictions made by Pfold.
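
The abstract reports sensitivity and specificity but does not define them in this preview. As a rough illustration only, the sketch below assumes the convention common in RNA structure prediction: sensitivity is the fraction of reference base pairs that are predicted, and specificity is the fraction of predicted base pairs that appear in the reference. The function names `base_pairs` and `sensitivity_specificity`, and the use of dot-bracket strings as input, are assumptions made for this example and are not taken from the paper.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)          # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.add((stack.pop(), j))  # pair with the most recent '('
    if stack:
        raise ValueError("unbalanced '('")
    return pairs


def sensitivity_specificity(predicted, reference):
    """Compare two dot-bracket structures base pair by base pair.

    Assumed definitions (not confirmed by the source):
      sensitivity = TP / (number of reference pairs)
      specificity = TP / (number of predicted pairs)
    """
    pred, ref = base_pairs(predicted), base_pairs(reference)
    tp = len(pred & ref)             # pairs present in both structures
    sens = tp / len(ref) if ref else 0.0
    spec = tp / len(pred) if pred else 0.0
    return sens, spec
```

For instance, with the reference `((..))` and the prediction `(....)`, one of the two reference pairs is recovered and the single predicted pair is correct, so the function returns a sensitivity of 0.5 and a specificity of 1.0.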

Author information

Corresponding author

Correspondence to Xiao-Yong Fang.

Additional information

Supported by the National Natural Science Foundation of China under Grant No. 60673018.

About this article

Cite this article

Fang, XY., Luo, ZG. & Wang, ZH. Predicting RNA Secondary Structure Using Profile Stochastic Context-Free Grammars and Phylogenic Analysis. J. Comput. Sci. Technol. 23, 582–589 (2008). https://doi.org/10.1007/s11390-008-9154-7