Abstract
Studying the structure of RNA sequences is an important problem that helps in understanding the functional properties of RNA. A pseudoknot is a type of RNA structure that cannot be modeled with context-free grammars (CFGs) because it exhibits crossing dependencies. Pseudoknot structures are functionally important: they appear, for example, in viral genome RNAs and in ribozyme active sites. Tree Adjoining Grammars (TAGs) are one example of a grammatical model that is more expressive than CFGs and can handle crossing dependencies. In this paper, we describe a new inference algorithm for TAGRNA, a sub-model of TAG. We also introduce an RNA structure identification framework, TAGRNAInf, in which the TAGRNA inference algorithm forms the core of the training phase. We present the results of using the proposed framework to identify RNA sequences with pseudoknot structures. Our results outperform those reported in [14], which addresses the same problem using a different grammatical formalism.
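To make the notion of crossing dependencies concrete, the following is a minimal Python sketch, not part of the paper's TAGRNA algorithm. It assumes a structure is given in extended dot-bracket notation, where `()` and `[]` mark two families of base pairs; the function names `parse_pairs` and `has_crossing` are hypothetical. Two base pairs (i, j) and (k, l) cross when i < k < j < l, which is exactly the pattern a context-free grammar cannot generate through nesting alone.

```python
from itertools import combinations


def parse_pairs(structure: str) -> list[tuple[int, int]]:
    """Extract base pairs (i, j), i < j, from extended dot-bracket notation.

    Assumption for this sketch: '(' pairs with ')' and '[' pairs with ']';
    every other character is treated as an unpaired position.
    """
    openers = {"(": ")", "[": "]"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks: dict[str, list[int]] = {open_: [] for open_ in openers}
    pairs: list[tuple[int, int]] = []
    for pos, ch in enumerate(structure):
        if ch in openers:
            stacks[ch].append(pos)
        elif ch in closers:
            # Match with the most recent unmatched opener of the same family.
            pairs.append((stacks[closers[ch]].pop(), pos))
    return sorted(pairs)


def has_crossing(pairs: list[tuple[int, int]]) -> bool:
    """Return True if any two base pairs cross, i.e. i < k < j < l."""
    ordered = sorted(pairs)
    return any(i < k < j < l for (i, j), (k, l) in combinations(ordered, 2))


nested = "((..))"          # hairpin: pairs are properly nested
knot = "((..[[..))..]]"    # the second stem opens inside the first and closes outside it

print(has_crossing(parse_pairs(nested)))
print(has_crossing(parse_pairs(knot)))
```

For the hairpin all pairs are nested, so the check reports no crossing; for the second structure the `[]` stem interleaves with the `()` stem, so the check reports a crossing, which is the defining feature of a pseudoknot.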
References
Al Seesi, S.: Pseudoknot Identification through Learning TAGRNA, BECAT-CSE Technical Report, University of Connecticut (April 2008)
Akutsu, T.: Dynamic Programming Algorithms for RNA Secondary Structure Prediction with Pseudoknots. Discrete Applied Mathematics 104, 45–62 (2000)
Ambros, V., Bartel, B., Bartel, D.P., Burge, C.B., Carrington, J.C., Chen, X., Dreyfuss, G., Eddy, S.R., Griffiths-Jones, S., Marshall, M., Matzke, M., Ruvkun, G., Tuschl, T.: A Uniform System for microRNA Annotation. RNA 9(3), 277–279 (2003)
van Batenburg, F.H.D., Gultyaev, A.P., Pleij, C.W.A., Ng, J., Oliehoek, J.: Pseudobase: a Database with RNA Pseudoknots. Nucl. Acids Res. 28(1), 201–204 (2000)
Brazma, A., Jonassen, I., Vilo, J., Ukkonen, E.: Pattern Discovery in Biosequences. In: Honavar, V., Slutzki, G. (eds.) ICGI 1998. LNCS (LNAI), vol. 1433, pp. 255–270. Springer, Heidelberg (1998)
Buratti, E., Dhir, A., Lewandowska, M.A., Baralle, F.E.: RNA Structure is a Key Regulatory Element in Pathological ATM and CFTR Pseudoexon Inclusion Events. Nucl. Acids Res. 35(13), 4369–4383 (2007)
Cai, L., Malmberg, R., Wu, Y.: Stochastic Modeling of RNA Pseudoknotted Structures: a Grammatical Approach. Bioinformatics 19(supp. 1), 66–73 (2003)
Dirks, R.M., Pierce, N.A.: A Partition Function Algorithm for Nucleic Acid Secondary Structure Including Pseudoknots. J. Comput. Chem. 24(13), 1664–1677 (2003)
Gilbert, W.: The RNA World. Nature 319, 618 (1986)
Griffiths-Jones, S., Moxon, S., Marshall, M., Khanna, A., Eddy, S.R., Bateman, A.: Rfam: Annotating Non-coding RNAs in Complete Genomes. Nucl. Acids Res. 33, D121–D124 (2005)
Holbrook, S.R.: RNA Structure: the Long and the Short of it. Current Opinion in Structural Biology 15, 302–308 (2005)
Joshi, A.K., Levy, L., Takahashi, M.: Tree Adjunct Grammars. Journal of Computer and System Sciences 10, 136–163 (1975)
Laxminarayana, J.A., Nagaraja, G., Balaji, P.V.: Identification of Pseudoknots in RNA Secondary Structures: A Grammatical Inference Approach. In: Mukherjee, D.P., Pal, S. (eds.) Proceedings of 5th International Conference on Advances in Pattern Recognition (2003)
Laxminarayana, J.A., Nagaraja, G., Balaji, P.V.: Inference of a Subclass of Even Linear Languages and its Application to Pseudoknot Identification. In: Department of Computer Science and Engineering, Indian Institute of Technology, Bombay, India (manuscript, 2003)
Paillart, J.C., Skripkin, E., Ehresmann, B., Ehresmann, C., Marquet, R.: In vitro Evidence for a Long Range Pseudoknot in the 5’-Untranslated and Matrix Coding regions of HIV-1 Genomic RNA. J. Biol. Chem. 277, 5995–6004 (2002)
Pedersen, J.S., Bejerano, G., Siepel, A., Rosenbloom, K., Lindblad-Toh, K., Lander, E.S., Kent, J., Miller, W., Haussler, D.: Identification and Classification of Conserved RNA Secondary Structures in the Human Genome. Public Library of Science. Computational Biology 2(4), 33 (2006)
Rajasekaran, S.: Tree-Adjoining Language Parsing in o(n^6) Time. SIAM Journal on Computing 25(4), 862–873 (1996)
Reeder, J., Giegerich, R.: Design, Implementation and Evaluation of a Practical Pseudoknot Folding Algorithm Based on Thermodynamics. BMC Bioinformatics 5, 104 (2004)
Rivas, E., Eddy, S.: The Language of RNA: a Formal Grammar that Includes Pseudoknots. Bioinformatics 16(4), 334–340 (2000)
Robertson, M.P., Igel, H., Baertsch, R., Haussler, D., Ares Jr., M., Scott, W.G.: The Structure of a Rigorously Conserved RNA Element within the SARS Virus Genome. Public Library of Science: Biology 3(1), 5 (2004)
Sakakibara, Y., Brown, M., Hughey, R., Mian, I.S., Sjolander, K., Underwood, R.C., Haussler, D.: Stochastic Context-Free Grammars for tRNA Modeling. Nucl. Acids Res. 22, 5112–5120 (1994)
Sakakibara, Y.: Grammatical Inference in Bioinformatics. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1051–1062 (2005)
Searls, D.: The Linguistics of DNA. Am. Scient. 80, 579–591 (1992)
Takakura, T., Asakawa, H., Seki, S., Kobayashi, S.: Efficient Tree Grammar Modeling of RNA Secondary Structures from Alignment Data. In: Proceedings of posters of RECOMB 2005, pp. 339–340 (2005)
Tanaka, Y., Hori, T., Tagaya, M., Sakamoto, T., Kurihara, Y., Katahira, M., Uesugi, S.: Imino Proton NMR Analysis of HDV Ribozymes: Nested Double Pseudoknot Structure and Mg²⁺ Ion-Binding Site Close to the Catalytic Core in Solution. Nucl. Acids Res. 30, 766–774 (2002)
Uemura, Y., Hasegawa, A., Kobayashi, S., Yokomori, T.: Tree Adjoining Grammars for RNA Structure Prediction. Theoretical Computer Science 210(2), 277–303 (1999)
Vijay-Shanker, K., Joshi, A.K.: Some Computational Properties of Tree Adjoining Grammars. In: 23rd Meeting of the Association for Computational Linguistics, pp. 82–93 (1985)
Williams, K.P., Bartel, D.P.: The tmRNA Website. Nucl. Acids Res. 26(1), 163–165 (1998)
Williams, K.P.: The tmRNA Website: Invasion by an Intron. Nucl. Acids Res. 30(1), 179–182 (2002)
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Seesi, S.A., Rajasekaran, S., Ammar, R. (2008). Pseudoknot Identification through Learning TAGRNA. In: Chetty, M., Ngom, A., Ahmad, S. (eds) Pattern Recognition in Bioinformatics. PRIB 2008. Lecture Notes in Computer Science, vol 5265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88436-1_12
DOI: https://doi.org/10.1007/978-3-540-88436-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88434-7
Online ISBN: 978-3-540-88436-1
eBook Packages: Computer Science (R0)