Abstract
MicroRNAs (miRNAs) are noncoding RNAs of ~22 nucleotides that play versatile regulatory roles in multicellular organisms. Because cloning-based methods for miRNA identification are biased towards abundant miRNAs, computational approaches provide a useful complement for identifying miRNAs whose expression is tightly restricted to specific tissues or developmental times. In this paper, we propose a novel Support Vector Machine (SVM) based detector, named MiR-PD, to identify pre-miRNAs in plants. The classifier is built on twelve features of pre-miRNAs: five global features and seven sub-structure features. Trained on 790 plant pre-miRNAs and 7,900 pseudo pre-miRNAs, MiR-PD achieves a five-fold cross-validation accuracy of 96.43%. Tested on 441 newly identified plant pre-miRNAs and 62,883 pseudo pre-miRNAs, MiR-PD reports an accuracy of 99.71% with 77.55% sensitivity and 99.87% specificity. These results suggest that the detector can feasibly be applied genome-wide to identify novel miRNAs in plants, especially species-specific ones, without relying on phylogenetic conservation.
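The three test-set figures above are linked by simple arithmetic, and it may help to see how they fit together. The sketch below is illustrative only: it back-calculates approximate confusion-matrix counts from the rounded sensitivity and specificity, which is an assumption, since the abstract reports rates rather than counts. The function and variable names are hypothetical and not part of MiR-PD.

```python
# Figures reported in the abstract for the independent test set.
N_POS = 441        # real plant pre-miRNAs
N_NEG = 62_883     # pseudo pre-miRNAs
SENSITIVITY = 0.7755
SPECIFICITY = 0.9987


def counts_from_rates(n_pos, n_neg, sensitivity, specificity):
    """Approximate confusion-matrix counts implied by rounded rates.

    Assumption: the true counts are the nearest integers to rate * class size.
    """
    tp = round(sensitivity * n_pos)   # real pre-miRNAs correctly detected
    tn = round(specificity * n_neg)   # pseudo hairpins correctly rejected
    return {"tp": tp, "fn": n_pos - tp, "tn": tn, "fp": n_neg - tn}


def accuracy(c):
    """Fraction of all test sequences that were classified correctly."""
    return (c["tp"] + c["tn"]) / (c["tp"] + c["fn"] + c["tn"] + c["fp"])


counts = counts_from_rates(N_POS, N_NEG, SENSITIVITY, SPECIFICITY)
overall = accuracy(counts)

# Baseline for comparison: a classifier that labels every hairpin as pseudo.
all_negative_baseline = N_NEG / (N_POS + N_NEG)

print(counts)
print(f"accuracy = {overall:.4f}, all-negative baseline = {all_negative_baseline:.4f}")
```

Under these assumptions the recomputed accuracy rounds to the reported 99.71%. Because pseudo pre-miRNAs outnumber real ones by more than 140 to 1 in this test set, labeling everything as pseudo would already score about 99.30%, so sensitivity and specificity are the more informative figures.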
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Y., Jin, C., Zhou, M., Zhou, A. (2012). An SVM-Based Approach to Discover MicroRNA Precursors in Plant Genomes. In: Cao, L., Huang, J.Z., Bailey, J., Koh, Y.S., Luo, J. (eds) New Frontiers in Applied Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 7104. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28320-8_26
DOI: https://doi.org/10.1007/978-3-642-28320-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28319-2
Online ISBN: 978-3-642-28320-8
eBook Packages: Computer Science (R0)