Abstract
MicroRNAs (miRNAs) are small, single-stranded, noncoding RNAs of about 22 nucleotides that control gene expression at the posttranscriptional level through translational inhibition, degradation, deadenylation, or destabilization of their target mRNAs. Although hundreds of miRNAs have been identified in various species, many more may remain undiscovered. Discovering new miRNA genes is therefore an important step toward understanding miRNA-mediated posttranscriptional regulation. Biological approaches to identifying miRNA genes are limited in their ability to detect rare miRNAs, and they are further restricted to the tissues examined and to the developmental stage of the organism under study. These limitations have led to the development of sophisticated computational approaches that attempt to identify candidate miRNAs in silico. In this chapter, we discuss computational problems in miRNA prediction studies and review some of the many machine learning methods that have been applied to address them.
Acknowledgements
This study was in part supported by an award received from the Turkish Academy of Sciences for outstanding young scientists (TUBA GEBIP, http://www.tuba.gov.tr).
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Saçar, M.D., Allmer, J. (2014). Machine Learning Methods for MicroRNA Gene Prediction. In: Yousef, M., Allmer, J. (eds) miRNomics: MicroRNA Biology and Computational Analysis. Methods in Molecular Biology, vol 1107. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-748-8_10
DOI: https://doi.org/10.1007/978-1-62703-748-8_10
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-747-1
Online ISBN: 978-1-62703-748-8
eBook Packages: Springer Protocols