Machine Learning Methods for MicroRNA Gene Prediction

  • Protocol

Part of the book series: Methods in Molecular Biology (MIMB, volume 1107)

Abstract

MicroRNAs (miRNAs) are small, single-stranded, noncoding RNAs of about 22 nucleotides in length that control gene expression at the posttranscriptional level through translational inhibition, degradation, deadenylation, or destabilization of their target mRNAs. Although hundreds of miRNAs have been identified in various species, many more may remain unknown. Discovering new miRNA genes is therefore an important step toward understanding miRNA-mediated posttranscriptional regulation. Experimental approaches to identifying miRNA genes can be limited in their ability to detect rare miRNAs and are further restricted to the tissues examined and the developmental stage of the organism under examination. These limitations have led to the development of sophisticated computational approaches that attempt to identify candidate miRNAs in silico. In this chapter, we discuss computational problems in miRNA prediction studies and review some of the many machine learning methods that have been applied to address them.

Acknowledgements

This study was in part supported by an award received from the Turkish Academy of Sciences for outstanding young scientists (TUBA GEBIP, http://www.tuba.gov.tr).

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Saçar, M.D., Allmer, J. (2014). Machine Learning Methods for MicroRNA Gene Prediction. In: Yousef, M., Allmer, J. (eds) miRNomics: MicroRNA Biology and Computational Analysis. Methods in Molecular Biology, vol 1107. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-748-8_10

  • DOI: https://doi.org/10.1007/978-1-62703-748-8_10

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-747-1

  • Online ISBN: 978-1-62703-748-8

  • eBook Packages: Springer Protocols
