CircNet: an encoder–decoder-based convolution neural network (CNN) for circular RNA identification

Abstract

Discriminating circular RNA from long non-coding RNA is important for understanding its role in different biological processes and in disease prediction and cure. Identifying circular RNA through manual laboratory work is expensive, time-consuming and prone to errors, so the development of computational methodologies for circular RNA identification is an active area of research. State-of-the-art circular RNA identification methodologies rely on handcrafted features, which not only enlarge the feature space but also introduce irrelevant and redundant features. This paper proposes an end-to-end deep learning-based framework named CircNet, which does not require any handcrafted features. It takes a raw RNA sequence as input and uses encoder–decoder-based convolutional operations to learn a lower-dimensional latent representation. This latent representation is then passed to another convolutional architecture that extracts discriminative features, followed by a classification layer. We performed extensive experimentation to highlight the regions of the genome sequence that preserve the most important information for identifying circular RNAs. CircNet outperforms state-of-the-art approaches by a considerable margin of 10.29% in terms of F1 measure.
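To make the first stage of this pipeline concrete, the following is a minimal sketch of how a raw RNA sequence could be turned into a numeric input and shortened by one strided convolution. It is not the CircNet architecture: the one-hot encoding, the kernel width, the stride, the number of filters and the random weights are all assumptions chosen for illustration, and the decoder, the second convolutional stage and the classifier are omitted.

```python
import random

# Assumed 4-letter RNA alphabet; any other symbol (e.g. N) maps to all zeros.
ALPHABET = "ACGU"


def one_hot(seq):
    """Encode an RNA string as a list of 4-dimensional one-hot vectors."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper().replace("T", "U"):  # treat DNA-style T as U
        vec = [0.0] * len(ALPHABET)
        if base in index:
            vec[index[base]] = 1.0
        encoded.append(vec)
    return encoded


def conv1d_relu(x, kernels, stride):
    """Valid (unpadded) 1D convolution followed by ReLU.

    x:       list of L positions, each a list of n_in floats
    kernels: list of n_out kernels, each shaped [width][n_in]
    Returns (L - width) // stride + 1 positions with n_out floats each.
    """
    width = len(kernels[0])
    out = []
    for start in range(0, len(x) - width + 1, stride):
        window = x[start:start + width]
        row = []
        for kernel in kernels:
            total = sum(w * v
                        for k_row, x_row in zip(kernel, window)
                        for w, v in zip(k_row, x_row))
            row.append(max(0.0, total))  # ReLU
        out.append(row)
    return out


def random_kernels(n_out, width, n_in, rng):
    """Hypothetical untrained filters; a real model would learn these."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
             for _ in range(width)]
            for _ in range(n_out)]


if __name__ == "__main__":
    rng = random.Random(0)
    seq = "AUGGCUACGUAGCUAGCUAG"  # made-up 20-nucleotide example
    x = one_hot(seq)
    kernels = random_kernels(n_out=2, width=4, n_in=4, rng=rng)
    latent = conv1d_relu(x, kernels, stride=2)
    print(len(x), "x", len(x[0]), "->", len(latent), "x", len(latent[0]))
```

With these illustrative settings, the 20 x 4 one-hot input is reduced to a 9 x 2 feature map, which shows how a strided convolution can yield a lower-dimensional representation without any handcrafted features.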

Fig. 1 (adapted from [7])

Figs. 2–6

References

  1. Mattick JS, Makunin IV (2006) Non-coding RNA. Hum Mol Genet 15(suppl-1):R17–R29
  2. Holdt LM, Kohlmaier A, Teupser D (2018) Molecular roles and function of circular RNAs in eukaryotic cells. Cell Mol Life Sci 75(6):1071–1098
  3. Rossi E, Monti F, Bronstein M, Liò P (2019) ncRNA classification with graph convolutional networks. arXiv preprint arXiv:1905.06515
  4. Yao D, Zhang L, Zheng M, Sun X, Yan L, Liu P (2018) Circ2Disease: a manually curated database of experimentally validated circRNAs in human disease. Sci Rep 8(1):1–6
  5. Razzak MI, Imran M, Xu G (2020) Big data analytics for preventive medicine. Neural Comput Appl 32(9):4417–4451
  6. Rehman A, Naz S, Razzak I (2020) Leveraging big data analytics in healthcare enhancement: trends, challenges and opportunities. arXiv preprint arXiv:2004.09010
  7. Amin N, McGrath A, Chen Y-PP (2019) Evaluation of deep learning in non-coding RNA classification. Nat Mach Intell 1(5):246–256
  8. Pan X, Xiong K (2015) PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features. Mol BioSyst 11(8):2219–2226
  9. Wang Z, Lei X, Fang-Xiang W (2019) Identifying cancer-specific circRNA-RBP binding sites based on deep learning. Molecules 24(22):4035
  10. Lee ECS, Elhassan SAM, Lim GPL, Kok WH, Tan SW, Leong EN, Tan SH, Chan EWL, Bhattamisra SK, Rajendran R et al (2019) The roles of circular RNAs in human development and diseases. Biomed Pharmacother 111:198–208
  11. Chaabane M, Williams RM, Stephens AT, Park JW (2020) circdeep: deep learning approach for circular RNA classification from other long non-coding RNA. Bioinformatics 36(1):73–80
  12. Huang S, Yang B, Chen BJ, Bliim N, Ueberham U, Arendt T, Janitz M (2017) The emerging role of circular RNAs in transcriptome regulation. Genomics 109(5–6):401–407
  13. Tilgner H, Knowles DG, Johnson R, Davis CA, Chakrabortty S, Djebali S, Curado J, Snyder M, Gingeras TR, Guigó R (2012) Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Genome Res 22(9):1616–1625
  14. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ (2008) Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet 40(12):1413
  15. Lasda E, Parker R (2014) Circular RNAs: diversity of form and function. RNA 20(12):1829–1842
  16. Zhang Z, Yang T, Xiao J (2018) Circular RNAs: promising biomarkers for human diseases. EBioMedicine 34:267–274
  17. Bachmayr-Heyda A, Reiner AT, Auer K, Sukhbaatar N, Aust S, Bachleitner-Hofmann T, Mesteri I, Grunt TW, Zeillinger R, Pils D (2015) Correlation of circular RNA abundance with proliferation—exemplified with colorectal and ovarian cancer, idiopathic lung fibrosis and normal human tissues. Sci Rep 5(1):1–10
  18. Fiannaca A, La Rosa M, La Paglia L, Rizzo R, Urso A (2017) nRC: non-coding RNA classifier based on structural features. BioData Min 10(1):27
  19. Zhang X, Wang J, Li J, Chen W, Liu C (2018) CRlncRC: a machine learning-based method for cancer-related long noncoding RNA identification using integrated features. BMC Med Genomics 11(6):99–112
  20. Holdt LM, Kohlmaier A, Teupser D (2018) Circular RNAs as therapeutic agents and targets. Front Physiol 9:1262
  21. Li P, Chen S, Chen H, Mo X, Li T, Shao Y, Xiao B, Guo J (2015) Using circular RNA as a novel type of biomarker in the screening of gastric cancer. Clin Chim Acta 444:132–136
  22. Zaghlool A, Ameur A, Wu C, Westholm JO, Niazi A, Manivannan M, Bramlett K, Nilsson M, Feuk L (2018) Expression profiling and in situ screening of circular RNAs in human tissues. Sci Rep 8(1):1–12
  23. Zirkel A, Papantonis A (2018) Detecting circular RNAs by RNA fluorescence in situ hybridization. In: Circular RNAs. Springer, pp 69–75
  24. Xia S, Feng J, Lei L, Jun H, Xia L, Jun Wang Yu, Xiang LL, Zhong S, Han L et al (2017) Comprehensive characterization of tissue-specific circular RNAs in the human and mouse genomes. Briefings Bioinform 18(6):984–992
  25. Chen L, Zhang Y-H, Huang G, Pan X, Wang SP, Huang T, Cai Y-D (2018) Discriminating cirRNAs from other lncRNAs using a hierarchical extreme learning machine (H-ELM) algorithm with feature selection. Mol Genet Genomics 293(1):137–149
  26. Angermueller C, Lee HJ, Reik W, Stegle O (2017) DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning. Genome Biol 18(1):67
  27. Wang Y, Liu T, Dong X, Shi H, Zhang C, Mo Y-Y, Wang Z (2016) Predicting DNA methylation state of CPG dinucleotide using genome topological features and deep networks. Sci Rep 6:19598
  28. Di Gangi M, Bosco GL, Rizzo R (2018) Deep learning architectures for prediction of nucleosome positioning from sequences data. BMC Bioinform 19(14):418
  29. Tian K, Shao M, Wang Y, Guan J, Zhou S (2016) Boosting compound-protein interaction prediction by deep learning. Methods 110:64–72
  30. Kwon S, Yoon S (2017) Deepcci: end-to-end deep learning for chemical–chemical interaction prediction. In: Proceedings of the 8th ACM international conference on bioinformatics, computational biology, and health informatics, pp 203–212
  31. Singh R, Lanchantin J, Robins G, Qi Y (2016) Deepchrome: deep-learning for predicting gene expression from histone modifications. Bioinformatics 32(17):i639–i648
  32. Zhou J, Troyanskaya OG (2015) Predicting effects of noncoding variants with deep learning-based sequence model. Nat Methods 12(10):931–934
  33. Asim MN, Malik MI, Dengel A, Ahmed S (2019) A robust and precise convnet for small non-coding RNA classification (RPC-SNRC). arXiv preprint arXiv:1912.11356
  34. Cho K, Van Merriënboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder-decoder approaches. arXiv preprint arXiv:1409.1259
  35. Yasrab R, Naijie G, Zhang X (2017) An encoder-decoder based convolution neural network (CNN) for future advanced driver assistance system (ADAS). Appl Sci 7(4):312
  36. Chen X, Han P, Zhou T, Guo X, Song X, Li Y (2016) circRNADb: a comprehensive database for human circular RNAs with protein-coding annotations. Sci Rep 6(1):1–6
  37. Frankish A, Diekhans M, Ferreira A-M, Johnson R, Jungreis I, Loveland J, Mudge JM, Sisu C, Wright J, Armstrong J et al (2019) Gencode reference annotation for the human and mouse genomes. Nucleic Acids Res 47(D1):D766–D773
  38. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D (2002) The human genome browser at UCSC. Genome Res 12(6):996–1006
  39. Ivanov A, Memczak S, Wyler E, Torti F, Porath HT, Orejuela MR, Piechotta M, Levanon EY, Landthaler M, Dieterich C et al (2015) Analysis of intron sequences reveals hallmarks of circular RNA biogenesis in animals. Cell Rep 10(2):170–177
  40. Wang J, Wang L (2019) Deep learning of the back-splicing code for circular RNA formation. Bioinformatics 35(24):5235–5242
  41. Straube S, Krell MM (2014) How to evaluate an agent’s behavior to infrequent events? Reliable performance estimation insensitive to class distribution. Front Comput Neurosci 8:43
  42. Brzezinski D, Stefanowski J (2017) Prequential AUC: properties of the area under the ROC curve for data streams with concept drift. Knowl Inf Syst 52(2):531–562
  43. Zhang K, Pan X, Yang Y, Shen H-B (2019) CRIP: predicting circRNA-RBP-binding sites using a codon-based encoding and hybrid deep neural networks. RNA 25(12):1604–1615
  44. Jia C, Yue B, Chen J, Leier A, Li F, Song J (2020) PASSION: an ensemble neural network approach for identifying the binding sites of RBPs on circRNAs. Bioinformatics 36(15):4276–4282. https://doi.org/10.1093/bioinformatics/btaa522
  45. Javad Z, Omid Y, Morteza M-N, Reza E, Ali M-N (2013) PPievo: protein–protein interaction prediction from PSSM based evolutionary information. Genomics 102(4):237–242
  46. Halder AK, Dutta P, Kundu M, Basu S, Nasipuri M (2018) Review of computational methods for virus–host protein interaction prediction: a case study on novel ebola–human interactions. Briefings Funct Genomics 17(6):381–391

Funding

Sartorius Artificial Intelligence Lab.

Author information

Corresponding author

Correspondence to Muhammad Nabeel Asim.

Ethics declarations

Conflicts of interest

The corresponding author, on behalf of all authors, declares that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Stricker, M., Asim, M.N., Dengel, A. et al. CircNet: an encoder–decoder-based convolution neural network (CNN) for circular RNA identification. Neural Comput & Applic (2021). https://doi.org/10.1007/s00521-020-05673-1

Keywords

  • Circular RNA classification
  • Machine learning
  • Deep learning
  • Autoencoder