Abstract
Discriminating circular RNA from long non-coding RNA is important for understanding its role in different biological processes and for disease prediction and cure. Identifying circular RNA through manual laboratory work is expensive, time-consuming and prone to errors. The development of computational methodologies for identifying circular RNA is an active area of research. State-of-the-art circular RNA identification methodologies rely on handcrafted features, which not only enlarge the feature space but also introduce irrelevant and redundant features. This paper proposes an end-to-end deep learning-based framework named CircNet, which does not require any handcrafted features. It takes a raw RNA sequence as input and uses encoder–decoder-based convolutional operations to learn a lower-dimensional latent representation. This latent representation is then passed to another convolutional architecture that extracts discriminative features, followed by a classification layer. We performed extensive experimentation to highlight the regions of the genome sequence that preserve the most important information for identifying circular RNAs. CircNet outperforms state-of-the-art approaches by a considerable margin of 10.29% in terms of F1 measure.
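To make the pipeline described above more concrete, the following is a minimal sketch, not the CircNet implementation. It shows how a raw RNA sequence might be one-hot encoded, how a strided 1D convolution shortens that sequence (the basic operation by which a convolutional encoder reaches a lower-dimensional representation), and how the F1 measure is computed for a binary classifier. The alphabet `ACGU`, the helper names `one_hot`, `conv1d` and `f1_score`, and the kernel and stride values are all assumptions chosen for illustration; none of them is taken from the paper.

```python
# Illustrative sketch only: the alphabet, kernel and stride are assumptions,
# not values taken from the CircNet paper.
ALPHABET = "ACGU"  # assumed RNA alphabet; any other symbol maps to all zeros


def one_hot(sequence):
    """Encode a raw RNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in sequence.upper()]


def conv1d(inputs, kernel, stride=1):
    """Valid 1D convolution (cross-correlation) with one output channel.

    inputs: list of L vectors, each with C channels.
    kernel: list of K vectors, each with C weights.
    Returns floor((L - K) / stride) + 1 scalars, so stride > 1 shortens
    the sequence.
    """
    width = len(kernel)
    outputs = []
    for start in range(0, len(inputs) - width + 1, stride):
        total = 0.0
        for offset in range(width):
            for channel, weight in enumerate(kernel[offset]):
                total += inputs[start + offset][channel] * weight
        outputs.append(total)
    return outputs


def f1_score(true_labels, predicted_labels, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical usage: a hand-set kernel that counts 'G' in windows of width 2.
encoded = one_hot("GGACGU")
g_counter = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
compressed = conv1d(encoded, g_counter, stride=2)
```

In a trained network the kernel weights would be learned rather than set by hand, and many kernels would be stacked in layers; the hand-set kernel here only makes the length reduction from six positions to three easy to follow.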
Funding
Sartorius Artificial Intelligence Lab.
Ethics declarations
Conflicts of interest
The corresponding author, on behalf of all authors, declares that there is no conflict of interest.
Cite this article
Stricker, M., Asim, M.N., Dengel, A. et al. CircNet: an encoder–decoder-based convolution neural network (CNN) for circular RNA identification. Neural Comput & Applic 34, 11441–11452 (2022). https://doi.org/10.1007/s00521-020-05673-1
DOI: https://doi.org/10.1007/s00521-020-05673-1