Abstract
Genes are segments of a DNA sequence responsible for protein synthesis. Splicing is a post-transcriptional modification that allows multiple proteins to be synthesized from a single gene. Classifying splice junctions remains a challenging task in bioinformatics, and an important one, because the synthesized proteins are responsible for the unique characteristics observed in different living organisms. In this study, we propose a state-of-the-art algorithm for splice junction prediction from DNA sequences using a multilayered stacked RNN model, which achieves an overall accuracy of 99.95% and an AUROC score of 1.0 for exon-intron, intron-exon, and no-junction classification.
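As a rough illustration of what a stacked RNN classifier over a DNA sequence involves, the sketch below one-hot encodes a sequence, passes it through two recurrent layers, and maps the final hidden state to probabilities for the three classes named in the abstract. It is not the authors' implementation: the plain tanh (Elman) cells, the layer sizes, the random untrained weights, the example sequence, and all function names are assumptions made for illustration only.

```python
import math
import random

NUCLEOTIDES = "ACGT"
CLASSES = ("exon-intron", "intron-exon", "no-junction")


def one_hot(sequence):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES]
            for base in sequence.upper()]


def init_layer(input_size, hidden_size, rng):
    """Create small random weights for one simple recurrent layer (assumed sizes)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"W_in": matrix(hidden_size, input_size),
            "W_rec": matrix(hidden_size, hidden_size),
            "bias": [0.0] * hidden_size}


def run_layer(layer, inputs):
    """Return the hidden state at every time step for one recurrent layer."""
    hidden = [0.0] * len(layer["bias"])
    outputs = []
    for x in inputs:
        # New state depends on the current input and the previous hidden state.
        hidden = [
            math.tanh(sum(w * v for w, v in zip(w_in, x))
                      + sum(w * h for w, h in zip(w_rec, hidden))
                      + b)
            for w_in, w_rec, b in zip(layer["W_in"], layer["W_rec"], layer["bias"])
        ]
        outputs.append(hidden)
    return outputs


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(sequence, layers, readout):
    """Feed the sequence through each stacked layer, then classify the final state."""
    states = one_hot(sequence)
    for layer in layers:
        # The output sequence of one layer is the input sequence of the next.
        states = run_layer(layer, states)
    final_state = states[-1]
    scores = [sum(w * h for w, h in zip(row, final_state)) for row in readout]
    return softmax(scores)


rng = random.Random(0)
layers = [init_layer(4, 8, rng), init_layer(8, 8, rng)]  # two stacked layers
readout = [[rng.uniform(-0.1, 0.1) for _ in range(8)] for _ in CLASSES]
probabilities = predict("CCAGGTAAGT", layers, readout)  # hypothetical input
for name, p in zip(CLASSES, probabilities):
    print(f"{name}: {p:.3f}")
```

Because the weights are untrained, the printed probabilities carry no biological meaning; the sketch only shows how stacking works, with each layer consuming the full hidden-state sequence of the layer below it. A trained model would learn these weights from labeled junction data.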
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Sarkar, R., Chatterjee, C.C., Das, S., Mondal, D. (2020). Splice Junction Prediction in DNA Sequence Using Multilayered RNN Model. In: Satapathy, S.C., Raju, K.S., Shyamala, K., Krishna, D.R., Favorskaya, M.N. (eds) Advances in Decision Sciences, Image Processing, Security and Computer Vision. Learning and Analytics in Intelligent Systems, vol 3. Springer, Cham. https://doi.org/10.1007/978-3-030-24322-7_6
DOI: https://doi.org/10.1007/978-3-030-24322-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24321-0
Online ISBN: 978-3-030-24322-7
eBook Packages: Intelligent Technologies and Robotics (R0)