Splice Junction Prediction in DNA Sequence Using Multilayered RNN Model

  • Rahul Sarkar
  • Chandra Churh Chatterjee
  • Sayantan Das
  • Dhiman Mondal
Conference paper
Part of the Learning and Analytics in Intelligent Systems book series (LAIS, volume 3)


Genes are segments of a DNA sequence responsible for protein synthesis. Splicing is a post-transcriptional modification that allows a single gene to give rise to multiple proteins. Classifying splice junctions remains a challenging task in bioinformatics, and an important one, because the synthesized proteins are responsible for the unique characteristics observed in different living organisms. In this study, we propose a state-of-the-art algorithm for splice junction prediction from DNA sequences using a multilayered stacked RNN model, which achieves an overall accuracy of 99.95% and an AUROC score of 1.0 for exon-intron, intron-exon, and no-junction classification.
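The general idea of feeding a DNA window through stacked recurrent layers and producing a three-way prediction can be sketched as follows. This is a minimal illustration, not the architecture from the paper: the one-hot encoding, the plain tanh recurrent cells, the layer sizes, the example sequence, and all function names (`one_hot`, `init_stacked_rnn`, `forward`) are assumptions made for the example, and the weights are random and untrained, so the output carries no biological meaning.

```python
import math
import random

NUCLEOTIDES = "ACGT"
CLASSES = ("exon-intron", "intron-exon", "no-junction")


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows; unknown symbols stay all-zero."""
    rows = []
    for base in seq.upper():
        row = [0.0] * len(NUCLEOTIDES)
        idx = NUCLEOTIDES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows


def rand_matrix(rng, rows, cols, scale=0.1):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def init_stacked_rnn(input_dim, hidden_dims, n_classes, seed=0):
    """Build one parameter set per recurrent layer plus a linear output head."""
    rng = random.Random(seed)
    layers, prev = [], input_dim
    for h in hidden_dims:
        layers.append({"W_x": rand_matrix(rng, prev, h),
                       "W_h": rand_matrix(rng, h, h),
                       "b": [0.0] * h})
        prev = h
    head = {"W": rand_matrix(rng, prev, n_classes), "b": [0.0] * n_classes}
    return layers, head


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def forward(seq, layers, head):
    """Pass the sequence through each recurrent layer, then softmax the last state."""
    for layer in layers:
        h = [0.0] * len(layer["b"])
        outputs = []
        for x_t in seq:
            a = vec_mat(x_t, layer["W_x"])
            r = vec_mat(h, layer["W_h"])
            h = [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, layer["b"])]
            outputs.append(h)
        seq = outputs  # the full hidden sequence feeds the next layer (stacking)
    logits = [l + b for l, b in zip(vec_mat(seq[-1], head["W"]), head["b"])]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


layers, head = init_stacked_rnn(4, hidden_dims=(8, 8), n_classes=len(CLASSES))
probs = forward(one_hot("CCAGCTGCATCACAGGAGGCCAGCGAGCAGG"), layers, head)
predicted = CLASSES[probs.index(max(probs))]
```

Here `probs` holds one probability per class and `predicted` is the most likely label. In practice the weights would be fitted to labeled sequences, and a gated cell such as an LSTM or GRU would likely replace the plain tanh cell.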


Codon, Exon, Intron, mRNA, Nucleotide, Splice junction, Transcription



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Rahul Sarkar (1), corresponding author
  • Chandra Churh Chatterjee (1)
  • Sayantan Das (2)
  • Dhiman Mondal (1)

  1. Jalpaiguri Government Engineering College, Jalpaiguri, India
  2. Nil Ratan Sircar Medical College and Hospital, Kolkata, India
