Application of Deep Architecture in Bioinformatics

  • Sagnik Sen
  • Rangan Das
  • Swaraj Dasgupta
  • Ujjwal Maulik
Part of the Studies in Big Data book series (SBD, volume 68)


Recent discoveries in biology have transformed it into a data-rich domain. This has invited many machine learning applications, in particular deep learning, a set of methodologies that has evolved rapidly over the last couple of decades. Deep learning (DL) is used extensively in many domains, including bioinformatics, for the analysis and classification of biomedical imaging data and omics sequence data and for biomedical signal processing. It has been used to predict protein structures, uncover gene expression regulation, classify anomalies, and understand the functions of the brain. Basic deep neural networks, which contain stacked layers of non-linear processing units, are quite versatile and have been used in almost every domain of bioinformatics. Convolutional neural networks have proved effective on image data and are used to classify biomedical images such as histopathology images, cell images, X-ray images, and magnetic resonance images. They have been used for anomaly classification, recognition, and segmentation. In areas that deal with sequential data, such as protein structure prediction and brain decoding, recurrent neural networks have been used extensively. Besides these, many new architectures are currently being explored to address some of the common drawbacks of deep learning. Fuzzy systems have been incorporated into deep learning in an attempt to improve model performance, and multimodal learning is enabling modern architectures to work with heterogeneous data.


Deep architecture; Bioinformatics; Biomedical images; Convolutional neural network; Recurrent neural network


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Sagnik Sen (1)
  • Rangan Das (1)
  • Swaraj Dasgupta (1)
  • Ujjwal Maulik (1)
  1. Department of Computer Science and Engineering, Jadavpur University, Jadavpur, Kolkata, India
