Prediction of Bacteriophage Protein Locations Using Deep Neural Networks

  • Muhammad Ali
  • Farzana Afrin Taniza
  • Arefeen Rahman Niloy
  • Sanjay Saha
  • Swakkhar Shatabda (corresponding author)
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 755)


In phage therapy, bacteriophage proteins are used to kill the bacteria that cause infection, and knowing where these proteins are located plays an important role. In this paper, we propose a supervised-learning method to predict the locations of bacteriophage proteins. First, we address the problem of predicting whether a bacteriophage protein is extracellular or located in the host cell. Second, we address the subcellular location prediction problem for host-located proteins, which may reside either in the cell membrane or in the cytoplasm. We apply a deep feed-forward neural network to a standard training dataset and achieve good results on both prediction problems. Our method uses an optimal set of features for classification and achieves 87.7% and 98.5% accuracy on the two prediction problems, improvements of 3.5% and 6.3%, respectively, over the previous state-of-the-art results.
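The two prediction problems described above can be pictured as a pair of binary classifiers. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the architecture, the feature set, or whether the two classifiers are chained, so the layer sizes, the feature count `N_FEATURES`, the 0.5 threshold, the chaining in `predict_location`, and the untrained random weights are all hypothetical placeholders.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(weights, biases, inputs):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random small weights; a real model would learn these from data."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class FeedForwardNet:
    """Feed-forward binary classifier with ReLU hidden layers (untrained)."""

    def __init__(self, layer_sizes, seed):
        rng = random.Random(seed)
        self.layers = [init_layer(a, b, rng)
                       for a, b in zip(layer_sizes, layer_sizes[1:])]

    def predict_proba(self, features):
        activations = list(features)
        for weights, biases in self.layers[:-1]:
            activations = relu(dense(weights, biases, activations))
        weights, biases = self.layers[-1]
        return sigmoid(dense(weights, biases, activations)[0])


def predict_location(features, host_net, membrane_net, threshold=0.5):
    """Assumed chaining: problem 1 (extracellular vs. host), then problem 2."""
    if host_net.predict_proba(features) < threshold:
        return "extracellular"
    if membrane_net.predict_proba(features) >= threshold:
        return "host cell: membrane"
    return "host cell: cytoplasm"


N_FEATURES = 8  # hypothetical size of the selected feature vector
host_net = FeedForwardNet([N_FEATURES, 16, 8, 1], seed=0)
membrane_net = FeedForwardNet([N_FEATURES, 16, 8, 1], seed=1)

example = [0.1 * i for i in range(N_FEATURES)]  # placeholder feature values
label = predict_location(example, host_net, membrane_net)
print(label)
```

Because the weights here are random, the printed label carries no biological meaning; the sketch only illustrates how a forward pass and the two decisions fit together.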


Keywords: Supervised learning; Deep neural networks; Feature selection; Protein subcellular localization



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Muhammad Ali (1)
  • Farzana Afrin Taniza (1)
  • Arefeen Rahman Niloy (1)
  • Sanjay Saha (1)
  • Swakkhar Shatabda (1), corresponding author
  1. Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh
