An Empirical Analysis of Instance-Based Transfer Learning Approach on Protease Substrate Cleavage Site Prediction

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 748)

Abstract

Classical machine learning algorithms assume that the labeled data all come from the same domain. Transfer learning, in contrast, uses knowledge acquired from auxiliary domains to improve predictive capability in a current domain whose data distribution differs. Over the last few decades, a significant amount of work has been done on domain adaptation and cross-domain knowledge transfer in bioinformatics. Computational classification of protease cleavage sites is important for inhibitor and drug design. Matrix metalloproteases (MMPs) are one such family of proteases with a crucial role in disease processes. However, computational prediction of MMP substrate cleavage sites remains challenging because very few experimentally verified sites are available. The objective of this paper is to explore cross-domain learning for the classification of protease substrate cleavage sites, so that the scarcity of cleavage sites in one domain can be compensated for by knowledge from the other available domains. To achieve this objective, we employed the TrAdaBoost algorithm and two of its variants, dynamic TrAdaBoost and multisource TrAdaBoost, on the MMP dataset available at PROSPER. The robustness and suitability of the TrAdaBoost algorithms for substrate site identification were validated by rigorous experiments designed to compare the performance of the learners. The experimental results demonstrate the potential of the dynamic TrAdaBoost algorithm on the protease dataset: it outperforms both the basic TrAdaBoost and its other variants.
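
For readers unfamiliar with instance-based transfer, the reweighting step at the core of TrAdaBoost can be sketched as follows. This is a minimal illustration based on the general published description of the algorithm (Dai et al., 2007), not the experimental code of this paper. The function name `tradaboost_update`, its arguments, and the clipping of the error rate are assumptions made for the example.

```python
import math


def tradaboost_update(w_src, w_tgt, miss_src, miss_tgt, n_rounds):
    """One TrAdaBoost-style reweighting step (illustrative sketch only).

    w_src, w_tgt   : current weights of source- and target-domain instances
    miss_src/tgt   : 1 if the current weak learner misclassified the
                     instance, 0 otherwise
    n_rounds       : total number of boosting rounds planned
    """
    n = len(w_src)
    # Weighted error is measured on the target domain only.
    eps = sum(w * m for w, m in zip(w_tgt, miss_tgt)) / sum(w_tgt)
    # Assumption: clip eps so beta_t stays well defined (needs eps < 0.5).
    eps = min(max(eps, 1e-10), 0.499)
    beta_t = eps / (1.0 - eps)
    # Fixed shrink factor for source instances.
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_rounds))
    # Misclassified source instances lose weight (likely dissimilar to target).
    new_src = [w * beta ** m for w, m in zip(w_src, miss_src)]
    # Misclassified target instances gain weight, as in AdaBoost.
    new_tgt = [w * beta_t ** (-m) for w, m in zip(w_tgt, miss_tgt)]
    return new_src, new_tgt, eps


new_src, new_tgt, eps = tradaboost_update(
    [1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0],
    [1, 0, 0, 0], [1, 0, 0, 0], n_rounds=10,
)
```

In this toy call, the misclassified source instance is down-weighted while the misclassified target instance is up-weighted, which is how source data that disagrees with the target domain gradually loses influence. The dynamic and multisource variants named in the abstract modify this update and are not shown here.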

Author information

Corresponding author

Correspondence to Deepak Singh.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Singh, D., Sisodia, D.S., Singh, P. (2019). An Empirical Analysis of Instance-Based Transfer Learning Approach on Protease Substrate Cleavage Site Prediction. In: Tanveer, M., Pachori, R. (eds) Machine Intelligence and Signal Analysis. Advances in Intelligent Systems and Computing, vol 748. Springer, Singapore. https://doi.org/10.1007/978-981-13-0923-6_6