Calpain pp 111-120

LabCaS for Ranking Potential Calpain Substrate Cleavage Sites from Amino Acid Sequence

  • Yong-Xian Fan
  • Xiaoyong Pan
  • Yang Zhang
  • Hong-Bin Shen
Part of the Methods in Molecular Biology book series (MIMB, volume 1915)


Calpains are a family of Ca2+-dependent cysteine proteases involved in many important biological processes, in which they selectively cleave substrates at specific sites to regulate the function of the substrate proteins. Our knowledge of calpain function and of the mechanism of substrate cleavage remains limited because experimental determination and validation of calpain binding are usually laborious and expensive. This chapter describes LabCaS, an algorithm designed to predict calpain substrate cleavage sites from amino acid sequences. LabCaS is built on a conditional random field (CRF) statistical model, which is trained on multiple features: amino acid residue preference, solvent accessibility, pairwise alignment similarity score, secondary structure propensity, and physicochemical properties. Large-scale benchmark tests have shown that LabCaS can reliably recognize the cleavage sites for most calpain proteins, with an average AUC score of 0.862. Because it is fast and convenient to use, the protocol should be useful for large-scale calpain-based function annotation of newly sequenced proteins. The online web server of LabCaS is freely available at
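The AUC figure quoted in the abstract measures how well per-residue scores rank true cleavage sites above non-sites. The sketch below illustrates that calculation only. It is not the LabCaS implementation: the function name `ranking_auc`, the toy scores, and the toy labels are hypothetical, and the pairwise (Mann-Whitney) formulation is assumed here as one standard way to compute AUC.

```python
from typing import Sequence


def ranking_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (site, non-site) pairs ranked correctly.

    A tie between a positive and a negative counts as half a correct pair
    (the Mann-Whitney U formulation of AUC).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative position")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical predictor scores for an 8-residue toy sequence
# (label 1 = annotated cleavage site, 0 = non-site). Made-up values.
toy_scores = [0.10, 0.80, 0.35, 0.60, 0.20, 0.90, 0.05, 0.40]
toy_labels = [0, 1, 0, 0, 0, 1, 0, 1]

print(f"toy AUC = {ranking_auc(toy_scores, toy_labels):.3f}")
```

An AUC of 1.0 means every true site outranks every non-site, and 0.5 corresponds to random ranking. The quadratic loop is fine for a single protein. For large benchmarks a sort-based computation would be the usual choice.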

Key words

Protease substrate recognition; Cleavage site prediction; Sequence labeling; Ensemble learning; Calpain; Conditional random fields



We are grateful to Mr. Wallace Chan and Dr. S M Golam Mortuza for proofreading the manuscript. This work was supported in part by the National Natural Science Foundation of China (No. 61462018, 61762026, 61671288, 91530321, 61725302, and 61603161), Guangxi Natural Science Foundation (No. 2017GXNSFAA198278), Guangxi Key Laboratory of Trusted Software (No. kx201403), Guangxi Colleges and Universities Key Laboratory of Intelligent Processing of Computer Images and Graphics (No. GIIP201502), Science and Technology Commission of Shanghai Municipality (No. 16JC1404300, 17JC1403500), and the National Science Foundation (ABI 1564756).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Guangxi Key Laboratory of Trusted Software, Guangxi Colleges and Universities Key Laboratory of Intelligent Processing of Computer Images and Graphics, Guilin University of Electronic Technology, Guilin, China
  2. Department of Medical Informatics, Erasmus MC, Rotterdam, The Netherlands
  3. Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, USA
  4. Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
  5. Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
