Abstract
Protein O-GlcNAcylation on serine and threonine residues is a significant posttranslational modification. Experimental techniques can uncover only a small portion of O-GlcNAcylation sites. Several computational algorithms have been proposed as necessary auxiliary tools to identify potential O-GlcNAcylation sites. This chapter discusses the metrics and procedures used to assess prediction tools and surveys six computational tools for the prediction of protein O-GlcNAcylation sites. Analyses of these tools using an independent test dataset indicated the advantages and disadvantages of the six existing prediction methods. We also discuss the challenges that may be faced while developing novel predictors in the future.
Acknowledgments
This work was supported by the Fundamental Research Funds for the Central Universities (3132016306, 3132017048).
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Jia, C., Zuo, Y. (2018). Computational Prediction of Protein O-GlcNAc Modification. In: Huang, T. (eds) Computational Systems Biology. Methods in Molecular Biology, vol 1754. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7717-8_14
DOI: https://doi.org/10.1007/978-1-4939-7717-8_14
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7716-1
Online ISBN: 978-1-4939-7717-8
eBook Packages: Springer Protocols