Abstract
Elucidating the biological function of proteins is a fundamental problem in molecular biology and bioinformatics. Conventionally, protein function is annotated based on homology using sequence similarity search tools such as BLAST and FASTA. These methods perform well when obvious homologs exist for a query sequence; otherwise, they provide no functional information. As a result, the functions of many genes in newly sequenced genomes remain unknown and await functional interpretation. Here, we introduce web servers for two function prediction methods that effectively use distantly related sequences to improve function annotation coverage and accuracy: Protein Function Prediction (PFP) and Extended Similarity Group (ESG). These two methods have been tested extensively in various benchmark studies and ranked among the top methods in community-based assessments of computational function annotation, including the Critical Assessment of Function Annotation (CAFA) in 2010–2011 (CAFA1) and 2013–2014 (CAFA2). Both servers are equipped with user-friendly visualizations that intuitively illustrate the relationships among predicted Gene Ontology (GO) terms. In addition to PFP and ESG, we also introduce NaviGO, a server for the interactive analysis of GO annotations of proteins. All the servers are available at http://kiharalab.org/software.php.
Acknowledgments
This work was supported in part by the National Institutes of Health (R01GM097528) and the National Science Foundation (IIS1319551, DBI1262189, IOS1127027).
Copyright information
© 2017 Springer Science+Business Media LLC
Cite this protocol
Wei, Q., McGraw, J., Khan, I., Kihara, D. (2017). Using PFP and ESG Protein Function Prediction Web Servers. In: Kihara, D. (eds) Protein Function Prediction. Methods in Molecular Biology, vol 1611. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7015-5_1
DOI: https://doi.org/10.1007/978-1-4939-7015-5_1
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7013-1
Online ISBN: 978-1-4939-7015-5
eBook Packages: Springer Protocols