Efficient Computational Prediction and Scoring of Human Protein-Protein Interactions Using a Novel Gene Expression Programming Methodology
Proteins and their interactions play a central role in many cellular processes, and many experimental methods have therefore been developed to identify them. These experimental methods are costly and time consuming in the case of low-throughput methods, or inaccurate in the case of high-throughput methods. To overcome these limitations, many computational methods have been developed to predict and score Protein-Protein Interactions (PPIs) using a variety of functional, sequential and structural data for each protein pair. Existing computational methods can still be improved in terms of classification performance and interpretability. In this paper we present a novel Gene Expression Programming (GEP) algorithm, named jGEPModelling 2.0, and apply it to the problem of PPI prediction and scoring. jGEPModelling 2.0 is a variation of the classic GEP algorithm, adapted to suit the problem of PPI prediction and to enhance classification performance. To test its efficiency, we applied it to a publicly available dataset and compared it with two other state-of-the-art PPI prediction models. Experimental results showed that jGEPModelling 2.0 outperformed the existing methodologies in terms of classification performance and interpretability. (This paper is submitted for the CIAB2012 workshop.)
Keywords: Protein-Protein Interactions; Human PPI scoring methods; Gene Expression Programming; Genetic Programming
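For readers unfamiliar with GEP, the following is a minimal sketch of how a classic GEP gene (a linear string in Karva notation) can be decoded into an expression tree and evaluated as a score for a protein pair. It illustrates the classic algorithm only and is not the jGEPModelling 2.0 implementation. The function set, the terminal names `a`, `b`, `c` (standing in for hypothetical per-pair features) and the example gene are all assumptions made for illustration.

```python
import operator

# Assumed function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (a feature name).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}


def decode_k_expression(gene):
    """Decode a Karva-notation gene into a tree of [symbol, children] nodes.

    The gene is read left to right and the tree is filled level by level
    (breadth-first). Symbols left over once every function has received
    its arguments are the non-coding part of the gene and are ignored.
    """
    root = [gene[0], []]
    queue = [root]
    pos = 1
    while queue:
        node = queue.pop(0)
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [gene[pos], []]
            pos += 1
            node[1].append(child)
            queue.append(child)
    return root


def evaluate(node, features):
    """Recursively evaluate a decoded tree on a dict of feature values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, features) for child in children))
    return features[symbol]


# Hypothetical gene with head length 3 and tail length 4.
# It decodes to (a * b) + c; the trailing "a", "a" are non-coding.
gene = ["+", "*", "c", "a", "b", "a", "a"]
tree = decode_k_expression(gene)
score = evaluate(tree, {"a": 2.0, "b": 3.0, "c": 1.0})
print(score)  # 7.0
```

Because the evolved model is an explicit arithmetic expression over the input features, it can be read directly, which is one reason GEP models are often considered interpretable. Turning the score into an interacting/non-interacting label would need an additional decision rule such as a threshold, which this sketch leaves out.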