Abstract
Essential proteins are critical components of living organisms. Identifying essential proteins from protein-protein interaction (PPI) networks helps in understanding biological mechanisms. This work presents a novel method, IECS, based on the information entropy of protein complexes and subcellular localization, for identifying essential proteins from PPI networks. First, a sample is drawn by stratified sampling and used to calculate the information gain of the protein complex and subcellular localization features; information gain effectively measures the importance of each biological characteristic. Then, a biological attribute score is calculated from the information entropy of the protein complex and subcellular localization. Finally, this score is combined with the network characteristics of each node. The proposed IECS method is evaluated on two Saccharomyces cerevisiae datasets (DIP and Krogan), and the experimental results show that IECS outperforms most traditional methods for identifying essential proteins.
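The abstract does not give the formulas behind the information gain step, so the following is only a minimal sketch of the standard Shannon definition (label entropy minus conditional entropy after splitting on a feature), not the authors' exact computation. The function names, the binary feature flags, and the toy sample are all hypothetical and serve only to illustrate how a feature such as complex membership can be ranked against one such as a localization flag.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting the sample by a feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy sample (not data from the paper): 1 = essential, 0 = non-essential.
essential = [1, 1, 1, 0, 0, 0, 0, 1]
in_complex = [1, 1, 1, 0, 0, 0, 1, 0]  # hypothetical complex-membership flag
in_nucleus = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical localization flag

gain_complex = information_gain(in_complex, essential)
gain_nucleus = information_gain(in_nucleus, essential)
print(f"complex: {gain_complex:.4f}, nucleus: {gain_nucleus:.4f}")
```

In this toy sample the complex-membership flag carries more information about essentiality than the localization flag, which is the kind of ranking the information gain step is described as providing. How IECS turns such gains into the biological attribute score and combines it with node-level network features is not specified in the abstract and is not modeled here.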
Acknowledgement
This paper is supported by the National Natural Science Foundation of China (61672334, 61502290, 61401263) and the Fundamental Research Funds for the Central Universities, Shaanxi Normal University (GK201804006, GK201901010).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, J., Lei, X., Yang, X., Guo, L. (2019). A New Method for Identification of Essential Proteins by Information Entropy of Protein Complex and Subcellular Localization. In: Tan, Y., Shi, Y., Niu, B. (eds) Advances in Swarm Intelligence. ICSI 2019. Lecture Notes in Computer Science, vol 11656. Springer, Cham. https://doi.org/10.1007/978-3-030-26354-6_28
DOI: https://doi.org/10.1007/978-3-030-26354-6_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26353-9
Online ISBN: 978-3-030-26354-6
eBook Packages: Computer Science, Computer Science (R0)