Identifying Human Essential Genes by Network Embedding Protein-Protein Interaction Network

Dai, Wei; Chang, Qi; Peng, Wei; Zhong, Jiancheng; Li, Yongjiang

doi:10.1007/978-3-030-20242-2_11

Conference paper
First Online: 09 May 2019

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 11490)

Abstract

Essential genes play an indispensable role in cell viability and fertility. Identifying human essential genes not only helps us study the functions of human genes but also provides a way to find potential targets for cancer and other diseases. With the recent publication of human essential gene data and the availability of large amounts of biological data, several computational methods have been proposed to predict human essential genes from the genes’ DNA sequences or from their topological properties in the protein-protein interaction (PPI) network. However, there is still room to improve prediction accuracy. In this work, we propose a novel supervised method that predicts human essential genes by applying network embedding to the PPI network. Our method extracts features for the genes in the network by mapping them to a latent feature space that maximally preserves the relationships between the genes and their network neighborhoods. The features are then fed into an SVM classifier to predict human essential genes. Two human PPI networks are employed to evaluate the effectiveness of our method. The prediction results show that our method not only outperforms the method that uses only the genes’ sequence information but is also clearly superior to the method that uses the genes’ centrality properties in the network as input features.
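The pipeline described in the abstract (neighborhood-preserving features from the PPI network, followed by an SVM) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and several parts are assumptions made for illustration. Uniform random walks stand in for the neighborhood sampling. Row-normalised co-occurrence counts stand in for a learned embedding. A linear SVM trained by stochastic subgradient descent on the hinge loss stands in for whatever SVM configuration the paper uses. The gene names, edges, and essentiality labels are invented toy data.

```python
import random
from collections import defaultdict

def build_graph(edges):
    """Build undirected adjacency lists from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)

def random_walks(adj, walk_len, walks_per_node, rng):
    """Sample uniform random walks starting from every node."""
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(adj):
            walk = [start]
            while len(walk) < walk_len:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

def cooccurrence_features(walks, nodes, window):
    """Row-normalised counts of the other nodes seen within `window` steps.

    Assumption: a count-based stand-in for a learned skip-gram embedding.
    """
    index = {n: i for i, n in enumerate(nodes)}
    feats = {n: [0.0] * len(nodes) for n in nodes}
    for walk in walks:
        for i, u in enumerate(walk):
            for v in walk[max(0, i - window): i + window + 1]:
                if v != u:
                    feats[u][index[v]] += 1.0
    for n in nodes:
        total = sum(feats[n]) or 1.0
        feats[n] = [x / total for x in feats[n]]
    return feats

def train_linear_svm(X, y, epochs, lr, lam, rng):
    """Stochastic subgradient descent on the L2-regularised hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj - lr * lam * wj for wj in w]  # shrinkage from the L2 term
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, x):
    """Return +1 (predicted essential) or -1 (predicted non-essential)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical toy network: a dense core (G1..G4) with a sparse tail (G5..G8).
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G2", "G4"),
         ("G3", "G4"), ("G4", "G5"), ("G5", "G6"), ("G6", "G7"), ("G7", "G8")]
# Invented labels: +1 = essential, -1 = non-essential.
labels = {"G1": 1, "G2": 1, "G3": 1, "G4": 1,
          "G5": -1, "G6": -1, "G7": -1, "G8": -1}

rng = random.Random(0)
adj = build_graph(edges)
nodes = sorted(adj)
walks = random_walks(adj, walk_len=10, walks_per_node=20, rng=rng)
feats = cooccurrence_features(walks, nodes, window=2)
X = [feats[n] for n in nodes]
y = [labels[n] for n in nodes]
w, b = train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01, rng=rng)
for n in nodes:
    print(n, "label:", labels[n], "predicted:", predict(w, b, feats[n]))
```

The loop at the end prints predictions for the same genes used in training, so it only shows that the pieces fit together. A meaningful evaluation would hold out genes, for example with cross-validation, and would use a real PPI network and curated essentiality labels.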

Acknowledgment

This work is supported in part by the National Natural Science Foundation of China under grants No. 31560317, No. 61502214, No. 61472133, No. 61502166, No. 61702122 and No. 81560221, and by the Natural Science Foundation of Yunnan Province of China (No. 2016FB107).

Author information

Authors and Affiliations

Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650050, China
Wei Dai, Qi Chang & Wei Peng
Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050, China
Wei Peng & Yongjiang Li
College of Engineering and Design, Hunan Normal University, Changsha, 410081, China
Jiancheng Zhong

Corresponding author

Correspondence to Wei Peng.

Editor information

Editors and Affiliations

Georgia State University, Atlanta, GA, USA
Zhipeng Cai
Georgia State University, Atlanta, GA, USA
Pavel Skums
Central South University, Changsha, China
Min Li

About this paper

Cite this paper

Dai, W., Chang, Q., Peng, W., Zhong, J., Li, Y. (2019). Identifying Human Essential Genes by Network Embedding Protein-Protein Interaction Network. In: Cai, Z., Skums, P., Li, M. (eds) Bioinformatics Research and Applications. ISBRA 2019. Lecture Notes in Computer Science, vol 11490. Springer, Cham. https://doi.org/10.1007/978-3-030-20242-2_11

DOI: https://doi.org/10.1007/978-3-030-20242-2_11
Published: 09 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20241-5
Online ISBN: 978-3-030-20242-2
eBook Packages: Computer Science, Computer Science (R0)
