Random Walk Based Global Feature for Disease Gene Identification

Wei, Lezhen; Wu, Shuai; Zhang, Jian; Xu, Yong

doi:10.1007/978-981-10-3005-5_38

Part of the book series: Communications in Computer and Information Science (CCIS, volume 663)

Included in the following conference series:

Chinese Conference on Pattern Recognition

Abstract

Disease gene identification is of great significance for the treatment of genetic disorders. In recent years, the rapid development of high-throughput sequencing technologies has transformed disease gene identification methods. Network-based methods are now the most efficient approach to disease gene identification, yet most current methods consider only local topological attributes and ignore the global distribution. In this paper, we propose applying the random walk algorithm to extract global features for each gene and then use a binary logistic regression model to decide whether a gene belongs to a given disease. We also integrate the local and global features into a combined feature vector to improve identification performance. The experimental results show that the global feature is highly effective for disease gene identification. We organize the global feature into several kinds of feature vectors, and all of them achieve higher AUC scores than other state-of-the-art methods.
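
The pipeline described in the abstract (a random walk on the gene network, followed by binary logistic regression) can be illustrated with a minimal sketch. The abstract does not say which random walk variant or which feature layout the authors use, so everything below is an assumption made for illustration: the walk is a random walk with restart, the global feature is a single score per gene, and the 8-node toy network, the restart probability of 0.3, and all function names are hypothetical.

```python
import math

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Steady-state visiting probabilities of a walker that restarts at `seeds`.

    Assumed variant (not confirmed by the abstract): p <- (1 - r) * W p + r * p0.
    """
    n = len(adj)
    # Column-normalise so W[i][j] is the probability of stepping from j to i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

def global_feature(adj, known, gene):
    """Hypothetical global feature: the gene's score when the walk restarts
    at the known disease genes, leaving the gene itself out of the seed set."""
    seeds = known - {gene}
    return random_walk_with_restart(adj, seeds)[gene]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """One-feature binary logistic regression fitted by plain gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        preds = [sigmoid(w * x + b) for x in xs]
        grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(p - y for p, y in zip(preds, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy network (hypothetical): two 4-node cliques joined by the edge 3-4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
n_genes = 8
adj = [[0] * n_genes for _ in range(n_genes)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

known = {0, 1, 2}        # genes assumed to be associated with the disease
negatives = [5, 6, 7]    # genes assumed to be unrelated to the disease
train_genes = sorted(known) + negatives
xs = [global_feature(adj, known, g) for g in train_genes]
ys = [1] * len(known) + [0] * len(negatives)
w, b = fit_logistic(xs, ys)

# Score the two unlabeled candidates, genes 3 and 4.
prob = {g: sigmoid(w * global_feature(adj, known, g) + b) for g in (3, 4)}
for g, p in prob.items():
    print(f"gene {g}: predicted probability {p:.3f}")
```

In this toy setting gene 3 sits inside the clique that contains the known disease genes, while gene 4 is one step further away, so the random walk assigns gene 3 the larger global score. A local feature such as the number of disease-gene neighbors could be appended to the same feature vector in the way the abstract describes for combined features.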

Acknowledgment

This work is supported by JCYJ20140904154645958 and CXZZ20140904154910774.

Author information

Authors and Affiliations

Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen, China
Lezhen Wei, Shuai Wu, Jian Zhang & Yong Xu

Corresponding author

Correspondence to Shuai Wu.

Editor information

Editors and Affiliations

Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China
Xuelong Li
Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China
Xilin Chen
Tsinghua University, Beijing, China
Jie Zhou
Nanjing University of Science and Technology, Nanjing, China
Jian Yang
University of Electronic Science and Technology, Chengdu, Sichuan, China
Hong Cheng

About this paper

Cite this paper

Wei, L., Wu, S., Zhang, J., Xu, Y. (2016). Random Walk Based Global Feature for Disease Gene Identification. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_38

DOI: https://doi.org/10.1007/978-981-10-3005-5_38
Published: 22 October 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3004-8
Online ISBN: 978-981-10-3005-5
eBook Packages: Computer Science, Computer Science (R0)
