Role of Centrality in Network-Based Prioritization of Disease Genes

Erten, Sinan; Koyutürk, Mehmet

doi:10.1007/978-3-642-12211-8_2

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6023)

Included in the following conference series:

European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics

Abstract

High-throughput molecular interaction data have been used effectively to prioritize candidate genes that are linked to a disease, based on the notion that the products of genes associated with similar diseases are likely to interact with each other heavily in a network of protein-protein interactions (PPIs). An important challenge for these applications, however, is the incomplete and noisy nature of PPI data. Random walk and network propagation based methods alleviate these problems to a certain extent, by considering indirect interactions and multiplicity of paths. However, as we demonstrate in this paper, such methods are likely to favor highly connected genes, making prioritization sensitive to the skewed degree distribution of PPI networks, as well as ascertainment bias in available interaction and disease association data. Here, we propose several statistical correction schemes that aim to account for the degree distribution of known disease and candidate genes. We show that, while the proposed schemes are very effective in detecting loosely connected disease genes that are missed by existing approaches, this improvement might come at the price of more false negatives for highly connected genes. Motivated by these results, we develop uniform prioritization methods that effectively integrate existing methods with the proposed statistical correction schemes. Comprehensive experimental results on the Online Mendelian Inheritance in Man (OMIM) database show that the resulting hybrid schemes outperform existing methods in prioritizing candidate disease genes.
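
The degree bias described above can be illustrated on a toy network. The sketch below is a minimal pure-Python random walk with restart, followed by one simple correction that divides each score by the degree-proportional stationary probability of a walk without restart. This correction is an illustrative assumption and is not necessarily any of the schemes proposed in the paper; the graph, the node names, the restart probability of 0.3, and the function names `rwr_scores` and `degree_corrected` are all hypothetical.

```python
def rwr_scores(adj, seeds, restart=0.3, tol=1e-12, max_iter=1000):
    """Random walk with restart on an undirected graph.

    adj maps each node to the set of its neighbours (every node needs
    at least one). The walker restarts at a uniformly chosen seed with
    probability `restart` at every step.
    """
    start = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    p = dict(start)
    for _ in range(max_iter):
        nxt = {v: restart * start[v] for v in adj}
        for u, nbrs in adj.items():
            # Spread the non-restarting mass of u evenly over its neighbours.
            share = (1.0 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        change = sum(abs(nxt[v] - p[v]) for v in adj)
        p = nxt
        if change < tol:
            break
    return p


def degree_corrected(adj, scores):
    """Divide each score by deg(v) / 2|E|, the stationary probability of
    a simple random walk without restart (an assumed reference model)."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return {v: scores[v] / (len(adj[v]) / two_m) for v in adj}


# Toy network: seed "s" touches a hub "h" (degree 5) and a loosely
# connected gene "a" (degree 2).
edges = [("s", "h"), ("s", "a"), ("a", "b"),
         ("h", "x1"), ("h", "x2"), ("h", "x3"), ("h", "x4")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

seeds = {"s"}
raw = rwr_scores(adj, seeds)
corrected = degree_corrected(adj, raw)
candidates = [v for v in adj if v not in seeds]

print("raw ranking:      ", sorted(candidates, key=raw.get, reverse=True))
print("corrected ranking:", sorted(candidates, key=corrected.get, reverse=True))
```

In this toy case the hub `h` tops the raw ranking because its leaf neighbours return all of their probability mass to it, while the loosely connected neighbour `a` moves to the top once scores are scaled by degree. The same scaling demotes the hub, which mirrors the trade-off noted in the abstract for highly connected genes.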

Author information

Authors and Affiliations

Dept. of Electrical Engineering & Computer Science, Case Western Reserve University, Cleveland, OH, 44106, USA
Sinan Erten & Mehmet Koyutürk
Center for Proteomics & Bioinformatics, Case Western Reserve University, Cleveland, OH, 44106, USA
Mehmet Koyutürk

Editor information

Editors and Affiliations

Institute for High-Performance Computing and Networking (ICAR), Italian National Research Council (CNR), Via P. Bucci 41C, 87036, Rende, (CS), Italy
Clara Pizzuti
Department of Molecular Physiology and Biophysics, Vanderbilt University, Center for Human Genetics Research, 519 Light Hall, 37232, Nashville, TN, USA
Marylyn D. Ritchie
Department of Animal Production Epidemiology and Ecology, University of Torino, Molecular Biotechnology Center, Via Leonardo da Vinci 44, 10095, Grugliasco, (TO), Italy
Mario Giacobini

Cite this paper

Erten, S., Koyutürk, M. (2010). Role of Centrality in Network-Based Prioritization of Disease Genes. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2010. Lecture Notes in Computer Science, vol 6023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12211-8_2

DOI: https://doi.org/10.1007/978-3-642-12211-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12210-1
Online ISBN: 978-3-642-12211-8
eBook Packages: Computer Science, Computer Science (R0)
