Employing Publically Available Biological Expert Knowledge from Protein-Protein Interaction Information

  • Kristine A. Pattin
  • Jiang Gui
  • Jason H. Moore
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)


Genome-wide association studies (GWAS) now allow researchers to probe the depths of common complex human diseases, yet few such studies have identified single sequence variants that confer disease susceptibility. This is hypothesized to be due to the fact that multiple interacting factors influence the clinical endpoint. Given that the number of single nucleotide polymorphism (SNP) combinations grows exponentially with the number of SNPs being analyzed, computational methods designed to detect these interactions in smaller datasets are not applicable at the genome-wide scale. Providing these methods with statistical expert knowledge has improved their performance, and we believe biological expert knowledge to be equally capable. Since protein-protein interactions are one of the strongest demonstrations of a functional relationship between genes, we present a method that exploits this information in genetic analyses. This study provides a step towards utilizing expert knowledge derived from public biological sources to assist computational intelligence algorithms in the search for epistasis.
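To make the scale argument concrete, the following minimal Python sketch counts the SNP combinations an exhaustive search would have to visit, and then shows one way interaction information could narrow that search. It is an illustration only, not the authors' procedure: the function names, the SNP-to-gene mapping, the confidence scores, and the 0.7 cutoff are all hypothetical, and real scores would come from a protein-protein interaction resource.

```python
from itertools import combinations
from math import comb


def count_snp_combinations(n_snps, order):
    """Number of distinct SNP sets of size `order` among `n_snps` SNPs."""
    return comb(n_snps, order)


def pairs_supported_by_ppi(snp_to_gene, ppi_confidence, min_confidence):
    """Keep SNP pairs whose genes interact with score >= min_confidence.

    snp_to_gene:    dict mapping a SNP id to a gene name (hypothetical data).
    ppi_confidence: dict mapping frozenset({gene_a, gene_b}) to a score in
                    [0, 1] (hypothetical data).
    """
    kept = []
    for snp_a, snp_b in combinations(sorted(snp_to_gene), 2):
        gene_pair = frozenset((snp_to_gene[snp_a], snp_to_gene[snp_b]))
        # Skip pairs within one gene; missing gene pairs count as score 0.
        if len(gene_pair) == 2 and ppi_confidence.get(gene_pair, 0.0) >= min_confidence:
            kept.append((snp_a, snp_b))
    return kept


# Exhaustive pairwise and three-way search sizes for a genome-wide SNP panel.
pairwise = count_snp_combinations(500_000, 2)
three_way = count_snp_combinations(500_000, 3)

# Toy, made-up inputs: three SNPs in three genes, two scored gene pairs.
toy_snp_to_gene = {"snp1": "geneA", "snp2": "geneB", "snp3": "geneC"}
toy_ppi = {
    frozenset(("geneA", "geneB")): 0.9,
    frozenset(("geneB", "geneC")): 0.3,
}
candidates = pairs_supported_by_ppi(toy_snp_to_gene, toy_ppi, min_confidence=0.7)
```

With 500,000 SNPs the pairwise search alone covers roughly 1.25 × 10^11 combinations, which is why some form of prioritization is needed. In the toy data only the snp1/snp2 pair survives the filter, since it is the only pair whose genes have a score above the chosen cutoff.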


GWAS, SNPs, Protein-protein interaction, Epistasis



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Kristine A. Pattin (1)
  • Jiang Gui (1)
  • Jason H. Moore (1)
  1. Dartmouth Medical School, Lebanon, USA
