Phenotype Mining for Functional Genomics and Gene Discovery

  • Protocol
In Silico Tools for Gene Discovery

Part of the book series: Methods in Molecular Biology (MIMB, volume 760)

Abstract

In gene prediction, studying phenotypes is highly valuable for reducing the number of candidate loci in association studies and for prioritizing disease gene candidates. This is because phenotypes visibly reflect genetic activity, which makes them potentially one of the most useful data types for functional studies. However, systematic use of these data has begun only recently. ‘Comparative phenomics’ is the analysis of genotype–phenotype associations across species and experimental methods, an emerging research field of great importance for gene discovery and gene function annotation. In this chapter, we review the use of phenotype data in the biomedical field. We give an overview of phenotype resources, focusing on PhenomicDB, a cross-species genotype–phenotype database that is the largest available collection of phenotype descriptions across species and experimental methods. We report on its latest extension, in which genotype–phenotype relationships can be viewed as graphical representations of clusters of similar phenotypes (‘phenoclusters’), supplemented with information from protein–protein interactions and Gene Ontology terms. We show that such phenoclusters represent a novel approach to grouping genes functionally and to predicting novel gene functions with high precision. We explain how these data and methods can be used to supplement the results of gene discovery approaches. The aim of this chapter is to assist researchers interested in understanding how phenotype data can be used effectively in gene discovery.

Author information

Corresponding author

Correspondence to Philip Groth.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Groth, P., Leser, U., Weiss, B. (2011). Phenotype Mining for Functional Genomics and Gene Discovery. In: Yu, B., Hinchcliffe, M. (eds) In Silico Tools for Gene Discovery. Methods in Molecular Biology, vol 760. Humana Press. https://doi.org/10.1007/978-1-61779-176-5_10

  • DOI: https://doi.org/10.1007/978-1-61779-176-5_10

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-175-8

  • Online ISBN: 978-1-61779-176-5

  • eBook Packages: Springer Protocols
