
Integrating Functional Genomics Data


Part of the book series: Methods in Molecular Biology™ ((MIMB,volume 453))

Abstract

The revolution in high-throughput biology experiments producing genome-scale data has heightened the challenge of integrating functional genomics data. Data integration is essential for making reliable inferences from functional genomics data, because the datasets are neither error-free nor comprehensive. However, there are two major hurdles in data integration: heterogeneity and correlation of the data to be integrated. These problems can be circumvented by quantitatively testing all data in the same unified scoring scheme, and by using integration methods appropriate for handling correlated data. This chapter describes such a functional genomics data integration method designed to estimate the “functional coupling” between genes, applied to the baker's yeast Saccharomyces cerevisiae. The integrated dataset outperforms individual functional genomics datasets in both accuracy and coverage, leading to more reliable and comprehensive predictions of gene function. The approach is easily applied to multicellular organisms, including human.
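
As a rough illustration of what a unified scoring scheme and a correlation-aware integration step might look like, the Python sketch below scores each dataset as a log odds ratio against a gold-standard set of functionally linked gene pairs, then combines the scores available for one gene pair with a rank-weighted sum. The abstract does not give the formulas, so the function names, the log-odds form, and the `decay` parameter are all assumptions made for illustration, and the counts and scores a caller would pass in are hypothetical.

```python
import math


def log_likelihood_score(linked_in_data, unlinked_in_data,
                         linked_total, unlinked_total):
    """Log odds that a gene pair supported by a dataset is functionally
    linked, relative to the prior odds in the gold standard (assumed form).

    A score above 0 means the dataset links gold-standard pairs more often
    than expected by chance; every dataset ends up on the same scale.
    """
    if min(linked_in_data, unlinked_in_data, linked_total, unlinked_total) <= 0:
        raise ValueError("all counts must be positive")
    evidence_odds = linked_in_data / unlinked_in_data
    prior_odds = linked_total / unlinked_total
    return math.log(evidence_odds / prior_odds)


def weighted_sum(scores, decay=2.0):
    """Combine the scores for one gene pair from several datasets.

    Scores are ranked from best to worst and the i-th score (0-based) is
    divided by decay**i, so weaker and possibly correlated evidence adds
    less. decay=1.0 reduces to a plain sum, which assumes independence.
    """
    ranked = sorted(scores, reverse=True)
    return sum(score / decay ** rank for rank, score in enumerate(ranked))
```

With these definitions, `weighted_sum([4.0, 2.0, 1.0], decay=2.0)` evaluates to 4.0 + 2.0/2 + 1.0/4 = 5.25, whereas `decay=1.0` gives the plain sum 7.0. A larger `decay` would be the choice when the datasets are believed to be strongly correlated.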



Acknowledgments

This work was supported by grants from the N.S.F. (IIS-0325116, EIA-0219061, 0241180), N.I.H. (GM06779-01), Welch (F1515), and a Packard Fellowship (E.M.M.).

Copyright information

© 2008 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Lee, I., Marcotte, E.M. (2008). Integrating Functional Genomics Data. In: Keith, J.M. (ed.) Bioinformatics. Methods in Molecular Biology™, vol 453. Humana Press. https://doi.org/10.1007/978-1-60327-429-6_14


  • DOI: https://doi.org/10.1007/978-1-60327-429-6_14

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60327-428-9

  • Online ISBN: 978-1-60327-429-6

  • eBook Packages: Springer Protocols
