Strategies to Explore Functional Genomics Data Sets in NCBI’s GEO Database

  • Stephen E. Wilhite
  • Tanya Barrett
Part of the Methods in Molecular Biology book series (MIMB, volume 802)


The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze, and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries.

Key words

Database; Microarray; Next-generation sequence; Gene expression; Epigenomics; Functional genomics; Data mining



This chapter is an official contribution of the National Institutes of Health; not subject to copyright in the USA. The authors unreservedly acknowledge the expertise of the whole GEO curation and development team – Pierre Ledoux, Carlos Evangelista, Irene Kim, Kimberly Marshall, Katherine Phillippy, Patti Sherman, Michelle Holko, Dennis Troup, Maxim Tomashevsky, Rolf Muertter, Oluwabukunmi Ayanbule, Andrey Yefanov, and Alexandra Soboleva.


This research was supported by the Intramural Research Program of the NIH, National Library of Medicine.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, USA
