Algorithms and Methods for Correlating Experimental Results with Annotation Databases

  • Protocol
Bioinformatics Methods in Clinical Research

Part of the book series: Methods in Molecular Biology (MIMB, volume 593)

Abstract

An important procedure in biomedical research is the detection of genes that are differentially expressed under pathologic conditions. These genes, or at least a subset of them, are key biomarkers and are thought to be important for describing and understanding the analyzed biological system (the pathology) at the molecular level. To obtain this understanding, it is indispensable to link those genes to the biological knowledge stored in annotation databases. Ontological analysis is now a standard procedure for analyzing large gene lists. By detecting gene properties and functions that are enriched or depleted in such a list, important insights into the biological system can be obtained. In this chapter, we give a brief survey of the general layout of the methods used in an ontological analysis and of the most important tools that have been developed.
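
The abstract does not state which statistical test is used to call an annotation "enriched" or "depleted". As an illustration only, the sketch below uses the hypergeometric distribution, one commonly used choice for this kind of analysis. The function names and the toy numbers are assumptions made for this example and are not taken from the chapter.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): exactly k annotated genes in a list of n genes drawn
    without replacement from N reference genes, K of which carry the annotation."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p(k, N, K, n):
    """One-sided p-value for enrichment: P(X >= k)."""
    upper = min(K, n)  # cannot observe more hits than annotated genes or list size
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))


def depletion_p(k, N, K, n):
    """One-sided p-value for depletion: P(X <= k)."""
    lower = max(0, n - (N - K))  # minimum number of hits that must occur
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lower, k + 1))


# Toy example (hypothetical numbers): 10 reference genes, 4 annotated,
# a list of 3 genes containing 2 annotated ones.
print(f"{enrichment_p(2, 10, 4, 3):.4f}")  # 0.3333
print(f"{depletion_p(2, 10, 4, 3):.4f}")   # 0.9667
```

In a real analysis one such test is run per annotation term, so the resulting p-values would normally be corrected for multiple testing before any term is reported.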




Copyright information

© 2010 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Hackenberg, M., Matthiesen, R. (2010). Algorithms and Methods for Correlating Experimental Results with Annotation Databases. In: Matthiesen, R. (eds) Bioinformatics Methods in Clinical Research. Methods in Molecular Biology, vol 593. Humana Press. https://doi.org/10.1007/978-1-60327-194-3_15


  • DOI: https://doi.org/10.1007/978-1-60327-194-3_15

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60327-193-6

  • Online ISBN: 978-1-60327-194-3

  • eBook Packages: Springer Protocols
