Abstract
An important procedure in biomedical research is the detection of genes that are differentially expressed under pathological conditions. These genes, or at least a subset of them, are key biomarkers and are thought to be important for describing and understanding the analyzed biological system (the pathology) at the molecular level. To obtain this understanding, it is indispensable to link those genes to the biological knowledge stored in databases. Ontological analysis is now a standard procedure for analyzing large gene lists. Detecting enriched and depleted gene properties and functions can yield important insights into the biological system. In this chapter, we give a brief survey of the general layout of the methods used in an ontological analysis and of the most important tools that have been developed.
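To make "enriched" and "depleted" concrete, the following is a minimal sketch of one common way to score a single annotation term: the hypergeometric test. The abstract does not say which statistical test the chapter recommends, so the choice of test, the function names, and the example counts are all assumptions made for illustration only.

```python
from math import comb


def enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the reference set (e.g. all genes on the array)
    K: reference genes carrying the annotation term
    n: genes in the list of interest (e.g. differentially expressed)
    k: genes in the list that carry the term
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def depletion_p(k: int, n: int, K: int, N: int) -> float:
    """Lower-tail hypergeometric probability P(X <= k)."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(0, k + 1))
    return hits / total


if __name__ == "__main__":
    # Hypothetical counts: 20,000 reference genes, 400 annotated with a term,
    # 200 genes in the list, 15 of which carry the term.
    print(enrichment_p(k=15, n=200, K=400, N=20000))
    print(depletion_p(k=15, n=200, K=400, N=20000))
```

In practice such a test is run once per annotation term, so the resulting p-values would normally be corrected for multiple testing before any term is reported as enriched or depleted.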
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Hackenberg, M., Matthiesen, R. (2010). Algorithms and Methods for Correlating Experimental Results with Annotation Databases. In: Matthiesen, R. (eds) Bioinformatics Methods in Clinical Research. Methods in Molecular Biology, vol 593. Humana Press. https://doi.org/10.1007/978-1-60327-194-3_15
DOI: https://doi.org/10.1007/978-1-60327-194-3_15
Publisher Name: Humana Press
Print ISBN: 978-1-60327-193-6
Online ISBN: 978-1-60327-194-3
eBook Packages: Springer Protocols