Abstract
We show one way of using Gene Ontology (GO) terms to understand the biological relevance of statistical differences in gene expression data from microarray experiments. To illustrate the methodology we use the data from Pritchard et al. [2001]. Our approach involves three sequential steps: 1) fit a linear model to the data and sort the genes according to how much their expression differs between (or among) organs; 2) divide the sorted genes into groups according to how strongly they differ, separating those more expressed in one organ from those more expressed in the other; 3) compare the relative frequency of GO terms in the two groups using Fisher’s exact test, with correction for multiple testing, to assess which GO terms differ significantly between the groups of genes. We repeat steps 2) and 3) with a sliding window that covers all the sorted genes, so that each group of genes is successively compared against all others.
The GO terms tell us which biological processes or molecular functions predominate among the genes that are differentially expressed between organs, which makes it easier to evaluate the biological relevance of inter-organ differences in the expression of sets of genes. Moreover, when applied to novel situations (e.g., comparing different cancer conditions), the method can provide hints about which aspects of the differences between conditions are biologically relevant. Finally, the proposed method is easy to apply.
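Steps 2) and 3) of the abstract can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the genes are assumed to be already sorted by the linear-model statistic from step 1), the window size and step are arbitrary, the GO labels are made up, and the Benjamini and Hochberg adjustment is used as one possible multiple-testing correction (the abstract does not say which correction is applied). The function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def sliding_window_go_test(ranked_genes, annotations, window, step):
    """Compare GO term frequency inside each window against all other genes.

    ranked_genes: gene ids already sorted by the step 1) statistic (assumed).
    annotations:  dict mapping gene id -> set of GO terms.
    Returns a list of (window_start, {term: adjusted p-value}).
    """
    terms = sorted(set().union(*annotations.values()))
    results = []
    for start in range(0, len(ranked_genes) - window + 1, step):
        inside = ranked_genes[start:start + window]
        outside = ranked_genes[:start] + ranked_genes[start + window:]
        pvals = []
        for term in terms:
            a = sum(term in annotations.get(g, ()) for g in inside)
            c = sum(term in annotations.get(g, ()) for g in outside)
            pvals.append(
                fisher_exact_two_sided(a, len(inside) - a, c, len(outside) - c))
        results.append((start, dict(zip(terms, benjamini_hochberg(pvals)))))
    return results


# Toy data (invented): 12 ranked genes; the top 4 carry "GO:A", the rest "GO:B".
ranked = [f"g{i}" for i in range(12)]
annotations = {g: {"GO:A"} if i < 4 else {"GO:B"} for i, g in enumerate(ranked)}
results = sliding_window_go_test(ranked, annotations, window=4, step=4)
for start, adjusted in results:
    print(start, adjusted)
```

In practice a library routine such as SciPy's `fisher_exact` could replace the hand-written test; the standard-library version is used here only to keep the sketch self-contained.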
References
Ashburner, M., Ball, C., Blake, J., Botstein, D., Butler, H., Cherry, J., Davis, A., Dolinski, K., Dwight, S., Eppig, J., Harris, M., Hill, D., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J., Richardson, J., Ringwald, M., Rubin, G., and Sherlock, G. 2000. Gene ontology: tool for the unification of biology. Nat. Genet., 25: 25–29.
Benjamini, Y. and Hochberg, Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Royal Statistical Society, Series B, 57: 289–300.
Benjamini, Y., Drai, D., Kafkafi, N., Elmer, G., Golani, I. 1999. Controlling the false discovery rate in behavior genetics research. (available from http://www.math.tau.ac.il/~ybenja)
Brown, P. O., and Botstein, D. 1999. Exploring the new world of the genome with DNA microarrays. Nature Biotechnol, 14:1675–1680.
Conde, L., Mateos, Á., Herrero, J., and Dopazo, J. 2002. Unsupervised reduction of the dimensionality followed by supervised learning with a perceptron improves the classification of conditions in DNA microarray gene expression data. Neural Networks for Signal Processing XII. IEEE Press (New York). Eds. Boulard, Adali, Bengio, Larsen, Douglas. pp. 77–86.
Dudoit, S., Shaffer, J. P., Boldrick, J. C. 2002. Multiple hypothesis testing in microarray experiments. Technical report # 110. Division of Biostatistics, UC Berkeley.
Dudoit, S., Yang, Y. H., Callow, M. J., and Speed, T. P. 2002. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica, 12: 111–139.
Gibbons, F. D., Roth, F. P. 2002. Judging the quality of gene expression-based clustering methods using gene annotation. Genome Research, 12: 1574–1581.
Kerr, M. K., and Churchill, G. A. 2001a. Experimental design for gene expression microarrays. Biostatistics, 2: 183–201.
Kerr, M. K., and Churchill, G. A. 2001b. Statistical design and the analysis of gene expression microarray data. Genetical Research, 77: 123–128.
Miller, R. G. 1997. Beyond ANOVA. Chapman & Hall.
Milliken, C. A. and Johnson, D. E. 1992. Analysis of Messy Data. Chapman & Hall.
Pavlidis, P., Lewis, D. P., Noble, W. S. 2002. Exploring gene expression data with class scores. Proc. Pacific Symp. Biocomputing, 2002: 474–485.
Pritchard, C. C., Hsu, L., Nelson, P. S. 2001. Project normal: defining normal variance in mouse gene expression. PNAS, 98:13266–13271.
Westfall, P. H. and Young, S. S. 1993. Resampling-based multiple testing: examples and methods for p-value adjustment. John Wiley & Sons.
Wolfinger, R. D., Gibson, G., Wolfinger, E., Bennett, L., Hamadeh, H., Bushel, P., Afshari, C., Paules, R. S. 2001. Assessing gene significance from cDNA microarray expression data via mixed models. Journal of Computational Biology, 8: 625–637.
Zhou, X., Kao, M.-C. J., Wong, W. H. 2002. Transitive functional annotation by shortest-path analysis of gene expression data. PNAS, 99: 12783–12788.
Copyright information
© 2004 Springer Science + Business Media, Inc.
About this chapter
Cite this chapter
Díaz-Uriarte, R., Al-Shahrour, F., Dopazo, J. (2004). The Use of GO Terms to Understand the Biological Significance of Microarray Differential Gene Expression Data. In: Johnson, K.F., Lin, S.M. (eds) Methods of Microarray Data Analysis III. Springer, Boston, MA. https://doi.org/10.1007/0-306-48354-8_16
DOI: https://doi.org/10.1007/0-306-48354-8_16
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4020-7582-7
Online ISBN: 978-0-306-48354-7
eBook Packages: Springer Book Archive