The Use of GO Terms to Understand the Biological Significance of Microarray Differential Gene Expression Data

Díaz-Uriarte, Ramón; Al-Shahrour, Fátima; Dopazo, Joaquín

doi:10.1007/0-306-48354-8_16

Abstract

We show one way of using the Gene Ontology (GO) to understand the biological relevance of statistical differences in gene expression data from microarray experiments. To illustrate our methodology, we use the data from Pritchard et al. [2001]. Our approach involves three sequential steps: 1) fit a linear model to the data and sort the genes according to how strongly their expression differs between organs; 2) divide the sorted genes into groups based on how strongly they differ, separating those more expressed in one organ from those more expressed in the other; 3) compare the relative frequency of GO terms in the two groups using Fisher’s exact test, with correction for multiple testing, to assess which GO terms differ significantly between the groups of genes. We repeat steps 2) and 3) using a sliding window that covers all the sorted genes, so that we successively compare each group of genes against all others.

By using the GO terms, we obtain biological information about the predominant biological processes or molecular functions of the genes that are differentially expressed between organs, making it easier to evaluate the biological relevance of inter-organ differences in the expression of sets of genes. Moreover, when applied to novel situations (e.g., comparing different cancer conditions), this method can provide important hints about the biologically relevant aspects and characteristics of the differences between conditions. Finally, the proposed method is easily applied.
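
Steps 2) and 3) of the abstract, together with the sliding window, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the genes have already been ranked in step 1), uses the Benjamini-Hochberg procedure as one possible multiple-testing correction (the abstract does not say which correction is applied), and all function names, the `annotations` mapping, and the `size` and `step` parameters are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    tail = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, tail)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here only as an example of a correction.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def go_enrichment(window, rest, annotations):
    """Compare GO term frequencies between two lists of genes.

    `annotations` (hypothetical) maps a gene id to a set of GO term ids.
    Returns a dict of GO term -> adjusted p-value.
    """
    terms = sorted({t for g in window + rest for t in annotations.get(g, ())})
    pvals = []
    for term in terms:
        a = sum(term in annotations.get(g, ()) for g in window)
        c = sum(term in annotations.get(g, ()) for g in rest)
        pvals.append(fisher_exact_two_sided(a, len(window) - a, c, len(rest) - c))
    return dict(zip(terms, benjamini_hochberg(pvals)))


def sliding_window_enrichment(ranked_genes, annotations, size, step):
    """Test each window of the ranked gene list against all genes outside it."""
    results = []
    for start in range(0, len(ranked_genes) - size + 1, step):
        window = ranked_genes[start:start + size]
        rest = ranked_genes[:start] + ranked_genes[start + size:]
        results.append((start, go_enrichment(window, rest, annotations)))
    return results


if __name__ == "__main__":
    # Toy data: the first four ranked genes share one term, the last four another.
    genes = [f"gene{i}" for i in range(8)]
    toy = {g: {"GO:A"} if i < 4 else {"GO:B"} for i, g in enumerate(genes)}
    for start, adjusted in sliding_window_enrichment(genes, toy, size=4, step=4):
        print(start, adjusted)
```

In this sketch the correction is applied within each window. Correcting across all windows at once would be stricter, and which of the two is appropriate depends on how the results are to be interpreted.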

References

Ashburner, M., Ball, C., Blake, J., Botstein, D., Butler, H., Cherry, J., Davis, A., Dolinski, K., Dwight, S., Eppig, J., Harris, M., Hill, D., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J., Richardson, J., Ringwald, M., Rubin, G., and Sherlock, G. 2000. Gene ontology: tool for the unification of biology. Nat. Genet., 25: 25–29.
Benjamini, Y. and Hochberg, Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Royal Statistical Society, Series B, 57: 289–300.
Benjamini, Y., Drai, D., Kafkafi, N., Elmer, G., Golani, I. 1999. Controlling the false discovery rate in behavior genetics research. (available from http://www.math.tau.ac.il/~ybenja)
Brown, P. O., and Botstein, D. 1999. Exploring the new world of the genome with DNA microarrays. Nature Biotechnol, 14:1675–1680.
Conde, L., Mateos, Á., Herrero, J. & Dopazo, J. 2002. Unsupervised reduction of the dimensionality followed by supervised learning with a perceptron improves the classification of conditions in DNA microarray gene expression data. Neural Networks for Signal Processing XII. IEEE Press (New York). Eds. Boulard, Adali, Bengio, Larsen, Douglas. pp. 77–86.
Dudoit, S., Shaffer, J. P., Boldrick, J. C. 2002. Multiple hypothesis testing in microarray experiments. Technical report #110. Division of Biostatistics, UC Berkeley.
Dudoit, S., Yang, Y. H., Callow, M. J., and Speed, T. P. 2002. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica, 12: 111–139.
Gibbons, F. D., Roth, F. P. 2002. Judging the quality of gene expression-based clustering methods using gene annotation. Genome Research, 12: 1574–1581.
Kerr, M. K., and Churchill, G. A. 2001a. Experimental design for gene expression microarrays. Biostatistics, 2: 183–201.
Kerr, M. K., and Churchill, G. A. 2001b. Statistical design and the analysis of gene expression microarray data. Genetical Research, 77: 123–128.
Miller, R. G. 1997. Beyond ANOVA. Chapman & Hall.
Milliken, C. A. and Johnson, D. E. 1992. Analysis of Messy Data. Chapman & Hall.
Pavlidis, P., Lewis, D. P., Noble, W. S. 2002. Exploring gene expression data with class scores. Proc. Pacific Symp. Biocomputing, 2002: 474–485.
Pritchard, C. C., Hsu, L., Nelson, P. S. 2001. Project normal: defining normal variance in mouse gene expression. PNAS, 98:13266–13271.
Westfall, P. H. and Young, S. S. 1993. Resampling-based multiple testing: examples and methods for p-value adjustment. John Wiley & Sons.
Wolfinger, R. D., Gibson, G., Wolfinger, E., Bennett, L., Hamadeh, H., Bushel, P., Afshari, C., Paules, R. S. 2001. Assessing gene significance from cDNA microarray expression data via mixed models. Journal of Computational Biology, 8: 625–637.
Zhou, X., Kao, M.-C. J., Wong, W. H. 2002. Transitive functional annotation by shortest-path analysis of gene expression data. PNAS, 99: 12783–12788.

Author information

Authors and Affiliations

Bioinformatics Unit, Centro Nacional de Investigaciones Oncológicas, (CNIO), Spanish National Cancer Centre, Madrid, Spain
Ramón Díaz-Uriarte, Fátima Al-Shahrour & Joaquín Dopazo

Editor information

Editors and Affiliations

Cancer Center Information Systems, Duke University Medical Center, Durham, NC
Kimberly F. Johnson
Duke Bioinformatics Shared Resource, Duke University Medical Center, Durham, NC
Simon M. Lin

About this chapter

Cite this chapter

Díaz-Uriarte, R., Al-Shahrour, F., Dopazo, J. (2004). The Use of GO Terms to Understand the Biological Significance of Microarray Differential Gene Expression Data. In: Johnson, K.F., Lin, S.M. (eds) Methods of Microarray Data Analysis III. Springer, Boston, MA. https://doi.org/10.1007/0-306-48354-8_16

DOI: https://doi.org/10.1007/0-306-48354-8_16
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4020-7582-7
Online ISBN: 978-0-306-48354-7
eBook Packages: Springer Book Archive
