Effective Analysis of Genomic Data

Nelson, Paul R.; Goulter, Andrew B.; Davis, Richard J.

doi:10.1385/1-59259-836-6:285

Part of the book series: Methods in Molecular Medicine (MIMM, volume 104)

Abstract

High-throughput biotechnology has enabled genome-wide investigation of gene expression and has the potential to identify genes that play a role in focal cerebral ischemia, as well as many other interventions. The advent of this technology has also led to the generation of large amounts of expensive and complex expression data. A major problem with so much data is locating and extracting the relevant information effectively and reliably to aid target identification and interpretation. Statistical involvement is vital: it not only helps to ensure effective extraction of information from the data but also increases the likelihood that the data collected will embody the information about the differential expression of interest in the first place. The goal of this chapter is to recommend an effective process for investigating gene expression data. We believe that five stages lead to reliable results when routinely applied to an expression dataset, once it has been appropriately generated and collected: (1) biological problem definition and design selection; (2) data examination, “preprocessing,” and reexamination; (3) data analysis step I: screening for differentially expressed genes; (4) data analysis step II: verifying differential expression; and (5) biological verification, interpretation, and communication.
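
As a rough illustration of stage (3), screening for differentially expressed genes, the sketch below runs an exact permutation test per gene and then applies a Benjamini-Hochberg adjustment for multiple comparisons. This is not necessarily the method the chapter recommends; the choice of test, the adjustment, the function names, the gene labels, and the log2 expression values are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_total = len(group_a), len(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling n_a of the pooled samples as group A.
    for idx in combinations(range(n_total), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / (n_total - n_a))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_splits += 1
    return extreme / n_splits


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_genes(expression, alpha=0.05):
    """Return gene names whose adjusted p-value falls below alpha."""
    names = list(expression)
    pvalues = [permutation_pvalue(*expression[name]) for name in names]
    adjusted = benjamini_hochberg(pvalues)
    return [name for name, p in zip(names, adjusted) if p < alpha]


# Hypothetical log2 expression values: (control replicates, treated replicates).
expression = {
    "GENE_A": ([7.1, 7.3, 6.9, 7.0, 7.2], [9.0, 9.4, 8.8, 9.1, 9.3]),
    "GENE_B": ([5.0, 5.4, 4.8, 5.1, 5.3], [5.2, 4.9, 5.3, 5.0, 5.1]),
    "GENE_C": ([10.2, 10.0, 10.4, 10.1, 10.3], [8.1, 8.3, 7.9, 8.2, 8.0]),
}

selected = screen_genes(expression)
print(selected)
```

Exhaustive enumeration is only practical for small numbers of replicates (five per group gives 252 relabellings); larger designs would need random sampling of permutations or a parametric test. Genes that pass such a screen would still need the verification described in stages (4) and (5).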

Author information

Authors and Affiliations

Prism Training and Consultancy Ltd., Cambridge, UK
Paul R. Nelson
Exploratory Target Profiling, Pharmagene plc, Royston, Hertfordshire, UK
Andrew B. Goulter
Pharmagene plc, Royston, Hertfordshire, UK
Richard J. Davis

Editor information

Editors and Affiliations

AstraZeneca Pharmaceuticals, Macclesfield, Cheshire, UK
Simon J. Read
Neurology Centre for Excellence in Drug Discovery, GlaxoSmithKline Pharmaceuticals, Harlow, Essex, UK
David Virley

About this protocol

Cite this protocol

Nelson, P.R., Goulter, A.B., Davis, R.J. (2005). Effective Analysis of Genomic Data. In: Read, S.J., Virley, D. (eds) Stroke Genomics. Methods in Molecular Medicine, vol 104. Humana Press. https://doi.org/10.1385/1-59259-836-6:285

DOI: https://doi.org/10.1385/1-59259-836-6:285
Publisher Name: Humana Press
Print ISBN: 978-1-58829-333-6
Online ISBN: 978-1-59259-836-6
eBook Packages: Springer Protocols