Abstract
Recent literature on microarray technology has emphasized the need to incorporate classical statistical practices into experimental design so that more robust, classical statistical methods can be used in data analysis. We demonstrate that classical statistical methods are applicable to the data previously presented by Golub et al. (1999). Our preliminary analysis of all 6817 genes uses simple t-tests for statistically significant separation of mean gene expression levels between the two cancer types. The subsets of genes that distinguish AML from ALL are relatively consistent with those published by Golub et al. We select predictor genes based on the t-values and stepwise discriminant analysis, and evaluate the resulting model’s performance in predicting 34 test samples by linear discriminant analysis. With the 25 predictor genes we chose, only two samples (61 and 66) were incorrectly predicted. We also evaluate the parsimony of our model by using a stepwise method to determine the minimum number of genes required to maintain a high level of accuracy in predicting cancer type.
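The per-gene t-test screening described in the abstract can be illustrated with a minimal sketch. It assumes a pooled-variance (equal-variance) two-sample t-statistic and ranks genes by absolute t-value; the abstract does not state which t-test variant the authors used. The function names `pooled_t` and `rank_genes`, the gene names, and the expression values are all hypothetical and are not taken from the Golub data set.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Two-sample t-statistic with pooled variance (an assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


def rank_genes(expression, labels, top_k):
    """Return the top_k gene names by decreasing |t|, plus all t-values."""
    scores = {}
    for gene, values in expression.items():
        # Split each gene's expression values by cancer type.
        aml = [v for v, lab in zip(values, labels) if lab == "AML"]
        all_ = [v for v, lab in zip(values, labels) if lab == "ALL"]
        scores[gene] = pooled_t(aml, all_)
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:top_k], scores


# Toy, made-up expression values for illustration only.
labels = ["AML", "AML", "AML", "ALL", "ALL", "ALL"]
expression = {
    "gene_a": [10.0, 11.0, 12.0, 1.0, 2.0, 3.0],
    "gene_b": [5.0, 6.0, 7.0, 5.5, 6.5, 6.0],
    "gene_c": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
}
top, scores = rank_genes(expression, labels, top_k=2)
print(top)
```

In a workflow like the one the abstract outlines, the genes retained by such a ranking would then be passed to stepwise and linear discriminant analysis; that step is not shown here.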
References
Bittner, Michael, Meltzer, Paul, and Trent, Jeffrey, 1999, Data analysis and integration: of steps and arrows. Nature Genetics. Vol 22, pp213–215.
Dudoit, Sandrine, Yang, Yee Hwa, Callow, Matthew J., and Speed, Terence P., 2000, Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Technical Report #578, http://www.stat.Berkeley.EDU/users/terry/zarray/html/matt.html
Duggan, David J., Bittner, Michael, Chen, Yidong, Meltzer, Paul, and Trent, Jeffrey M., 1999, Expression profiling using cDNA microarrays. Nature Genetics, Vol 21, pp10–14.
Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Bloomfield, C.D. and Lander, E.S., 1999. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science, Vol 286, pp531–537.
Golub’s web site (www.genome.wi.mit.edu/MPR)
Hilsenbeck, Susan G., Friedrichs, William E., Schiff, Rachel, O’Connell, Peter, Hansen, Rhonda K., Osborne, Kent, and Fuqua, Suzanne A.W., 1999, Statistical analysis of array expression data as applied to the problem of tamoxifen resistance. J Nat. Cancer Inst, Vol 91.5, pp453–459.
Kaminski, Naftali, Allard, John D., Pittet, Jean F., Zuo, Fengrong, Griffiths, Mark J.D., Morris, David, Huang, Xiaozhu, Sheppard, Dean, and Heller, Renu A., 2000, Global analysis of gene expression in pulmonary fibrosis reveals distinct programs regulating lung inflammation and fibrosis. PNAS, Vol 97.4, pp1778–1783.
Kerr, M.K. and Churchill, G.A., 2001, Statistical design and the analysis of gene expression microarray data. Genet. Res. Apr: 77(2), pp 123–128.
PUBMED (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi)
SAS/STAT User’s Guide (V6.04), 1990. SAS Institute, Inc., Cary, NC, USA
Copyright information
© 2002 Springer Science+Business Media New York
About this chapter
Cite this chapter
Lu, J., Hardy, S., Tao, WL., Muse, S., Weir, B., Spruill, S. (2002). Classical Statistical Approaches to Molecular Classification of Cancer from Gene Expression Profiling. In: Lin, S.M., Johnson, K.F. (eds) Methods of Microarray Data Analysis. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0873-1_8
DOI: https://doi.org/10.1007/978-1-4615-0873-1_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5281-5
Online ISBN: 978-1-4615-0873-1
eBook Packages: Springer Book Archive