Statistical Methods in Metabolomics

Korman, Alexander; Oh, Amy; Raskind, Alexander; Banks, David

doi:10.1007/978-1-61779-585-5_16

Part of the book series: Methods in Molecular Biology (MIMB, volume 856)

Abstract

Metabolomics is a relatively new field of bioinformatics that uses measurements of metabolite abundance as a tool for disease diagnosis and other medical purposes. Although metabolomics is closely related to proteomics, its statistical analysis is potentially simpler, since biochemists have significantly more domain knowledge about metabolites. This chapter reviews the challenges that metabolomics poses in the areas of quality control, statistical metrology, and data mining.

Author information

Authors and Affiliations

Department of Statistical Science, Duke University, Durham, NC, USA
Alexander Korman, Amy Oh & David Banks
Department of Pathology, University of Michigan, Ann Arbor, MI, USA
Alexander Raskind

Corresponding author

Correspondence to David Banks.

Editor information

Editors and Affiliations

Department of Computer Science, ETH Zürich, Universitätsstr. 6, Zürich, 8092, Switzerland
Maria Anisimova

About this protocol

Cite this protocol

Korman, A., Oh, A., Raskind, A., Banks, D. (2012). Statistical Methods in Metabolomics. In: Anisimova, M. (ed.) Evolutionary Genomics. Methods in Molecular Biology, vol 856. Humana Press. https://doi.org/10.1007/978-1-61779-585-5_16

DOI: https://doi.org/10.1007/978-1-61779-585-5_16
Published: 31 January 2012
Publisher Name: Humana Press
Print ISBN: 978-1-61779-584-8
Online ISBN: 978-1-61779-585-5
eBook Packages: Springer Protocols
