Cluster Analysis of Untargeted Metabolomic Experiments
Untargeted metabolite profiling based on LC-MS can be combined with mathematical clustering to identify unique metabolic phenotypes associated with stress, disease, or environmental exposure of cells. Here, we show that unsupervised data analysis is a powerful tool both for quality control and for answering simple biological questions. We demonstrate how to format untargeted mass spectrometry data for import into R, a programming language and software environment for statistical computing (R Development Core Team. R: A language and environment for statistical computing, reference index version 2.15. R Foundation for Statistical Computing, Vienna, 2012). Using R, we apply hierarchical clustering and principal component analysis (PCA) to untargeted metabolite data to create visual representations of change between biological samples, and we explore how these representations can be used predictively to assess environmental stress and health and to gain metabolic insight.
Key words: Clustering, Cluster analysis, Pattern recognition, Untargeted metabolomics, Phenotyping, Data mining
The authors would also like to acknowledge that this work was part of the DOE Joint BioEnergy Institute (http://www.jbei.org) supported by the US Department of Energy, Office of Science, Office of Biological and Environmental Research, through contract DE-AC02-05CH11231 between Lawrence Berkeley National Laboratory and the US Department of Energy.
- 1. R Development Core Team (2012) R: A language and environment for statistical computing, reference index version 2.15.1. R Foundation for Statistical Computing, Vienna
- 4. Everitt B (1974) Cluster analysis. Heinemann Educational Books, London
- 5. Hartigan JA (1975) Clustering algorithms. Wiley, New York
- 6. Anderberg MR (1973) Cluster analysis for applications. Academic Press, New York
- 7. Murtagh F (1985) Multidimensional clustering algorithms. In: COMPSTAT Lectures 4. Physica-Verlag, Wuerzburg
- 8. Becker RA, Chambers JM, Wilks AR (1988) The new S language. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey
- 9. Mardia KV, Kent JT, Bibby JM (1979) Multivariate analysis. Academic Press, London
- 13. Gordon AD (1999) Classification. Chapman and Hall/CRC, London