Abstract
Compositional data, that is, data carrying relative rather than absolute information, need to be transformed to the usual Euclidean geometry before standard statistical tools can be applied. Different possible transformations and their properties are discussed. For robust multivariate methods based on robust covariance estimation, it is crucial to use a transformation that avoids singularity issues. Moreover, the robust location and covariance estimators need to be affine equivariant so that the results do not depend on the particular transformation chosen. Several robust multivariate methods for compositional data analysis, such as principal component analysis and discriminant analysis, are discussed and applied to a data set from geochemistry.
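The singularity issue mentioned in the abstract can be illustrated with a minimal sketch. It assumes two common logratio transformations from the compositional literature, the centred logratio (clr) and an isometric logratio (ilr) built from one particular Helmert-type orthonormal basis. The function names, the basis choice, and the four-part sample are illustrative assumptions, not taken from the chapter.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centred logratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def ilr(parts):
    """Isometric logratio coordinates for one assumed orthonormal basis.

    Coordinate j contrasts the mean of the first j clr values with
    clr value j + 1, scaled so that the basis vectors have unit length.
    """
    c = clr(parts)
    coords = []
    for j in range(1, len(c)):
        head_mean = sum(c[:j]) / j
        coords.append(math.sqrt(j / (j + 1)) * (head_mean - c[j]))
    return coords

# Hypothetical four-part composition (for example, percentages of four elements).
sample = closure([48.0, 31.0, 15.0, 6.0])
clr_coords = clr(sample)
ilr_coords = ilr(sample)

# clr values always sum to zero, so a covariance matrix estimated from
# clr-transformed data is singular; ilr yields D - 1 unconstrained coordinates.
clr_sum = sum(clr_coords)
# Both representations give the same squared norm (the ilr map is an isometry).
clr_norm2 = sum(v * v for v in clr_coords)
ilr_norm2 = sum(v * v for v in ilr_coords)
```

Because the clr values are linearly dependent, covariance-based robust estimators that require a non-singular scatter matrix cannot be applied to them directly, whereas the D − 1 ilr coordinates carry no such constraint. Any other orthonormal basis would give coordinates that differ from these only by an orthogonal transformation, which is why affine equivariance of the estimators matters.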
Acknowledgements
The authors gratefully acknowledge the support by the Operational Program Education for Competitiveness—European Social Fund (project CZ.1.07/2.3.00/20.0170 of the Ministry of Education, Youth and Sports of the Czech Republic).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Filzmoser, P., Hron, K. (2013). Robustness for Compositional Data. In: Becker, C., Fried, R., Kuhnt, S. (eds) Robustness and Complex Data Structures. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35494-6_8
DOI: https://doi.org/10.1007/978-3-642-35494-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35493-9
Online ISBN: 978-3-642-35494-6
eBook Packages: Mathematics and Statistics (R0)