Abstract
The presence of irregular data, i.e., zeroes, missings, and outliers, has consequences for plotting, descriptive statistics, estimation of parameters, and statistical tests. Since all of these methods depend on every value of the dataset, irregular values cannot be ignored in any of these tasks. Ad hoc procedures are needed that pursue the same aims as the classical methods but can still be computed in the presence of irregular data. It is also necessary to extend the basic concepts (e.g., the concept of expected value) so that they remain meaningful when there are irregular values. The treatment of irregularities in compositional data analysis is far from a closed subject, and the state of the art can improve considerably in the near future. The package provides only a limited set of tools to detect, represent, and briefly analyze such irregular values, whether missing values, zeroes, or outliers. Nevertheless, this chapter provides additional background material to enable the reader to treat datasets with irregular data carefully.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
van den Boogaart, K.G., Tolosana-Delgado, R. (2013). Zeroes, Missings, and Outliers. In: Analyzing Compositional Data with R. Use R!. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36809-7_7
DOI: https://doi.org/10.1007/978-3-642-36809-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36808-0
Online ISBN: 978-3-642-36809-7
eBook Packages: Mathematics and Statistics (R0)