Gaussian Asymptotic Limits for the α-transformation in the Analysis of Compositional Data

Sankhya A

Abstract

Compositional data consist of vectors of proportions whose components sum to 1. Such vectors lie in the standard simplex, which is a manifold with boundary. A rather controversial issue in compositional data analysis is the choice of metric on the simplex. One popular choice is the metric implied by log-transforming the data, as proposed by Aitchison (1983, 1986); another is the standard Euclidean metric inherited from the ambient space. Tsagris et al. (2011) proposed a one-parameter family of power transformations, the α-transformations, which includes both the metric implied by Aitchison's transformation and the Euclidean metric as particular cases. Our underlying philosophy is that, with many datasets, it may make sense to use the data to help determine a suitable metric. A related possibility is to apply the α-transformations to a parametric family of distributions and then estimate α along with the other parameters. However, when this approach is followed with the Dirichlet family, some care is needed in the limiting case α → 0, as we found when fitting this model to real and simulated data. Specifically, when the maximum likelihood estimator of α is close to 0, the other parameters tend to be large. The main purpose of the paper is to study this limiting case both theoretically and numerically and to provide insight into these numerical findings.
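The abstract does not state the transformation itself, so the following is only a minimal sketch under an assumed form: the power-transformed composition u_i = x_i^α / Σ_j x_j^α, centred and scaled as (D·u − 1)/α, which is the form we understand Tsagris et al. (2011) to use before a further multiplication by a Helmert sub-matrix (omitted here). The function name `alpha_transform` and the example composition are hypothetical. The sketch illustrates the two particular cases mentioned above: for α = 1 the map is affine in x (Euclidean geometry), and as α → 0 it approaches the centred log-ratio transformation.

```python
import numpy as np

def alpha_transform(x, alpha):
    """Assumed form of the alpha-transformation, without the Helmert projection."""
    x = np.asarray(x, dtype=float)
    d = x.size
    if alpha == 0:
        # Limiting case: centred log-ratio transformation.
        logx = np.log(x)
        return logx - logx.mean()
    # Power-transform the composition and re-close it onto the simplex.
    u = x**alpha / np.sum(x**alpha)
    # Centre and scale so that the alpha -> 0 limit exists.
    return (d * u - 1.0) / alpha

x = np.array([0.2, 0.3, 0.5])          # hypothetical composition, sums to 1
clr = alpha_transform(x, 0.0)          # log-ratio end of the family
near_zero = alpha_transform(x, 1e-6)   # numerically close to clr
linear = alpha_transform(x, 1.0)       # equals D * x - 1, affine in x
```

Because the α = 1 case is an affine map of x, distances between transformed points are Euclidean distances between compositions up to the constant factor D, while the α → 0 case reproduces log-ratio distances.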

References

  • Aitchison, J. (1983). Principal components analysis of compositional data. Biometrika 70, 57–65.

  • Aitchison, J. (1986). The Statistical Analysis of Compositional Data. Monographs on Statistics and Applied Probability. Chapman & Hall Ltd, London. Reprinted in 2003 with additional material by The Blackburn Press.

  • Baxter, M.J. (1995). Standardization and transformation in principal component analysis, with applications to archaeometry. Appl. Stat. 44, 513–527.

  • Baxter, M.J. (2001). Statistical modelling of artefact compositional data. Archaeometry 43, 131–147.

  • Baxter, M.J., Beardah, C.C., Cool, H.E.M. and Jackson, C.M. (2005). Compositional data analysis of some alkaline glasses. Math. Geol. 37, 183–196.

  • Baxter, M.J. and Freestone, I.C. (2006). Log-ratio compositional data analysis in archaeometry. Archaeometry 48, 511–531.

  • Bhattacharya, A. and Bhattacharya, R.N. (2012). Nonparametric inference on manifolds with applications to shape spaces. Cambridge University Press, Cambridge.

  • Dryden, I.L., Koloydenko, A. and Zhou, D. (2009). Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging. Ann. Appl. Statist. 3, 1102–1123.

  • Dryden, I.L., Le, H., Preston, S.P. and Wood, A.T.A. (2014). Mean shapes, projections and intrinsic limiting distributions. [Discussion contribution]. Journal of Statistical Planning and Inference 145, 25–32.

  • Dryden, I.L. and Mardia, K.V. (1998). Statistical Shape Analysis. Wiley, New York.

  • Dryden, I.L. and Mardia, K.V. (2016). Statistical Shape Analysis with Applications in R, 2nd edn. Wiley, New York.

  • Fisher, R.A. (1953). Dispersion on a sphere. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 217, 295–305.

  • Fisher, N.I., Lewis, T. and Embleton, B.J.J. (1987). Statistical Analysis of Spherical Data. Cambridge University Press, Cambridge.

  • Hartigan, J.A. (1975). Clustering Algorithms. Wiley, New York.

  • Hotz, T. and Huckemann, S. (2015). Intrinsic means on the circle: uniqueness, locus and asymptotics. Ann. Inst. Stat. Math. 67, 177–193.

  • Kendall, D.G., Barden, D., Carne, T.K. and Le, H. (1999). Shape and Shape Theory. Wiley, New York.

  • Mardia, K.V. (1972). Statistics of Directional Data. Academic Press, London.

  • Mardia, K.V. and Jupp, P.E. (2000). Directional Statistics. John Wiley & Sons, Chichester.

  • Scealy, J.L. and Welsh, A.H. (2014). Colours and cocktails: compositional data analysis. 2013 Lancaster lecture. Aust. N. Z. J. Stat. 56, 145–169.

  • Small, C.G. (1996). The Statistical Theory of Shape. Springer, New York.

  • Tsagris, M.T., Preston, S. and Wood, A.T.A. (2011). A data-based power transformation for compositional data. In: Proceedings of the 4th Compositional Data Analysis Workshop, Girona, Spain.

  • Tsagris, M., Preston, S. and Wood, A.T.A. (2016). Improved classification for compositional data using the α-transformation. J. Classif. 33, 243–261.

  • Tsagris, M. and Stewart, C. (2018). A folded model for compositional data analysis. arXiv:1802.07330.

Acknowledgements

This work was partially supported by EPSRC grant EP/K022547/1, for which we are grateful. Partial results from this research were obtained when the second author was a PhD student at the University of Nottingham.

Author information

Corresponding author

Correspondence to Michail Tsagris.

About this article

Cite this article

Pantazis, Y., Tsagris, M. & Wood, A.T.A. Gaussian Asymptotic Limits for the α-transformation in the Analysis of Compositional Data. Sankhya A 81, 63–82 (2019). https://doi.org/10.1007/s13171-018-00160-1

  • DOI: https://doi.org/10.1007/s13171-018-00160-1
