Abstract
Multivariate abundance data are commonly collected in ecology and used to explore questions of “community composition”: how the relative abundance of different taxa changes with environmental conditions. In this paper, we propose a log-linear marginal modeling approach for analyzing such compositional count data via generalized estimating equations. The method exploits the multiplicative nature of log-linear models for counts by reparameterizing models that describe marginal effects on mean abundance. This allows effects to be partitioned into “main effects” and compositional effects, which is appealing for interpretation. We apply the proposed approach to reanalyze compositional counts of benthic invertebrates from Delaware Bay, and data on invertebrate communities inhabiting Acacia plants in eastern Australia. In both cases we resort to a resampling approach to make inferences about regression parameters, because the number of clusters was not large compared to cluster size.
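To illustrate the kind of partition the abstract describes, the following is a minimal sketch of one way a log-linear marginal model can be reparameterized. The notation is assumed for illustration only and is not taken from the paper: $\mu_{ij}$ is the mean abundance of taxon $j$ in sample $i$, $x_i$ is a covariate vector, $\bar\beta$ is a shared "main effect", and $\gamma_j$ is a taxon-specific deviation under a sum-to-zero constraint. The authors' actual parameterization may differ.

```latex
% Assumed notation (not from the paper): i indexes samples, j = 1,...,p indexes taxa.
% Log-linear marginal model with taxon-specific slopes:
\log \mu_{ij} = \alpha_j + x_i^{\top}\beta_j .

% Reparameterize each slope as a shared part plus a deviation:
\beta_j = \bar\beta + \gamma_j , \qquad \sum_{j=1}^{p} \gamma_j = 0 ,
\qquad\text{so}\qquad
\log \mu_{ij} = \alpha_j + x_i^{\top}\bar\beta + x_i^{\top}\gamma_j .

% Because the model is multiplicative on the mean scale, the shared factor
% \exp(x_i^{\top}\bar\beta) cancels from relative abundance:
\frac{\mu_{ij}}{\sum_{k=1}^{p}\mu_{ik}}
  = \frac{\exp(\alpha_j + x_i^{\top}\gamma_j)}
         {\sum_{k=1}^{p}\exp(\alpha_k + x_i^{\top}\gamma_k)} .
```

Under these assumptions, $\bar\beta$ scales the mean abundance of every taxon by the same factor and so leaves composition unchanged, while the $\gamma_j$ alone determine how relative abundance shifts with $x_i$.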
Cite this article
Warton, D.I., Guttorp, P. Compositional analysis of overdispersed counts using generalized estimating equations. Environ Ecol Stat 18, 427–446 (2011). https://doi.org/10.1007/s10651-010-0145-9
DOI: https://doi.org/10.1007/s10651-010-0145-9