Compositional analysis of overdispersed counts using generalized estimating equations

Warton, David I.; Guttorp, Peter

doi:10.1007/s10651-010-0145-9

Published: 16 May 2010

Environmental and Ecological Statistics, Volume 18, pages 427–446 (2011)

Abstract

Multivariate abundance data are commonly collected in ecology, and used to explore questions of “community composition”—how relative abundance of different taxa changes with environmental conditions. In this paper, we propose a log-linear marginal modeling approach for analyzing such compositional count data, via generalized estimating equations. This method exploits the multiplicative nature of log-linear models for counts, by reparameterizing models that describe marginal effects on mean abundance. This allows partitioning into “main effects” and compositional effects, which is appealing for interpretation. We apply the proposed approach to reanalyze compositional counts of benthic invertebrates from Delaware Bay, and data of invertebrate communities inhabiting Acacia plants in eastern Australia. In both cases we resort to a resampling approach to make inferences about regression parameters, because the number of clusters was not large compared to cluster size.
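
The partition into "main effects" and compositional effects can be illustrated with a small numerical sketch. The parameterization below is an assumption made for illustration only and is not taken from the full article: each taxon j has log mean abundance alpha_j + x * (main_effect + comp_effects[j]), with the compositional effects summing to zero. All names and numbers are hypothetical. Under that assumption, changing the main effect rescales every taxon's mean by the same factor, so total abundance changes while relative abundance does not.

```python
import math

def mean_abundance(alpha, main_effect, comp_effects, x):
    """Log-linear means: log(mu_j) = alpha_j + x * (main_effect + comp_effects[j])."""
    return [math.exp(a + x * (main_effect + g))
            for a, g in zip(alpha, comp_effects)]

def composition(mu):
    """Relative abundance of each taxon (proportions summing to one)."""
    total = sum(mu)
    return [m / total for m in mu]

# Hypothetical values, chosen only for illustration.
alpha = [1.0, 0.5, -0.2]          # taxon-specific intercepts
comp_effects = [0.3, -0.1, -0.2]  # compositional effects, constrained to sum to zero
x = 2.0                           # value of a single environmental covariate

# Same compositional effects, two different main effects.
mu_a = mean_abundance(alpha, 0.0, comp_effects, x)
mu_b = mean_abundance(alpha, 0.7, comp_effects, x)

comp_a = composition(mu_a)
comp_b = composition(mu_b)

# Total abundance is multiplied by exp(x * 0.7); the composition is unchanged.
ratio_of_totals = sum(mu_b) / sum(mu_a)
```

Because the model is multiplicative on the mean scale, the common factor exp(x * main_effect) cancels in every proportion, which is why only the compositional effects describe changes in relative abundance in this sketch.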

Author information

Authors and Affiliations

School of Mathematics and Statistics, Evolution and Ecology Research Centre, The University of New South Wales, Sydney, NSW, 2052, Australia
David I. Warton
Department of Statistics, The University of Washington, Seattle, WA, USA
Peter Guttorp

Corresponding author

Correspondence to David I. Warton.

Electronic Supplementary Material

ESM 1 (PDF 69 kb)

About this article

Cite this article

Warton, D.I., Guttorp, P. Compositional analysis of overdispersed counts using generalized estimating equations. Environ Ecol Stat 18, 427–446 (2011). https://doi.org/10.1007/s10651-010-0145-9

Received: 04 February 2009
Revised: 10 January 2010
Published: 16 May 2010
Issue Date: September 2011
DOI: https://doi.org/10.1007/s10651-010-0145-9
