Abstract
Standard regression analyses often run into problems when one tries to make inferences beyond main effects using datasets that contain dozens of potentially correlated variables. This situation arises, for example, in environmental deprivation studies, where a large number of deprivation scores are used as covariates, yielding a potentially unwieldy set of interrelated data from which it is difficult to tease out the joint effect of multiple deprivation indices. We propose a method, based on Dirichlet process mixture models, that addresses these problems by using, as its basic unit of inference, a profile formed from a sequence of continuous deprivation measures. These deprivation profiles are clustered into groups and associated, via a regression model, with an air pollution outcome. The Bayesian clustering aspect of the proposed modeling framework has a number of advantages over traditional clustering approaches: it allows the number of groups to vary, it uncovers clusters and examines their association with an outcome of interest, and it fits the model as a unit, allowing a region’s outcome potentially to influence cluster membership. The method is demonstrated with an analysis of the UK Indices of Deprivation and PM10 exposure measures corresponding to super output areas (SOAs) in Greater London.
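The idea that the number of groups is allowed to vary can be illustrated with a truncated stick-breaking prior, one common way of representing a Dirichlet process mixture. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the concentration parameter `alpha`, the truncation level, and the number of areas are all hypothetical, and the code only draws cluster labels from the prior; it does not fit profiles or an outcome regression.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking prior.

    alpha (assumed name) is the concentration parameter; larger values
    spread mass over more clusters.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(truncation):
        # Each cluster takes a Beta(1, alpha) fraction of what is left;
        # the last cluster takes all that remains so the weights sum to 1.
        if k == truncation - 1:
            fraction = 1.0
        else:
            fraction = rng.betavariate(1.0, alpha)
        weights.append(fraction * remaining)
        remaining *= 1.0 - fraction
    return weights


def occupied_clusters(n_areas, alpha, truncation=50, seed=0):
    """Assign n_areas hypothetical areas to clusters; count non-empty ones."""
    rng = random.Random(seed)
    weights = stick_breaking_weights(alpha, truncation, rng)
    labels = rng.choices(range(truncation), weights=weights, k=n_areas)
    return weights, len(set(labels))


if __name__ == "__main__":
    # The number of occupied clusters is not fixed in advance.
    for alpha in (0.5, 1.0, 5.0):
        _, k = occupied_clusters(n_areas=200, alpha=alpha)
        print(f"alpha={alpha}: {k} occupied clusters")
```

In a full analysis the weights, cluster-specific profile parameters, and regression coefficients would be updated jointly by MCMC; the sketch only shows why the number of non-empty groups is itself random rather than chosen beforehand.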
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Molitor, J., Fortunato, L., Molitor, NT., Richardson, S. (2010). Examining the Association between Deprivation Profiles and Air Pollution in Greater London using Bayesian Dirichlet Process Mixture Models. In: Lechevallier, Y., Saporta, G. (eds) Proceedings of COMPSTAT'2010. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2604-3_25
DOI: https://doi.org/10.1007/978-3-7908-2604-3_25
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-2603-6
Online ISBN: 978-3-7908-2604-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)