Current Epidemiology Reports, Volume 5, Issue 2, pp 160–165

Environmental Exposure Mixtures: Questions and Methods to Address Them

  • Ghassan B. Hamra
  • Jessie P. Buckley
Epidemiologic Methods (R MacLehose, Section Editor)
Part of the following topical collections:
  1. Topical Collection on Epidemiologic Methods


Purpose of This Review

This review provides a summary of statistical approaches that researchers can use to study environmental exposure mixtures. Two primary considerations are the form of the research question and the statistical tools best suited to address that question. Because the choice of statistical tools is not rigid, we make recommendations about when each tool may be most useful.

Recent Findings

When dimensionality is relatively low, some statistical tools yield easily interpretable estimates of effect (e.g., risk ratio, odds ratio) or intervention impacts. As dimensionality increases, it is often necessary to trade some of this interpretability for the ability to separate meaningful statistical signals from noise; this requires statistical tools oriented more heavily toward dimension reduction via shrinkage and/or variable selection.


Summary

The study of complex exposure mixtures has prompted development of novel statistical methods. We suggest that further validation work would aid practicing researchers in choosing among existing and emerging statistical tools for studying exposure mixtures.


Keywords

Complex mixtures, Environmental epidemiology, Bayesian methods, Machine learning


Funding Information

JPB was supported by funding from the National Institutes of Health (U24 OD023382).

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they have no conflicts of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA
  2. Department of Environmental Health and Engineering, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA
