Environmental and Ecological Statistics, Volume 15, Issue 1, pp 71–78

Detecting pattern in biological stressor response relationships using model based cluster analysis

  • Ilya Lipkovich
  • Eric P. Smith
  • Keying Ye


Environmental monitoring of aquatic systems is needed to estimate the quality of the systems, to evaluate standards, and to study stressor–response relationships. Monitoring programs often focus on collecting biological, chemical, and physical measures of the system. An important concern is the effect of chemical and physical stressors on the biological community. Evaluating these relationships may be difficult because the extent of each relationship is not known. From a management perspective, interest lies in which factors affect the biological community and where these factors have an influence. This paper focuses on regression-based cluster analysis as a tool for finding relationships between a single biological response and a suite of environmental stressors. The approach uses a penalized regression classification likelihood and Markov chain Monte Carlo model composition (MC3). It allows regression models to be developed and clustered simultaneously. The method is applied to a data set describing stressor–response relationships in Ohio.
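The idea of scoring a partition of sites by a penalized regression classification likelihood can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Gaussian linear regressions within clusters, uses a BIC-style penalty chosen here for illustration, and replaces the MC3 search with a simple alternating refit-and-reassign loop without any variable selection. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def _design(X):
    """Prepend an intercept column to the stressor matrix (1-D or 2-D input)."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X])

def pointwise_loglik(X, y, labels, k):
    """Fit one Gaussian linear regression per cluster; entry [i, c] is the
    log density of site i under the model fitted to cluster c."""
    Xd, y, labels = _design(X), np.asarray(y, dtype=float), np.asarray(labels)
    n, q = Xd.shape
    out = np.full((n, k), -np.inf)
    for c in range(k):
        idx = labels == c
        if idx.sum() <= q:  # too few sites to estimate q coefficients and a variance
            continue
        beta, *_ = np.linalg.lstsq(Xd[idx], y[idx], rcond=None)
        resid = y[idx] - Xd[idx] @ beta
        sigma2 = max(resid @ resid / idx.sum(), 1e-8)  # ML variance, floored
        out[:, c] = -0.5 * (np.log(2 * np.pi * sigma2) + (y - Xd @ beta) ** 2 / sigma2)
    return out

def penalized_score(X, y, labels, k):
    """Assumed BIC-style score: 2 * classification log-likelihood - penalty."""
    labels = np.asarray(labels)
    ll = pointwise_loglik(X, y, labels, k)
    n, q = _design(X).shape
    class_loglik = ll[np.arange(n), labels].sum()
    n_params = k * (q + 1)  # q coefficients plus one variance per cluster
    return 2.0 * class_loglik - n_params * np.log(n)

def cluster_regressions(X, y, k, init=None, n_iter=100, seed=0):
    """Alternate between refitting cluster regressions and reassigning each
    site to its best-fitting cluster (a stand-in for MC3, not MC3 itself)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    labels = rng.integers(k, size=n) if init is None else np.asarray(init).copy()
    for _ in range(n_iter):
        new_labels = pointwise_loglik(X, y, labels, k).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, penalized_score(X, y, labels, k)

# Synthetic illustration: two groups of sites with opposite stressor effects.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 80)
group = np.repeat([0, 1], 40)
y = np.where(group == 0, 2 * x, 10 - 2 * x) + rng.normal(0, 0.1, 80)

one_cluster = penalized_score(x, y, np.zeros(80, dtype=int), 1)
two_clusters = penalized_score(x, y, group, 2)
print(one_cluster, two_clusters)
```

On data like this, the two-cluster partition should score higher than a single pooled regression, since the reduction in residual variance outweighs the penalty for the extra parameters. The alternating search can stop at a local optimum from a random start, which is one reason a stochastic search over models and partitions is attractive.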


Bayesian methods; Cluster analysis; Markov chain Monte Carlo (MCMC) simulation; Regression; Water quality





Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Eli Lilly and Company, Lilly Corporate Center, Indianapolis, USA
  2. Department of Statistics, Virginia Tech, Blacksburg, USA
  3. Department of Management Science and Statistics, University of Texas at San Antonio, San Antonio, USA
