Volume 79, Issue 2, pp 137–153

MAUP sensitivity analysis of ecological bias in health studies

  • Andrew Swift
  • Lin Liu
  • James Uber


Ecological bias introduced by spatial data aggregation causes significant variation in correlation statistics between pathogen exposures and illness rates. Modifiable areal unit problem (MAUP) sensitivity analysis is introduced to investigate the impact of spatial aggregation on ecological bias. Simulation produces numerical estimates of the relative magnitudes of six components that affect ecological bias: (1) spatial autocorrelation of exposure concentrations; (2) scaling; (3) zoning; (4) network-clustered structure of illness events; (5) clustering of exposure measurements; and (6) the statistical distribution of exposure concentrations. These six components are combined and used to compare random illness patterns with patterns determined from a dose–response model. Of the six, spatial autocorrelation of exposure data has the greatest influence on ecological bias. Spatial aggregation can cause high correlations in random illness patterns. More importantly, if pathogen concentrations are randomly distributed in space, data aggregation is more likely to obscure a strong association.
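The claim that aggregation alone can produce high correlations in random illness patterns can be illustrated with a minimal sketch. This is not the authors' simulation: the regular square zoning, the lognormal exposure field, the Poisson illness counts, and the function names `aggregate` and `correlation_spread` are all assumptions chosen for illustration, and the sketch covers only the scaling component (no spatial autocorrelation, network structure, or dose–response model).

```python
import numpy as np


def aggregate(grid, k):
    """Sum an n x n grid into (n/k) x (n/k) square zones of k x k cells."""
    n = grid.shape[0]
    assert grid.shape == (n, n) and n % k == 0
    # Split each axis into (zone index, within-zone index), then sum within zones.
    return grid.reshape(n // k, k, n // k, k).sum(axis=(1, 3))


def correlation_spread(n=32, k=1, reps=200, seed=0):
    """Standard deviation of Pearson r between exposure and illness across
    replications, when illness is generated independently of exposure."""
    rng = np.random.default_rng(seed)
    r = np.empty(reps)
    for i in range(reps):
        # Assumed distributions: lognormal exposure, Poisson illness counts.
        exposure = rng.lognormal(mean=0.0, sigma=1.0, size=(n, n))
        illness = rng.poisson(lam=2.0, size=(n, n))  # unrelated to exposure
        x = aggregate(exposure, k).ravel()
        y = aggregate(illness, k).ravel()
        r[i] = np.corrcoef(x, y)[0, 1]
    return r.std()


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        zones = (32 // k) ** 2
        print(f"zones={zones:5d}  sd of r={correlation_spread(k=k):.3f}")
```

Because illness is drawn independently of exposure, the true correlation is zero at every scale. The spread of the sample correlation should widen as the number of zones shrinks, so coarse zoning makes a large spurious correlation more likely in any single study.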


GIS, Aggregation, MAUP, Bias



This research was funded by EPA grant number R831629.



Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  1. Department of Geography, University of Cincinnati, Cincinnati, USA
  2. School of Geography and Planning, Sun Yat-sen University, Guangzhou, People’s Republic of China
  3. School of Energy, Environmental, Biological and Medical Engineering, University of Cincinnati, Cincinnati, USA
