# Geographic sampling of urban soils for contaminant mapping: how many samples and from where

## Abstract

Properly sampling soils and mapping soil contamination in urban environments requires accounting for the impacts of spatial autocorrelation. As spatial autocorrelation increases in an urban landscape, the amount of duplicate information contained in georeferenced data also increases, whether an entire population or some type of random sample drawn from that population is being analyzed. As a result, conventional power and sample size formulae yield incorrect sample sizes vis-à-vis model-based inference. Griffith (in *Annals, Association of American Geographers, 95*, 740–760, 2005) exploits spatial statistical model specifications to formulate equations for estimating the sample size needed to obtain a predetermined level of precision in an analysis of georeferenced data under a tessellation stratified random sampling design. This approach is labeled model-informed, since a model of latent spatial autocorrelation is required. This paper addresses issues of efficiency associated with these model-based results. It summarizes findings from a data collection exercise (soil samples collected from across Syracuse, NY), as well as from a set of resampling experiments and a set of simulation experiments that follow experimental design principles spelled out by Overton and Stehman (in *Communications in Statistics: Theory and Methods, 22*, 2641–2660, 1993). Guidelines are suggested concerning appropriate sample size (i.e., how many) and sampling network (i.e., where).
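The redundancy argument in the abstract can be illustrated with a minimal sketch. It does not reproduce Griffith's (2005) equations. It uses only the generic result that the variance of a sample mean under a correlation matrix R is σ²·(1ᵀR1)/n², which gives an effective sample size n* = n²/(1ᵀR1). The exponential correlation function, the unit-square study area, and the names `tessellation_stratified_sample`, `effective_sample_size`, and `corr_range` are all assumptions made for illustration.

```python
import math
import random


def tessellation_stratified_sample(n_side, rng):
    """Draw one uniformly random point inside each cell of an
    n_side x n_side square grid laid over the unit square."""
    cell = 1.0 / n_side
    return [((i + rng.random()) * cell, (j + rng.random()) * cell)
            for i in range(n_side) for j in range(n_side)]


def effective_sample_size(points, corr_range):
    """Effective sample size for the sample mean, n* = n^2 / sum_ij r_ij.

    Assumes (hypothetically) an exponential correlation function
    r_ij = exp(-d_ij / corr_range), where d_ij is Euclidean distance
    and corr_range > 0.
    """
    n = len(points)
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            total += math.exp(-math.hypot(x1 - x2, y1 - y2) / corr_range)
    return n * n / total


rng = random.Random(0)  # fixed seed so the sketch is repeatable
points = tessellation_stratified_sample(10, rng)  # n = 100 locations

# Stronger spatial autocorrelation (larger corr_range) means more
# duplicate information, so n* falls further below n.
for corr_range in (0.001, 0.05, 0.2):
    print(corr_range, round(effective_sample_size(points, corr_range), 1))
```

Under these assumptions, n* stays close to n when the correlation range is negligible relative to the spacing between sample points, and shrinks as the range grows. This is the sense in which conventional formulae, which treat n as n*, misstate the required sample size.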

## Keywords

Effective sample size · Geographic sampling · Heavy metals · Spatial autocorrelation · Variance inflation

## References

- Anselin, L. (1988). *Spatial econometrics: methods and models*. Dordrecht: Martinus Nijhoff.
- Cressie, N. (1991). *Statistics for spatial data*. New York: Wiley.
- Griffith, D. (1988). *Advanced spatial statistics*. Dordrecht: Martinus Nijhoff.
- Griffith, D. (1992). Simplifying the normalizing factor in spatial autoregressions for irregular lattices. *Papers in Regional Science, 71*, 71–86.
- Griffith, D. (2000). A linear regression solution to the spatial autocorrelation problem. *Journal of Geographical Systems, 2*, 141–156.
- Griffith, D. (2003). *Spatial autocorrelation and spatial filtering: gaining understanding through theory and scientific visualization*. Berlin: Springer.
- Griffith, D. (2005). Effective geographic sample size in the presence of spatial autocorrelation. *Annals, Association of American Geographers, 95*, 740–760.
- Griffith, D. (2006). Statistical efficiency of model-informed geographic sampling designs. In C. Mário, & M. Painho (Eds.), *Proceedings of accuracy 2006*, *Proceedings of the 7th international symposium on spatial accuracy assessment in a natural resources and environmental sciences*. Lisboa, Portugal: Instituto Geográfico Português, pp. 91–98.
- Grondona, M., & Cressie, N. (1991). Using spatial considerations in the analysis of experiments. *Technometrics, 33*, 381–392.
- de Gruijter, J., Brus, D., Bierkens, M., & Knotters, M. (2006). *Sampling for natural resource monitoring*. New York: Springer.
- Johnson, D., Hager, J., Hunt, A., Griffith, D., Blount, S., Ellsworth, S., Hintz, J., Lucci, R., Mittiga, A., Prokhorova, D., Tidd, L., Millones, M., & Vincent, M. (2005). Field methods for mapping urban metal distributions in house dusts and surface soils of Syracuse, NY, USA. *Science in China* (Series C: Life Sciences), 48 (Suppl.), pp. 192–199.
- Kelly, K., & Maxwell, S. (2003). Sample size for multiple regression: obtaining regression coefficients that are accurate, not simply significant. *Psychological Methods, 8*, 305–321.
- Levy, P., & Lemeshow, S. (1991). *Sampling of populations: methods and applications*. New York: Wiley.
- Müller, W. (2001). *Collecting spatial data: optimum design of experiments for random fields* (2nd ed.). Heidelberg: Physica-Verlag.
- Overton, W., & Stehman, S. (1993). Properties of designs for sampling continuous spatial resources from a triangular grid. *Communications in Statistics: Theory and Methods, 22*, 2641–2660.
- Thompson, S. (1992). *Sampling*. New York: Wiley.