Global mean estimation using a self-organizing dual-zoning method for preferential sampling

  • Yuchun Pan
  • Xuhong Ren
  • Bingbo Gao
  • Yu Liu
  • YunBing Gao
  • Xingyao Hao
  • Ziyue Chen

Abstract

Assigning an appropriate weight to each sampling point is essential to global mean estimation. The objective of this paper was to develop a global mean estimation method for preferential samples. The method first zones the study area using a self-organizing dual-zoning method and then estimates the mean with a stratified sampling estimator. The zoning considers the spread of points in both feature space and geographical space. The method is tested in a case study of Mn concentrations in Jilin Province, China. Six sample patterns are used to estimate the global mean, and the results are compared with those of the direct arithmetic mean method, the polygon method, and the cell method. The results show that the proposed method produces more accurate and stable mean estimates under different feature deviation index (FDI) values and sample sizes. The relative errors of the global mean calculated by the proposed method range from 0.14 to 1.47 %, whereas those of the direct arithmetic mean method are the largest (4.83–8.84 %). In addition, the mean estimates from the three comparison methods are sensitive to the FDI values and sample sizes.
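
The second step described above, a stratified mean over zones, can be illustrated with a minimal sketch. It does not implement the self-organizing dual-zoning step; it assumes zone labels are already available and that each zone's weight is its share of the study area, which is an assumption not stated in the abstract. The function names, sample values, and weights are hypothetical and chosen only for illustration. The relative error is computed as a percentage of a reference mean.

```python
from collections import defaultdict


def stratified_mean(values, zones, zone_weights):
    """Weighted average of per-zone sample means.

    zone_weights maps each zone label to its weight (assumed here to be
    the zone's share of the study area; this is an illustrative choice).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, zone in zip(values, zones):
        sums[zone] += value
        counts[zone] += 1
    # Normalize by the total weight of the zones that actually hold samples.
    total_weight = sum(zone_weights[z] for z in counts)
    weighted = sum(zone_weights[z] * sums[z] / counts[z] for z in counts)
    return weighted / total_weight


def relative_error(estimate, reference):
    """Absolute relative error of an estimate, in percent."""
    return abs(estimate - reference) / abs(reference) * 100.0


# Hypothetical, made-up data: zone A is small but heavily sampled
# (preferential sampling), zone B is large but sparsely sampled.
values = [10.0, 12.0, 11.0, 13.0, 2.0, 4.0]
zones = ["A", "A", "A", "A", "B", "B"]
zone_weights = {"A": 0.25, "B": 0.75}

arithmetic = sum(values) / len(values)
stratified = stratified_mean(values, zones, zone_weights)

print(f"direct arithmetic mean: {arithmetic:.3f}")
print(f"stratified mean:        {stratified:.3f}")
```

In this toy data the over-sampled zone A pulls the direct arithmetic mean upward, while the stratified estimate weights each zone mean by the zone's assumed area share regardless of how many points fall in it.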

Keywords

Preferential sampling; Global mean estimation; Self-organizing dual-zoning method

Notes

Acknowledgments

This study was funded by the National Natural Science Foundation of China (No. 40971237 and No. 41201173) and the Open Fund of National Engineering Research Center for Information Technology in Agriculture (No. KF2012N08-055).

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Yuchun Pan (1, 2)
  • Xuhong Ren (3)
  • Bingbo Gao (1, 2, 4)
  • Yu Liu (1, 2)
  • YunBing Gao (1, 2)
  • Xingyao Hao (1, 2)
  • Ziyue Chen (5)

  1. Beijing Research Center for Information Technology in Agriculture, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
  2. National Engineering Research Center for Information Technology in Agriculture, Beijing, China
  3. Department of Computer Science and Engineering, North China Institute of Aerospace Engineering, Langfang City, China
  4. Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
  5. College of Global Change and Earth System Science, Beijing Normal University, Beijing, China
