Rapid Spatial Aggregation

  • Markus Loecher
  • Madhav Kumar
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 499)

Abstract

Data visualization is an important component of spatial data analysis. We demonstrate the visualization of spatial and spatio-temporal data on map tiles as implemented in the R package RgoogleMaps. We argue that extremely large spatial or location data sets can lead to clutter and information overload, which makes aggregation to higher-level geographical units necessary. Such aggregation requires associating each coordinate point in the data set with a particular spatial polygon in the search space. Examples of such polygon-based spatial partitions are zip codes, census blocks, and school districts. Unless efficient data structures are used, this is a computationally expensive task that involves an exhaustive search across all prospective polygons. In this paper, we propose a methodology that uses kd-trees for efficient nearest-neighbour search to significantly reduce the effective number of polygons being searched and to expedite the lookup process. The kd-tree is built from the polygon centroids, from other carefully chosen points within the polygons, or from both. We further demonstrate a successful hybrid strategy that combines a range search with the tree-based ranking. Our code has been made publicly available as the R package RapidPolygonLookup.
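
The lookup idea in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce the RapidPolygonLookup API; the function names (build_kdtree, k_nearest, point_in_polygon, lookup), the toy polygons, and the use of the vertex mean as a stand-in centroid are all assumptions made for illustration. A kd-tree over one representative point per polygon ranks candidate polygons by distance, and an exact point-in-polygon test is run only on the top k candidates instead of on every polygon.

```python
import heapq

def centroid(vertices):
    """Vertex mean, used here as a simple stand-in for the polygon centroid (assumption)."""
    n = len(vertices)
    return (sum(x for x, _ in vertices) / n, sum(y for _, y in vertices) / n)

def build_kdtree(points, depth=0):
    """Build a 2-d tree from (x, y, polygon_id) tuples; a node is (point, left, right)."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def k_nearest(tree, query, k):
    """Return the polygon ids of the k tree points closest to query, nearest first."""
    heap = []  # max-heap on squared distance, stored as (-d2, polygon_id)

    def visit(node, depth):
        if node is None:
            return
        point, left, right = node
        d2 = (point[0] - query[0]) ** 2 + (point[1] - query[1]) ** 2
        if len(heap) < k:
            heapq.heappush(heap, (-d2, point[2]))
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, (-d2, point[2]))
        axis = depth % 2
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near, depth + 1)
        # Descend into the far side only if it could still hold a closer point.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far, depth + 1)

    visit(tree, 0)
    return [pid for _, pid in sorted(heap, reverse=True)]

def point_in_polygon(pt, vertices):
    """Ray-casting test: count edge crossings of a horizontal ray starting at pt."""
    x, y = pt
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def lookup(pt, tree, polygons, k=2):
    """Test only the k polygons whose representative points are nearest to pt."""
    for pid in k_nearest(tree, pt, k):
        if point_in_polygon(pt, polygons[pid]):
            return pid
    return None

# Toy partition (hypothetical data): one wide strip and one small square above it.
polygons = {
    "wide": [(0, 0), (10, 0), (10, 1), (0, 1)],
    "small": [(0, 1), (1, 1), (1, 2), (0, 2)],
}
tree = build_kdtree([centroid(v) + (pid,) for pid, v in polygons.items()])

print(lookup((0.5, 0.9), tree, polygons, k=1))  # nearest centroid alone misses
print(lookup((0.5, 0.9), tree, polygons, k=2))  # second-ranked candidate contains the point
```

In this toy case the query point lies inside the wide strip but is closer to the centroid of the small square, so checking only the single nearest candidate fails. This is one reason to rank several neighbours, or to add further representative points per polygon as the abstract suggests.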

Keywords

Polygon lookup Spatial kd-tree Visualization 

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Berlin School of Economics and Law, Berlin, Germany
