Volume 23, Issue 3, pp 397–423

A spatially-pruned vertex expansion operator in the Neo4j graph database system

  • Yuhan Sun
  • Mohamed Sarwat


Graphs are widely used to model data in many application domains. Thanks to the widespread use of GPS-enabled devices, many applications assign spatial attributes to graph vertexes (e.g., geographic knowledge bases, geo-tagged social media). Graph database systems such as Neo4j and Titan are commonly used to manage and query graph data. Even though an off-the-shelf graph database system allows users to define spatial location semantics on vertexes and edges, existing graph query processors are not natively optimized for spatial predicates. The paper proposes GeoExpand, a query operator that adds spatial data awareness to a graph database management system. GeoExpand allows efficient execution of graph queries that involve spatial predicates. The proposed operator makes use of spatial bitmap entries stored as properties in the graph. GeoExpand uses these bitmap entries to terminate the graph traversal early where possible, which reduces query latency. Since the spatial bitmap entries are represented using a lightweight data structure, they add little storage or maintenance overhead, which makes GeoExpand a practical solution. Experiments based on an implementation inside the core kernel of Neo4j show that the GeoExpand operator exhibits up to two orders of magnitude better query response time than the classic Expand operator used in Neo4j.
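The pruning idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the grid resolution `GRID`, the one-hop bitmap layout, and the names `cell_of`, `range_mask`, `build_bitmaps`, and `geo_expand` are all assumptions made for illustration. Each vertex stores a bitmap of the grid cells occupied by its spatial neighbors, and the expansion is skipped whenever that bitmap shares no cell with the query rectangle.

```python
GRID = 4  # hypothetical resolution: a 4 x 4 grid over the unit square


def cell_of(x, y):
    """Map a point in [0, 1] x [0, 1] to a grid-cell number (row-major)."""
    col = min(int(x * GRID), GRID - 1)
    row = min(int(y * GRID), GRID - 1)
    return row * GRID + col


def range_mask(xmin, ymin, xmax, ymax):
    """Bitmap with one bit set for every grid cell the rectangle overlaps."""
    col0, row0 = min(int(xmin * GRID), GRID - 1), min(int(ymin * GRID), GRID - 1)
    col1, row1 = min(int(xmax * GRID), GRID - 1), min(int(ymax * GRID), GRID - 1)
    mask = 0
    for row in range(row0, row1 + 1):
        for col in range(col0, col1 + 1):
            mask |= 1 << (row * GRID + col)
    return mask


def build_bitmaps(adj, loc):
    """For each vertex, record which cells hold at least one spatial neighbor."""
    bitmaps = {}
    for v, nbrs in adj.items():
        bits = 0
        for n in nbrs:
            if n in loc:
                bits |= 1 << cell_of(*loc[n])
        bitmaps[v] = bits
    return bitmaps


def geo_expand(v, rect, adj, loc, bitmaps, stats):
    """Return neighbors of v located inside rect, skipping hopeless expansions."""
    if (bitmaps[v] & range_mask(*rect)) == 0:
        stats["pruned"] += 1  # no neighbor can qualify, so adj[v] is never read
        return []
    stats["expanded"] += 1
    xmin, ymin, xmax, ymax = rect
    # The bitmap is only a coarse filter; check exact coordinates afterwards.
    return [n for n in adj[v]
            if n in loc and xmin <= loc[n][0] <= xmax and ymin <= loc[n][1] <= ymax]


# Toy data (hypothetical): two people, one bystander, two located places.
adj = {"alice": ["cafe", "bob"], "bob": ["museum"], "carol": ["bob"]}
loc = {"cafe": (0.1, 0.1), "museum": (0.9, 0.9)}
bitmaps = build_bitmaps(adj, loc)
stats = {"pruned": 0, "expanded": 0}
rect = (0.0, 0.0, 0.3, 0.3)
results = {v: geo_expand(v, rect, adj, loc, bitmaps, stats) for v in adj}
print(results, stats)
```

In this toy graph only `alice` has a neighbor in a cell that overlaps the rectangle, so the other two expansions are skipped after a single bitwise AND. A coarser grid makes the bitmaps smaller but lets more false positives through to the exact coordinate check.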


Keywords: Spatial data, Graph database, Spatial index




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Arizona State University, Tempe, USA
