Abstract
Shortest-path computation on graphs is one of the most thoroughly studied problems in algorithmic theory. An aspect that has only recently attracted attention is the use of databases in combination with graph algorithms, so-called distance oracles, to answer shortest-path queries on large graphs. To this end, we propose a novel, efficient, pure-SQL framework for answering exact distance queries on large-scale graphs, implemented entirely on an open-source database engine. Our COLD framework (COmpressed Labels on the Database) can answer multiple types of distance queries (vertex-to-vertex, one-to-many, k-Nearest Neighbors, Reverse k-Nearest Neighbors, Reverse k-Farthest Neighbors and Top-k Range) that are not handled by previous methods, making it a complete database solution for a variety of practical large-scale graph applications. Our experiments show that COLD outperforms existing approaches (including popular graph databases) in terms of query time and efficiency, while requiring significantly less storage space than these methods.
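For orientation, a vertex-to-vertex distance query over hub labels (the technique named in the article title) can be phrased as a single SQL join: the distance between s and t is the minimum of d(s,h) + d(h,t) over all hubs h that appear in both labels. The following is only a minimal sketch under stated assumptions, not the COLD implementation. It uses Python's built-in SQLite as a stand-in for the open-source engine used by the authors, stores labels uncompressed, and the names `labels`, `v`, `hub`, `dist` and `distance` are hypothetical.

```python
import sqlite3

# Hypothetical schema (an assumption, not the paper's): one row per
# (vertex, hub) pair, storing the exact distance between the two.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE labels (v INTEGER, hub INTEGER, dist INTEGER, "
    "PRIMARY KEY (v, hub))"
)

# Toy labels for the undirected path graph 1 - 2 - 3 with unit edge weights.
# Every vertex stores itself and vertex 2, so any pair shares a hub that
# lies on a shortest path between them.
conn.executemany(
    "INSERT INTO labels VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 2, 1), (2, 2, 0), (3, 3, 0), (3, 2, 1)],
)


def distance(s, t):
    """Return the minimum of d(s,h) + d(h,t) over common hubs h.

    Returns None when the two labels share no hub.
    """
    row = conn.execute(
        "SELECT MIN(a.dist + b.dist) "
        "FROM labels a JOIN labels b ON a.hub = b.hub "
        "WHERE a.v = ? AND b.v = ?",
        (s, t),
    ).fetchone()
    return row[0]
```

Calling `distance(1, 3)` joins the two labels on their shared hub 2 and sums the two stored distances. The other query types listed in the abstract would need additional tables or indexes that this sketch does not attempt to model.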
Acknowledgments
This work was partially supported by the project “Research Programs for Excellence 2014-2016 / CitySense-ATHENA R.I.C.” and the EU/Greece funded KRIPIS Action: MEDA Project. D. Pfoser’s work was partially supported by the National Science Foundation under Grant No. 1637541.
About this article
Cite this article
Efentakis, A., Efstathiades, C. & Pfoser, D. Hub Labels on the database for large-scale graphs with the COLD framework. Geoinformatica 21, 703–732 (2017). https://doi.org/10.1007/s10707-016-0287-5
DOI: https://doi.org/10.1007/s10707-016-0287-5