Abstract
Computing the distance between two vertices is a fundamental operation on graphs. However, classical exact methods for this problem often cannot scale to the large, rapidly evolving graphs found in recent applications. Many approximate methods have been proposed, including landmark-based methods that have been shown to scale well and to estimate the upper bound of the distance with acceptable accuracy. In this paper, we propose a new landmark-based framework, built on a new measure called coverage, that estimates the lower bound of the distance more accurately. We prove that selecting the optimal set of landmarks is NP-hard, and we propose a heuristic algorithm with a guaranteed approximation ratio. Furthermore, we implement our method on distributed graph processing systems, taking the characteristics of such systems into account. Experiments on large real graphs confirm the superiority of our methods.
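To make the landmark idea concrete, the following is a minimal sketch of the generic triangle-inequality bounds that landmark-based methods rely on, assuming an unweighted, undirected graph stored as an adjacency dictionary. It is not the coverage-based selection proposed in the paper: the landmarks here are picked by hand, and the names `bfs_distances` and `landmark_bounds` are hypothetical helpers introduced only for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable vertex (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def landmark_bounds(tables, u, v):
    """Return (lower, upper) bounds on d(u, v) from precomputed landmark tables.

    For each landmark l, the triangle inequality gives
        |d(l, u) - d(l, v)| <= d(u, v) <= d(l, u) + d(l, v).
    The tightest bounds over all landmarks are kept.
    """
    lower, upper = 0, float("inf")
    for dist in tables:
        # Skip landmarks that cannot reach both endpoints.
        if u in dist and v in dist:
            lower = max(lower, abs(dist[u] - dist[v]))
            upper = min(upper, dist[u] + dist[v])
    return lower, upper


# Toy path graph a - b - c - d - e; landmarks "a" and "c" are an arbitrary choice.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
tables = [bfs_distances(adj, landmark) for landmark in ("a", "c")]
bounds = landmark_bounds(tables, "b", "e")
```

In this toy graph, landmark "a" alone gives a lower bound of 3 for the pair ("b", "e"), while landmark "c" gives only 1, which illustrates why the choice of landmarks affects how tight the lower bound is.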
Acknowledgment
The corresponding author is Peng Peng. This work was supported by NSFC under grants 61702171, 61772191, and 61472131; the Hunan Provincial Natural Science Foundation of China under grant 2018JJ3065; the Fundamental Research Funds for the Central Universities; the Science and Technology Key Projects of Hunan Province (Grant Nos. 2015TP1004, 2016JC2012); and Changsha science and technology project kq1801008.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, M., Peng, P., Xu, Y., Xia, H., Qin, Z. (2019). Distributed Landmark Selection for Lower Bound Estimation of Distances in Large Graphs. In: Shao, J., Yiu, M., Toyoda, M., Zhang, D., Wang, W., Cui, B. (eds) Web and Big Data. APWeb-WAIM 2019. Lecture Notes in Computer Science, vol 11641. Springer, Cham. https://doi.org/10.1007/978-3-030-26072-9_16
DOI: https://doi.org/10.1007/978-3-030-26072-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26071-2
Online ISBN: 978-3-030-26072-9
eBook Packages: Computer Science, Computer Science (R0)