Distributed Landmark Selection for Lower Bound Estimation of Distances in Large Graphs

  • Conference paper
Web and Big Data (APWeb-WAIM 2019)

Abstract

Computing the distance between two vertices is a fundamental operation on graphs. However, classical exact methods often cannot scale to the rapidly evolving graphs found in recent applications. Many approximate methods have been proposed, including landmark-based methods that scale well and estimate an upper bound of the distance with acceptable accuracy. In this paper, we propose a new landmark-based framework, built on a new measure called coverage, that estimates the lower bound of the distance more accurately. Although selecting the optimal set of landmarks is provably NP-hard, we propose a heuristic algorithm with a guaranteed approximation ratio. Furthermore, we implement our method on distributed graph processing systems, taking the characteristics of these systems into account. Experiments on large real graphs confirm the superiority of our method.
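
The abstract does not define the coverage measure or the selection heuristic, so neither is reproduced here. As background only, the following minimal sketch shows the standard way a set of landmarks yields a lower bound on a distance, via the triangle inequality |d(l, u) - d(l, v)| <= d(u, v). It assumes an undirected, unweighted graph stored as an adjacency dictionary; the function names (`bfs_distances`, `build_index`, `lower_bound`) and the toy graph are hypothetical and not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def build_index(adj, landmarks):
    """Precompute one BFS distance table per landmark."""
    return {l: bfs_distances(adj, l) for l in landmarks}


def lower_bound(index, u, v):
    """Largest |d(l, u) - d(l, v)| over all landmarks l that reach both u and v."""
    best = 0
    for dist in index.values():
        if u in dist and v in dist:
            best = max(best, abs(dist[u] - dist[v]))
    return best


# Toy graph (an assumption for illustration): path 0-1-2-3-4 with a branch 2-5.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}

one_landmark = build_index(adj, [0])
print(lower_bound(one_landmark, 1, 4))   # 3, equal to the true distance
print(lower_bound(one_landmark, 4, 5))   # 1, while the true distance is 3

two_landmarks = build_index(adj, [0, 4])
print(lower_bound(two_landmarks, 4, 5))  # 3, tight once vertex 4 is a landmark
```

In this toy case the quality of the bound depends heavily on which vertices serve as landmarks, which is the kind of question a landmark selection method has to answer.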


Acknowledgment

The corresponding author is Peng Peng. This work was supported by NSFC under grants 61702171, 61772191 and 61472131, Hunan Provincial Natural Science Foundation of China under grant 2018JJ3065, the Fundamental Research Funds for the Central Universities, Science and Technology Key Projects of Hunan Province (Grant No. 2015TP1004, 2016JC2012), and Changsha science and technology project kq1801008.

Corresponding author

Correspondence to Peng Peng.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, M., Peng, P., Xu, Y., Xia, H., Qin, Z. (2019). Distributed Landmark Selection for Lower Bound Estimation of Distances in Large Graphs. In: Shao, J., Yiu, M., Toyoda, M., Zhang, D., Wang, W., Cui, B. (eds) Web and Big Data. APWeb-WAIM 2019. Lecture Notes in Computer Science, vol 11641. Springer, Cham. https://doi.org/10.1007/978-3-030-26072-9_16

  • DOI: https://doi.org/10.1007/978-3-030-26072-9_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26071-2

  • Online ISBN: 978-3-030-26072-9

  • eBook Packages: Computer Science, Computer Science (R0)
