Delta-K²-tree for Compact Representation of Web Graphs

  • Conference paper
Web Technologies and Applications (APWeb 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8709)

Abstract

The structure of the World Wide Web can be represented by a directed graph called the web graph. Web graphs are used in a wide range of applications. However, their rapidly growing scale poses great challenges to traditional memory-resident graph algorithms. In the literature, the K²-tree compresses web graphs efficiently while supporting fast queries directly on the compressed data. Inspired by the K²-tree, we propose the Delta-K²-tree compression approach, which exploits the similarity between neighboring nodes in web graphs. In addition, we design a node reordering algorithm to further improve the compression ratio. We compare our approach with state-of-the-art algorithms, including K²-tree, WebGraph, and AD. Experimental results on four web graph datasets show that the Delta-K²-tree outperforms the other three in compression ratio (1.66 to 2.55 bits per link) while still supporting fast forward and reverse queries on the graph.
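The abstract gives no implementation details, so the following is only a minimal Python sketch of the two ideas it names: a K²-tree over an adjacency matrix, and a delta step that exploits similar neighboring rows. Everything here is an assumption made for illustration, not the authors' method: the function names (`build_k2tree`, `has_link`, `delta_rows`, `restore_rows`), the choice k = 2, the XOR-with-previous-row rule, and the toy 4×4 `graph`. A real implementation would use rank-enabled bit vectors rather than Python lists.

```python
from collections import deque

def build_k2tree(matrix, k=2):
    """Return (T, L): internal-level bits and last-level bits, in BFS order.
    Assumes a square 0/1 matrix whose side is a power of k."""
    n = len(matrix)
    T, L = [], []
    queue = deque([(0, 0, n)])  # (top row, left column, side length)
    while queue:
        r, c, size = queue.popleft()
        sub = size // k
        for i in range(k):
            for j in range(k):
                r0, c0 = r + i * sub, c + j * sub
                bit = int(any(matrix[x][y]
                              for x in range(r0, r0 + sub)
                              for y in range(c0, c0 + sub)))
                if sub == 1:
                    L.append(bit)          # single cell: last level
                else:
                    T.append(bit)          # submatrix summary bit
                    if bit:
                        queue.append((r0, c0, sub))
    return T, L

def has_link(T, L, n, row, col, k=2):
    """Check cell (row, col) by descending the tree; the children of a 1-bit
    at position p start at rank1(T, p) * k*k (rank is a linear scan here)."""
    bits = T + L
    pos, size = 0, n
    while True:
        size //= k
        pos += (row // size) * k + (col // size)
        if bits[pos] == 0:
            return False
        if size == 1:
            return True
        row, col = row % size, col % size
        pos = sum(T[:pos + 1]) * k * k

def delta_rows(matrix):
    """Hypothetical delta step: store row i as XOR with row i-1 when that
    leaves fewer 1s; is_delta marks which rows were replaced."""
    delta, is_delta = [list(matrix[0])], [0]
    for prev, row in zip(matrix, matrix[1:]):
        diff = [a ^ b for a, b in zip(prev, row)]
        if sum(diff) < sum(row):
            delta.append(diff)
            is_delta.append(1)
        else:
            delta.append(list(row))
            is_delta.append(0)
    return delta, is_delta

def restore_rows(delta, is_delta):
    """Invert delta_rows by XOR-ing each delta row with the restored previous row."""
    rows = []
    for d, flag in zip(delta, is_delta):
        rows.append([a ^ b for a, b in zip(rows[-1], d)] if flag else list(d))
    return rows

# Toy adjacency matrix (made up): rows 0-2 have similar out-links.
graph = [[1, 1, 0, 1],
         [1, 1, 0, 1],
         [1, 1, 0, 0],
         [0, 0, 1, 0]]

T, L = build_k2tree(graph)
delta, is_delta = delta_rows(graph)
dT, dL = build_k2tree(delta)
print("plain K2-tree bits:", len(T) + len(L))
print("delta K2-tree bits:", len(dT) + len(dL), "plus", len(is_delta), "flag bits")
```

In this sketch, similar neighboring rows turn into sparse delta rows, so more submatrices become all-zero and are pruned from the tree. The price is that answering a link query on a delta row also requires consulting the preceding row, and the flag bits must be stored alongside the tree.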

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, Y., Xiong, G., Liu, Y., Liu, M., Liu, P., Guo, L. (2014). Delta-K²-tree for Compact Representation of Web Graphs. In: Chen, L., Jia, Y., Sellis, T., Liu, G. (eds) Web Technologies and Applications. APWeb 2014. Lecture Notes in Computer Science, vol 8709. Springer, Cham. https://doi.org/10.1007/978-3-319-11116-2_24

  • DOI: https://doi.org/10.1007/978-3-319-11116-2_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11115-5

  • Online ISBN: 978-3-319-11116-2

  • eBook Packages: Computer Science, Computer Science (R0)
