Abstract
The structure of the World Wide Web can be represented by a directed graph called the web graph. Web graphs are used in a wide range of applications, but their rapidly growing scale poses great challenges to traditional memory-resident graph algorithms. In the literature, the K²-tree can efficiently compress web graphs while supporting fast queries directly on the compressed data. Inspired by the K²-tree, we propose the Delta-K²-tree compression approach, which exploits the similarity between neighboring nodes in web graphs. In addition, we design a node reordering algorithm to further improve the compression ratio. We compare our approach with state-of-the-art algorithms, including K²-tree, WebGraph, and AD. Experimental results on four web graph datasets show that Delta-K²-tree outperforms the other three in compression ratio (1.66–2.55 bits per link) while still supporting fast forward and reverse queries on the graph.
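The abstract does not spell out the construction, so the following is only a minimal sketch of how similarity between neighboring nodes might be exploited. It assumes (this is not confirmed by the text above) that a row of the adjacency matrix is replaced by its XOR with the previous row whenever that leaves fewer 1s, with one flag bit per row recording the choice. The function names `delta_rows`, `restore_rows`, and `k2_tree_bits` are hypothetical, and the bit counter models a plain K²-tree without any of the rank/select machinery a real implementation would need.

```python
def delta_rows(adj):
    """Assumed delta step: store row XOR previous row when that has fewer 1s.

    Returns the transformed matrix and one flag per row (1 = stored as a diff).
    """
    out, flags = [], []
    prev = None
    for row in adj:
        diff = None if prev is None else [a ^ b for a, b in zip(row, prev)]
        if diff is not None and sum(diff) < sum(row):
            out.append(diff)
            flags.append(1)
        else:
            out.append(list(row))
            flags.append(0)
        prev = row  # always compare against the ORIGINAL previous row
    return out, flags


def restore_rows(delta, flags):
    """Invert delta_rows by XOR-ing flagged rows with the restored previous row."""
    rows = []
    for d, f in zip(delta, flags):
        if f:
            rows.append([a ^ b for a, b in zip(d, rows[-1])])
        else:
            rows.append(list(d))
    return rows


def k2_tree_bits(matrix, k=2):
    """Count the bits of a plain K^2-tree for a square 0/1 matrix.

    The side length is assumed to be a power of k. Every visited node emits
    k*k bits, one per submatrix; only non-empty submatrices are expanded.
    """
    def has_one(r0, c0, size):
        return any(matrix[r][c]
                   for r in range(r0, r0 + size)
                   for c in range(c0, c0 + size))

    def count(r0, c0, size):
        step = size // k
        bits = 0
        for i in range(k):
            for j in range(k):
                bits += 1  # one bit for this child submatrix (or cell)
                if step > 1 and has_one(r0 + i * step, c0 + j * step, step):
                    bits += count(r0 + i * step, c0 + j * step, step)
        return bits

    return count(0, 0, len(matrix))


if __name__ == "__main__":
    # Four nodes with identical out-links: an extreme case of neighbor similarity.
    adj = [[1, 0, 0, 1]] * 4
    delta, flags = delta_rows(adj)
    assert restore_rows(delta, flags) == adj
    print(k2_tree_bits(adj), k2_tree_bits(delta) + len(flags))
```

In this toy input every row after the first becomes all zeros, so whole quadrants of the matrix are empty and the tree stops expanding them. The trade-off in such a scheme is that reading a flagged row requires restoring the rows it depends on, which is why the sketch keeps the per-row flags alongside the tree.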
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, Y., Xiong, G., Liu, Y., Liu, M., Liu, P., Guo, L. (2014). Delta-K²-tree for Compact Representation of Web Graphs. In: Chen, L., Jia, Y., Sellis, T., Liu, G. (eds) Web Technologies and Applications. APWeb 2014. Lecture Notes in Computer Science, vol 8709. Springer, Cham. https://doi.org/10.1007/978-3-319-11116-2_24
DOI: https://doi.org/10.1007/978-3-319-11116-2_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11115-5
Online ISBN: 978-3-319-11116-2
eBook Packages: Computer Science (R0)