Towards Scaling Fully Personalized PageRank

Fogaras, Dániel; Rácz, Balázs

doi:10.1007/978-3-540-30216-2_9

Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3243))

Abstract

Personalized PageRank expresses backlink-based page quality around user-selected pages, much as PageRank expresses quality over the entire Web. Existing personalized PageRank algorithms can, however, serve on-line queries only for a restricted choice of selected pages. In this paper we achieve full personalization with a novel algorithm that computes a compact database of simulated random walks; this database can serve arbitrary personal choices of small subsets of web pages. We prove that, for a fixed error probability, the size of our database is linear in the number of web pages. We justify our estimation approach by asymptotic worst-case lower bounds: we show that exact personalized PageRank values can only be obtained from a database of quadratic size.
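
The idea of answering personalized queries from precomputed random walks can be illustrated with a small Monte Carlo sketch. This is not the paper's implementation. It assumes that each walk stops with a fixed probability c at every step, that only the endpoint of each walk is stored, and that a walk reaching a page without out-links jumps back to its start page. The function names, the toy graph layout (a dict of out-link lists) and the default c = 0.15 are all hypothetical.

```python
import random
from collections import Counter


def walk_endpoint(graph, start, c, rng):
    """Return the endpoint of one random walk that stops with probability c per step."""
    node = start
    while rng.random() >= c:
        out_links = graph.get(node, [])
        if out_links:
            node = rng.choice(out_links)
        else:
            # Assumption: a walk stuck on a page without out-links restarts at its start page.
            node = start
    return node


def build_database(graph, walks_per_page, c=0.15, seed=0):
    """Store walks_per_page walk endpoints for every page (total size: pages * walks_per_page)."""
    rng = random.Random(seed)
    return {
        page: [walk_endpoint(graph, page, c, rng) for _ in range(walks_per_page)]
        for page in graph
    }


def personalized_pagerank(database, preferred_pages):
    """Estimate personalized PageRank for a uniform preference over preferred_pages."""
    counts = Counter()
    for page in preferred_pages:
        counts.update(database[page])
    total = sum(counts.values())
    # The empirical distribution of stored endpoints is the estimate.
    return {page: n / total for page, n in counts.items()}


# Toy graph: every page appears as a key; "d" has no out-links.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"], "e": []}
toy_database = build_database(toy_graph, walks_per_page=500)
toy_scores = personalized_pagerank(toy_database, ["a", "d"])
```

In this sketch the database stores a fixed number of endpoints per page, so for a fixed walks_per_page its size grows linearly with the number of pages, and any small set of preferred pages can be queried without recomputation. How many walks per page are needed for a given error probability is the subject of the paper's analysis and is not derived here.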

Research was supported by grants OTKA T 42559 and T 42706 of the Hungarian National Science Fund, and NKFP-2/0017/2002 project Data Riddle.

Author information

Authors and Affiliations

Computer and Automation Research Institute of the Hungarian Academy of Sciences, Hungary
Dániel Fogaras & Balázs Rácz
Budapest University of Technology and Economics, Hungary
Dániel Fogaras & Balázs Rácz

Editor information

Editors and Affiliations

University of Rome “La Sapienza”, Rome, Italy
Stefano Leonardi

Cite this paper

Fogaras, D., Rácz, B. (2004). Towards Scaling Fully Personalized PageRank. In: Leonardi, S. (eds) Algorithms and Models for the Web-Graph. WAW 2004. Lecture Notes in Computer Science, vol 3243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30216-2_9

DOI: https://doi.org/10.1007/978-3-540-30216-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23427-2
Online ISBN: 978-3-540-30216-2
eBook Packages: Springer Book Archive