
Traps and Pitfalls of Topic-Biased PageRank

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4936)

Abstract

We discuss a number of issues in the definition, computation and comparison of PageRank values that have been addressed only sparsely in the literature, often with contradictory approaches. We study the difference between weakly and strongly preferential PageRank, which patch the dangling nodes with different distributions. We extend analytical formulae known for the strongly preferential case and corroborate our results with experiments on a snapshot of 100 million pages of the .uk domain. The experiments show that the two PageRank versions are poorly correlated, and that results about one cannot be blindly applied to the other; moreover, our computations highlight some new concerns about the use of exchange-based correlation indices (such as Kendall’s τ) on approximated rankings.
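To make the distinction in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common convention that strongly preferential PageRank redistributes the rank of dangling nodes according to the preference vector itself, while weakly preferential PageRank uses a different distribution (here, as an assumption, the uniform one). The function names `pagerank` and `kendall_tau`, the damping factor 0.85, the toy graph, and the topic vector are all hypothetical choices made for illustration.

```python
from itertools import combinations


def pagerank(out_links, v, u, alpha=0.85, iters=200):
    """Power iteration with preference vector v and dangling distribution u.

    out_links[i] lists the successors of node i; an empty list marks a
    dangling node. Both v and u are assumed to sum to 1.
    """
    n = len(out_links)
    r = list(v)
    for _ in range(iters):
        nxt = [0.0] * n
        dangling_mass = 0.0
        for i, targets in enumerate(out_links):
            if targets:
                share = r[i] / len(targets)
                for j in targets:
                    nxt[j] += share
            else:
                # Rank held by dangling nodes is patched in through u below.
                dangling_mass += r[i]
        r = [alpha * (nxt[j] + dangling_mass * u[j]) + (1 - alpha) * v[j]
             for j in range(n)]
    return r


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical toy graph: node 3 has no out-links, so it is dangling.
out_links = [[1, 2], [2], [0, 3], []]
n = len(out_links)
topic = [0.7, 0.1, 0.1, 0.1]        # illustrative topic-biased preference
uniform = [1.0 / n] * n

strong = pagerank(out_links, v=topic, u=topic)    # dangling nodes follow v
weak = pagerank(out_links, v=topic, u=uniform)    # dangling nodes follow uniform
tau = kendall_tau(strong, weak)
print(strong, weak, tau)
```

The quadratic pair enumeration in `kendall_tau` is only practical on toy inputs; it is used here because it mirrors the exchange-based definition directly. On a graph this small the two rankings may well agree closely, so the sketch illustrates the definitions rather than the poor correlation reported on the .uk snapshot.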




Editor information

William Aiello Andrei Broder Jeannette Janssen Evangelos Milios


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Boldi, P., Posenato, R., Santini, M., Vigna, S. (2008). Traps and Pitfalls of Topic-Biased PageRank. In: Aiello, W., Broder, A., Janssen, J., Milios, E. (eds) Algorithms and Models for the Web-Graph. WAW 2006. Lecture Notes in Computer Science, vol 4936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78808-9_10


  • DOI: https://doi.org/10.1007/978-3-540-78808-9_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78807-2

  • Online ISBN: 978-3-540-78808-9

  • eBook Packages: Computer Science, Computer Science (R0)
