
Improving Random Walk Estimation Accuracy with Uniform Restarts

  • Konstantin Avrachenkov
  • Bruno Ribeiro
  • Don Towsley
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6516)

Abstract

This work proposes and studies the properties of a hybrid sampling scheme that mixes independent uniform node sampling with random walk (RW)-based crawling. We show that our sampling method combines the strengths of both uniform and RW sampling while minimizing their drawbacks. In particular, our method increases the spectral gap of the random walk and hence accelerates convergence to the stationary distribution. The proposed method resembles PageRank but, unlike PageRank, preserves time-reversibility. Applying our hybrid RW to the problem of estimating degree distributions of graphs shows promising results.
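The abstract does not spell out the construction, so the following is only a minimal sketch of one way such a hybrid walk could be realized. It assumes that every node is joined to a virtual restart node by an edge of weight alpha, so that a walker at a node of degree d restarts at a uniformly chosen node with probability alpha / (d + alpha) and otherwise follows a random edge. Under that assumption the chain is a random walk on a symmetric weighted graph, hence time-reversible, with stationary probabilities proportional to d + alpha. The names `hybrid_walk`, `estimate_degree_distribution`, and `alpha` are illustrative and are not taken from the paper.

```python
import random
from collections import Counter


def hybrid_walk(adj, alpha, steps, seed=0):
    """Random walk with degree-dependent uniform restarts (illustrative sketch).

    adj   : dict mapping each node to a list of its neighbours (undirected graph)
    alpha : assumed weight (> 0) of the edge from every node to a virtual restart node
    steps : number of visited nodes to record
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    nodes = list(adj)
    current = rng.choice(nodes)
    visited = []
    for _ in range(steps):
        visited.append(current)
        degree = len(adj[current])
        # Restart uniformly with probability alpha / (degree + alpha);
        # otherwise move to a uniformly chosen neighbour.
        if rng.random() < alpha / (degree + alpha):
            current = rng.choice(nodes)
        else:
            current = rng.choice(adj[current])
    return visited


def estimate_degree_distribution(adj, visited, alpha):
    """Reweight visits by 1 / (degree + alpha) to undo the stationary bias."""
    weights = Counter()
    for node in visited:
        degree = len(adj[node])
        weights[degree] += 1.0 / (degree + alpha)
    total = sum(weights.values())
    return {degree: w / total for degree, w in weights.items()}


# Usage on a small star graph: one hub of degree 4 and four leaves of degree 1.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
walk = hybrid_walk(star, alpha=1.0, steps=20000, seed=1)
estimate = estimate_degree_distribution(star, walk, alpha=1.0)
```

In this sketch a larger alpha makes the walk behave more like uniform node sampling, while a smaller alpha makes it behave more like a plain random walk. The reweighting step is ordinary importance sampling against the assumed stationary distribution.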

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Konstantin Avrachenkov (1)
  • Bruno Ribeiro (2)
  • Don Towsley (2)
  1. INRIA, Sophia-Antipolis, France
  2. Dept. of Computer Science, University of Massachusetts Amherst, Amherst, USA
