Fast Katz and Commuters: Efficient Estimation of Social Relatedness in Large Networks

  • Pooya Esfandiar
  • Francesco Bonchi
  • David F. Gleich
  • Chen Greif
  • Laks V. S. Lakshmanan
  • Byung-Won On
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6516)

Abstract

Motivated by social network data mining problems such as link prediction and collaborative filtering, significant research effort has been devoted to computing topological measures including the Katz score and the commute time. Existing approaches typically approximate all pairwise relationships simultaneously. In this paper, we are interested in computing (1) the score for a single pair of nodes and (2) the top-k nodes with the best scores from a given source node. For the pairwise problem, we apply an iterative algorithm that computes upper and lower bounds for the measures we seek. This algorithm exploits a relationship between the Lanczos process and a quadrature rule. For the top-k problem, we propose an algorithm that only accesses a small portion of the graph and is related to techniques used in personalized PageRank computing. To test the scalability and accuracy of our algorithms, we experiment with three real-world networks and find that these algorithms run in milliseconds to seconds without any preprocessing.
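For orientation, the two measures and the two query types named in the abstract can be illustrated with a small dense baseline. This is only a sketch and not the paper's method: it solves the linear systems directly rather than using the Lanczos/quadrature bounds or the local top-k algorithm. It assumes the standard definitions, which the abstract does not spell out: the Katz score as the (i, j) entry of (I - alpha*A)^{-1} - I for an undirected adjacency matrix A and a damping parameter alpha below the reciprocal of the largest eigenvalue of A, and the commute time as vol(G) * (L+_ii + L+_jj - 2*L+_ij), where L+ is the pseudoinverse of the graph Laplacian. The function names and the parameter alpha are illustrative choices.

```python
import numpy as np


def katz_pair(A, i, j, alpha):
    """Katz score between nodes i and j via one dense linear solve.

    Assumes A is a symmetric adjacency matrix and alpha < 1 / lambda_max(A),
    so that the series sum_{l >= 1} alpha^l A^l converges.
    """
    n = A.shape[0]
    e_j = np.zeros(n)
    e_j[j] = 1.0
    # x is column j of (I - alpha*A)^{-1}
    x = np.linalg.solve(np.eye(n) - alpha * A, e_j)
    # Subtract the identity term, which only affects the diagonal
    return x[i] - (1.0 if i == j else 0.0)


def commute_time_pair(A, i, j):
    """Commute time between i and j for a connected undirected graph."""
    degrees = A.sum(axis=1)
    laplacian = np.diag(degrees) - A
    l_pinv = np.linalg.pinv(laplacian)
    # vol(G) times the effective resistance between i and j
    return degrees.sum() * (l_pinv[i, i] + l_pinv[j, j] - 2.0 * l_pinv[i, j])


def katz_top_k(A, source, k, alpha):
    """Indices of the k nodes with the largest Katz score from `source`."""
    n = A.shape[0]
    e_s = np.zeros(n)
    e_s[source] = 1.0
    scores = np.linalg.solve(np.eye(n) - alpha * A, e_s)
    scores[source] = -np.inf  # exclude the source node itself
    order = np.argsort(-scores)
    return [int(v) for v in order[:k]]
```

Each call above costs a dense solve or pseudoinverse over the whole graph, which is the kind of global work that the single-pair and top-k algorithms described in the abstract are designed to avoid on large networks.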

Keywords

  • Conjugate Gradient Method
  • Quadrature Rule
  • Link Prediction
  • Conjugate Gradient Algorithm
  • Commute Time
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Pooya Esfandiar (1)
  • Francesco Bonchi (2)
  • David F. Gleich (3)
  • Chen Greif (1)
  • Laks V. S. Lakshmanan (1)
  • Byung-Won On (1)

  1. University of British Columbia, Vancouver, Canada
  2. Yahoo! Research, Barcelona, Spain
  3. Sandia National Laboratories, Livermore, USA
