Within-Network Classification Using Local Structure Similarity

  • Christian Desrosiers
  • George Karypis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5781)


Within-network classification, where the goal is to classify the nodes of a partly labeled network, is a semi-supervised learning problem with applications in several important domains, such as image processing, document classification, and the detection of malicious activities. Most methods for this problem infer the missing labels collectively, based on the hypothesis that linked or nearby nodes are likely to have the same labels. For many types of networks, such as molecular graphs and trading networks, this assumption fails. In this paper, we present a collective classification method, based on relaxation labeling, that classifies the entities of a network using their local structure. This method uses a marginalized similarity kernel that compares the local structure of two nodes by means of random walks in the network. Through experimentation on different datasets, we show our method to be more accurate than several state-of-the-art approaches for this problem.
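To make the idea of comparing local structure with random walks concrete, the following is a minimal illustrative sketch. It is not the kernel defined in the paper: the function name `walk_similarity`, the parameters `steps` and `cont`, and the choice of node degree as the compared feature are all assumptions made for this example. It scores two nodes by how likely two uniform random walks, started at each node, are to see matching features step by step.

```python
from itertools import product


def walk_similarity(adj, feat, steps=3, cont=0.5):
    """Score every node pair by jointly simulated random walks (illustrative only).

    adj:   dict mapping node -> list of neighbours (undirected graph)
    feat:  dict mapping node -> discrete feature compared along the walks
    steps: number of recursion rounds, i.e. maximum walk length (assumed parameter)
    cont:  probability that both walks continue at each step (assumed parameter)
    """
    nodes = list(adj)
    # Length-0 walks match exactly when the two start nodes share a feature.
    match = {(u, v): float(feat[u] == feat[v]) for u, v in product(nodes, nodes)}
    sim = dict(match)
    for _ in range(steps):
        new = {}
        for u, v in product(nodes, nodes):
            if adj[u] and adj[v]:
                # Average previous-round similarity over uniformly chosen next steps.
                total = sum(sim[a, b] for a in adj[u] for b in adj[v])
                avg = total / (len(adj[u]) * len(adj[v]))
            else:
                avg = 0.0  # a walk cannot continue from an isolated node
            # Either both walks stop here, or they continue and must keep matching.
            new[u, v] = match[u, v] * ((1.0 - cont) + cont * avg)
        sim = new
    return sim


# Path graph a - b - c - d, with node degree as the structural feature.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
feat = {n: len(nbrs) for n, nbrs in adj.items()}
sim = walk_similarity(adj, feat)
```

In this toy graph the two end nodes `a` and `d` are not linked, yet they receive a high score because their neighbourhoods look alike, whereas the linked nodes `a` and `b` score zero. This is the kind of situation the abstract describes, where label similarity follows structure rather than proximity.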


Keywords: network, semi-supervised learning, random walk



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Christian Desrosiers, Department of Computer Science & Engineering, University of Minnesota, Twin Cities
  • George Karypis, Department of Computer Science & Engineering, University of Minnesota, Twin Cities
