HetPathMine: A Novel Transductive Classification Algorithm on Heterogeneous Information Networks

  • Conference paper
Advances in Information Retrieval (ECIR 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8416)

Abstract

Transductive classification (TC) uses a small amount of labeled data to help classify all the unlabeled data in an information network, and it is an important data mining task on such networks. Various classification methods have been proposed for this task. However, most of them are designed for homogeneous networks rather than heterogeneous ones, which include multi-typed objects and relations and may contain more useful semantic information. In this paper, we first use the concept of a meta path to represent the different relation paths in heterogeneous networks and propose a novel meta path selection model. We then extend the transductive classification problem to heterogeneous information networks and propose a novel algorithm, named HetPathMine. The experimental results show that (1) HetPathMine achieves higher accuracy than existing transductive classification methods and (2) the weight that HetPathMine obtains for each meta path is consistent with human intuition or real-world situations.
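The abstract does not spell out the selection model or the HetPathMine algorithm, so the following is only a minimal sketch of the general idea it names, not the authors' method. It assumes a hypothetical toy bibliographic network (authors, papers, venues), builds two meta-path relations between authors as products of adjacency matrices, mixes them with fixed, hand-picked weights (a learned weighting would replace these), and labels the unlabeled author by a one-step weighted vote. All names and data here are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def combine(relations, weights):
    """Weighted sum of several same-shaped meta-path matrices."""
    n = len(relations[0])
    return [
        [sum(w * r[i][j] for r, w in zip(relations, weights)) for j in range(n)]
        for i in range(n)
    ]


def vote(similarity, labels):
    """One-step weighted vote: each unlabeled node takes the class with the
    largest total similarity to labeled nodes (None if it reaches none)."""
    predicted = {}
    for i, row in enumerate(similarity):
        if i in labels:
            predicted[i] = labels[i]
            continue
        scores = {}
        for j, s in enumerate(row):
            if j in labels and s > 0:
                scores[labels[j]] = scores.get(labels[j], 0) + s
        predicted[i] = max(scores, key=scores.get) if scores else None
    return predicted


# Hypothetical toy data: 3 authors, 3 papers, 2 venues.
author_paper = [
    [1, 1, 0],  # author 0 wrote papers 0 and 1
    [0, 1, 0],  # author 1 wrote paper 1
    [0, 0, 1],  # author 2 wrote paper 2
]
paper_venue = [
    [1, 0],  # paper 0 appeared at venue 0
    [1, 0],  # paper 1 appeared at venue 0
    [0, 1],  # paper 2 appeared at venue 1
]

# Meta path Author-Paper-Author: counts of co-authored papers.
apa = matmul(author_paper, transpose(author_paper))

# Meta path Author-Paper-Venue-Paper-Author: counts of shared-venue path instances.
apv = matmul(author_paper, paper_venue)
apvpa = matmul(apv, transpose(apv))

# Assumed fixed weights; a selection model would estimate these instead.
similarity = combine([apa, apvpa], [0.5, 0.5])

# Only authors 0 and 2 are labeled; author 1 is to be classified.
predicted = vote(similarity, {0: "DB", 2: "ML"})
print(predicted)
```

In this toy network author 1 is connected to author 0 through both meta paths and to author 2 through neither, so the vote assigns author 1 the label of author 0.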

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Luo, C., Guan, R., Wang, Z., Lin, C. (2014). HetPathMine: A Novel Transductive Classification Algorithm on Heterogeneous Information Networks. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_18

  • DOI: https://doi.org/10.1007/978-3-319-06028-6_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06027-9

  • Online ISBN: 978-3-319-06028-6

  • eBook Packages: Computer Science, Computer Science (R0)
