A New Relevance Measure for Heterogeneous Networks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9263))

Abstract

Measuring relatedness between objects (nodes) in a heterogeneous network is a challenging and interesting problem. Many approaches transform a heterogeneous network into a homogeneous network before applying a similarity measure. However, such a transformation loses information because path semantics are discarded. In this paper, we study the problem of measuring relatedness between objects in a heterogeneous network using only link information and propose a novel meta-path-based measure for relevance measurement in a general heterogeneous network with a specified network schema. The proposed measure is a semi-metric and incorporates path semantics by following the specified meta-path. For relevance measurement, the given heterogeneous network is converted, using the specified meta-path, into a bipartite network consisting only of the source-type and target-type objects between which relatedness is to be measured. To validate the effectiveness of the proposed measure, we compared its performance with existing relevance measures that are semi-metric and applicable to heterogeneous networks. Experiments were performed on the real-world bibliographic dataset DBLP. The results show that the proposed measure effectively measures relatedness between objects in a heterogeneous network and outperforms earlier measures in clustering and query tasks.
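The conversion to a bipartite network described in the abstract can be illustrated with a minimal sketch. It assumes the common convention of multiplying the adjacency matrices of consecutive relations along the meta-path, so that each entry counts the path instances between a source object and a target object. The abstract does not give the paper's actual construction or its relevance formula, so this shows only the reduction step under that assumption. The names `matmul` and `metapath_bipartite` and the toy author-paper-venue data are hypothetical.

```python
from functools import reduce


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def metapath_bipartite(adjacency_chain):
    """Collapse a meta-path into one source-by-target weight matrix.

    adjacency_chain holds one adjacency matrix per relation on the
    meta-path, in order. Entry [i][j] of the result is the number of
    path instances from source object i to target object j.
    This is an assumed construction, not the paper's stated method.
    """
    return reduce(matmul, adjacency_chain)


# Hypothetical toy network: 2 authors, 3 papers, 2 venues.
author_paper = [
    [1, 1, 0],  # author 0 wrote papers 0 and 1
    [0, 1, 1],  # author 1 wrote papers 1 and 2
]
paper_venue = [
    [1, 0],  # paper 0 appeared in venue 0
    [1, 0],  # paper 1 appeared in venue 0
    [0, 1],  # paper 2 appeared in venue 1
]

# Meta-path Author -> Paper -> Venue reduced to an author-venue bipartite graph.
author_venue = metapath_bipartite([author_paper, paper_venue])
print(author_venue)
```

Intermediate paper nodes no longer appear in the result, but the edge weights still reflect how many meta-path instances connect each author to each venue, which is the sense in which path semantics survive the reduction.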

Author information

Corresponding author

Correspondence to Mukul Gupta.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gupta, M., Kumar, P., Bhasker, B. (2015). A New Relevance Measure for Heterogeneous Networks. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2015. Lecture Notes in Computer Science, vol 9263. Springer, Cham. https://doi.org/10.1007/978-3-319-22729-0_13

  • DOI: https://doi.org/10.1007/978-3-319-22729-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22728-3

  • Online ISBN: 978-3-319-22729-0

  • eBook Packages: Computer Science, Computer Science (R0)
