Topical network embedding

  • Min Shi
  • Yufei Tang
  • Xingquan Zhu
  • Jianxun Liu
  • Haibo He

Abstract

Networked data involve complex information from multiple channels, including topology structures, node content, and/or node labels, where structure and content are often correlated but not always consistent. A typical scenario is the citation relationship among scholarly publications, where a paper is cited by others not because they have the same content, but because they share one or more subject matters. Although many network embedding methods take node content into consideration, they all treat node content as a simple flat word/attribute set and assume that connected nodes depend on each other with respect to all words or attributes. In this paper, we argue that considering topic-level semantic interactions between nodes is crucial to learning discriminative node embedding vectors. To model pairwise topic relevance between linked text nodes, we propose topical network embedding, where interactions between nodes are built on their shared latent topics. Accordingly, we propose a unified optimization framework that simultaneously learns topic representations from the network text content and node representations from the network structure. The structure modeling takes the learned topic representations as conditional context, under the principle that two nodes can infer each other contingent on their shared latent topics. Experiments on three real-world datasets demonstrate that our approach learns significantly better network representations, e.g., a 4.1% improvement in Micro-F1 over state-of-the-art methods on the Cora dataset. (The source code of the proposed method is available on GitHub: https://github.com/codeshareabc/TopicalNE.)
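The idea that two linked nodes "infer each other contingent on the shared latent topics" can be illustrated with a minimal sketch. This is not the authors' formulation or code. The function names, the use of a normalized elementwise product to find shared topics, and the sigmoid link score are all assumptions made for illustration. It assumes each node already has a topic distribution (for example, from a topic model) and that each topic has a vector of the same dimension as the node embeddings.

```python
import math


def shared_topic_weights(theta_u, theta_v):
    """Weight each topic by how strongly BOTH nodes express it.

    Assumption: the normalized elementwise product of the two
    topic distributions stands in for "shared latent topics".
    """
    joint = [a * b for a, b in zip(theta_u, theta_v)]
    total = sum(joint)
    if total == 0.0:
        # No topic in common: fall back to a uniform weighting.
        return [1.0 / len(joint)] * len(joint)
    return [j / total for j in joint]


def topic_context(weights, topic_vectors):
    """Weighted average of topic vectors, used as conditional context."""
    dim = len(topic_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, topic_vectors))
        for d in range(dim)
    ]


def link_probability(emb_u, emb_v, context):
    """Hypothetical score: node u plus the shared-topic context predicts node v."""
    score = sum((u + c) * v for u, c, v in zip(emb_u, context, emb_v))
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # Toy values, made up for this example only.
    theta_u = [0.8, 0.2, 0.0]
    theta_v = [0.5, 0.0, 0.5]
    topics = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

    weights = shared_topic_weights(theta_u, theta_v)
    context = topic_context(weights, topics)
    print(link_probability([0.1, -0.3], [0.4, 0.2], context))
```

In this sketch only the first topic is expressed by both nodes, so it alone shapes the context vector. A flat word/attribute model would instead let every word of both documents influence the link score.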

Keywords

Network embedding; Network representation; Topic model; Semantic mining

Notes

Acknowledgements

This work is supported in part by the US National Science Foundation (NSF) through Grants Nos. IIS-1763452 and CNS-1828181.

Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, USA
  2. School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
  3. Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, USA
