Sparse Relational Topic Models for Document Networks

Zhang, Aonan; Zhu, Jun; Zhang, Bo

doi:10.1007/978-3-642-40988-2_43

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8188)

Abstract

Learning latent representations plays a pivotal role in machine learning and in many application areas. Previous work on relational topic models (RTM) has shown promise in learning latent topical representations that describe relational document networks and predict pairwise links. However, under a probabilistic formulation with normalization constraints, RTM can be ineffective at controlling the sparsity of the topical representations, and it often needs strict mean-field assumptions for approximate inference. This paper presents sparse relational topic models (SRTM), a non-probabilistic formulation that controls sparsity effectively through a sparsity-inducing regularizer. The model also handles the imbalance issues of real networks by introducing different cost parameters for positive and negative links. The deterministic optimization problem of SRTM admits efficient coordinate descent algorithms. We also present a generalization that considers all pairwise topic interactions. Empirical results on several real network datasets demonstrate better link prediction performance, sparser latent representations, and faster running times than competitors under a probabilistic formulation.
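
The two ingredients named in the abstract, a sparsity-inducing regularizer and separate costs for positive and negative links, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the loss or the regularizer, so the logistic loss, the ℓ1 penalty on non-negative codes, the diagonal topic interaction, and every name here (`soft_threshold_nonneg`, `link_score`, `weighted_link_loss`, `cost_pos`, `cost_neg`, `eta`) are assumptions chosen for illustration.

```python
import math


def soft_threshold_nonneg(value, lam):
    """Proximal step for the penalty lam * x restricted to x >= 0.

    Assumed regularizer: an l1 penalty on a non-negative topic code.
    Entries smaller than lam are set exactly to zero, which is the
    source of sparsity in this kind of update.
    """
    return max(0.0, value - lam)


def link_score(theta_i, theta_j, eta):
    """Score a document pair with a diagonal topic interaction (assumed form).

    Computes sum_k eta[k] * theta_i[k] * theta_j[k].
    """
    return sum(e * a * b for e, a, b in zip(eta, theta_i, theta_j))


def weighted_link_loss(score, label, cost_pos, cost_neg):
    """Logistic loss on one pair, scaled by a class-specific cost.

    label is +1 for an observed link and -1 for a non-link. Setting
    cost_pos > cost_neg makes errors on the rare positive links count more.
    """
    cost = cost_pos if label == 1 else cost_neg
    margin = label * score
    # Numerically stable form of log(1 + exp(-margin)).
    return cost * (max(0.0, -margin) + math.log1p(math.exp(-abs(margin))))


if __name__ == "__main__":
    # Hypothetical 3-topic codes for two documents and interaction weights.
    theta_a = [soft_threshold_nonneg(v, 0.25) for v in (1.25, 0.1, 2.25)]
    theta_b = [3.0, 4.0, 0.5]
    eta = [1.0, 1.0, 2.0]
    s = link_score(theta_a, theta_b, eta)
    print(theta_a, s, weighted_link_loss(s, 1, cost_pos=10.0, cost_neg=1.0))
```

In a coordinate descent scheme, a shrinkage step of this kind would be applied to one coordinate at a time while the others are held fixed. The exact update depends on the objective defined in the full chapter text.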

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, China
Aonan Zhang, Jun Zhu & Bo Zhang

Editor information

Editors and Affiliations

Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001, Leuven, Belgium
Hendrik Blockeel
Fraunhofer IAIS, Department of Knowledge Discovery, University of Bonn, Schloss Birlinghoven, 53754, Sankt Augustin, Germany
Kristian Kersting
LIACS, Universiteit Leiden, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands
Siegfried Nijssen
Department of Computer Science and Engineering, Czech Technical University, Technicka 2, 16627, Prague 6, Czech Republic
Filip Železný

About this paper

Cite this paper

Zhang, A., Zhu, J., Zhang, B. (2013). Sparse Relational Topic Models for Document Networks. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2013. Lecture Notes in Computer Science, vol 8188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40988-2_43

DOI: https://doi.org/10.1007/978-3-642-40988-2_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40987-5
Online ISBN: 978-3-642-40988-2
eBook Packages: Computer Science, Computer Science (R0)
