Topic Models Conditioned on Relations

Wahabzada, Mirwaes; Xu, Zhao; Kersting, Kristian

doi:10.1007/978-3-642-15939-8_26

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6323)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Latent Dirichlet allocation is a fully generative statistical language model that has proven successful in capturing both the content and the topics of a corpus of documents. Recently, it was even shown that relations among documents, such as hyperlinks or citations, allow one to share information between documents and in turn to improve topic generation. Although these relational models are fully generative, in many situations we are not actually interested in predicting relations among documents. In this paper, we therefore present a Dirichlet-multinomial nonparametric regression topic model that places a Gaussian process prior on the joint document and topic distributions, where the prior is a function of the document relations. On networks of scientific abstracts and of Wikipedia documents, we show that this approach meets or exceeds the performance of several baseline topic models.
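To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a regularized Laplacian graph kernel built from a document link matrix, one Gaussian process draw per topic over the documents, and an exponential link that turns those draws into per-document Dirichlet parameters. The function names, the kernel choice, the `sigma` parameter, and the exponential link are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def regularized_laplacian_kernel(adjacency, sigma=1.0, jitter=1e-8):
    """Assumed kernel: K = (I + sigma * L)^-1, with L the graph Laplacian.

    `adjacency` is a symmetric (n, n) matrix of document links.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    kernel = np.linalg.inv(np.eye(n) + sigma * laplacian)
    # Symmetrize against round-off and add jitter for numerical stability.
    return 0.5 * (kernel + kernel.T) + jitter * np.eye(n)


def sample_topic_proportions(adjacency, num_topics, rng):
    """Draw relation-dependent Dirichlet parameters and topic proportions.

    Hypothetical generative sketch: linked documents get correlated
    latent values, hence similar Dirichlet parameters.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    kernel = regularized_laplacian_kernel(adjacency)
    # One GP draw per topic; result has shape (n_documents, num_topics).
    eta = rng.multivariate_normal(np.zeros(n), kernel, size=num_topics).T
    alpha = np.exp(eta)  # assumed exponential link keeps parameters positive
    theta = np.vstack([rng.dirichlet(a) for a in alpha])
    return alpha, theta


if __name__ == "__main__":
    # Toy chain of four documents: 0 - 1 - 2 - 3.
    links = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
    alpha, theta = sample_topic_proportions(links, 3, np.random.default_rng(0))
    print(alpha.shape, theta.sum(axis=1))
```

In this sketch the relations enter only through the kernel, so they condition the topic proportions without being modeled as outputs, which mirrors the abstract's point that predicting relations is often not the goal.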

Author information

Authors and Affiliations

Knowledge Discovery Department, Fraunhofer IAIS, Schloss Birlinghoven, 53754, Sankt Augustin, Germany
Mirwaes Wahabzada, Zhao Xu & Kristian Kersting

Editor information

Editors and Affiliations

Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria, Avenida de los Castros, s/n, 39071, Santander, Spain
José Luis Balcázar
Yahoo! Research Barcelona, Avinguda Diagonal 177, 08018, Barcelona, Spain
Francesco Bonchi
Yahoo! Research Barcelona, Avinguda Diagonal 177, 08018, Barcelona, Spain
Aristides Gionis
TAO, CNRS-INRIA-LRI, Université Paris-Sud, 91405, Orsay, France
Michèle Sebag

Cite this paper

Wahabzada, M., Xu, Z., Kersting, K. (2010). Topic Models Conditioned on Relations. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15939-8_26

DOI: https://doi.org/10.1007/978-3-642-15939-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15938-1
Online ISBN: 978-3-642-15939-8
eBook Packages: Computer Science, Computer Science (R0)
