Multimedia Tools and Applications, Volume 78, Issue 1, pp 141–160

Multi-modal max-margin supervised topic model for social event analysis

  • Feng Xue
  • Jianwei Wang
  • Shengsheng Qian
  • Tianzhu Zhang
  • Xueliang Liu
  • Changsheng Xu


In this paper, we propose a novel multi-modal max-margin supervised topic model (MMSTM) for social event analysis that jointly learns the event representation and the classifier in a unified framework. Compared with existing methods, the proposed MMSTM has several advantages. (1) It uses the max-margin classifier as a regularization term, so that the parameters of the generative model and of the classifier are learned jointly; both are estimated with Gibbs sampling by minimizing the expected loss. (2) It mines the multi-modal property by jointly learning the latent topic relevance among multiple modalities for social event representation, and it exploits supervised information through a discriminative max-margin classifier, which improves event classification performance. (3) To validate the model, we collect a large-scale real-world dataset for social event analysis; both qualitative and quantitative evaluation results demonstrate the effectiveness of the proposed MMSTM.
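The idea of using a max-margin classifier as a regularization term can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary labels in {-1, +1}, a linear classifier over concatenated per-modality topic proportions, and a single fixed set of topic assignments (one Gibbs sample) rather than an expectation over the posterior. The names `topic_proportions`, `hinge_loss`, `regularized_objective`, `eta`, and `c` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def topic_proportions(counts: Sequence[int]) -> List[float]:
    """Normalize one document's topic-assignment counts into proportions."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def hinge_loss(score: float, label: int, margin: float = 1.0) -> float:
    """Max-margin (hinge) loss for a binary label in {-1, +1}."""
    return max(0.0, margin - label * score)


def regularized_objective(
    neg_log_lik: float,
    text_counts: Sequence[Sequence[int]],
    image_counts: Sequence[Sequence[int]],
    labels: Sequence[int],
    eta: Sequence[float],
    c: float = 1.0,
    margin: float = 1.0,
) -> float:
    """Generative term plus c times the summed hinge loss of the classifier.

    neg_log_lik stands in for the topic model's negative log-likelihood
    (assumed to be computed elsewhere). Each document's features are its
    textual and visual topic proportions, concatenated.
    """
    total_hinge = 0.0
    for t, v, y in zip(text_counts, image_counts, labels):
        features = topic_proportions(t) + topic_proportions(v)
        score = sum(w * x for w, x in zip(eta, features))
        total_hinge += hinge_loss(score, y, margin)
    return neg_log_lik + c * total_hinge


if __name__ == "__main__":
    value = regularized_objective(
        neg_log_lik=10.0,
        text_counts=[[3, 1], [0, 4]],
        image_counts=[[2, 2], [1, 3]],
        labels=[1, 1],
        eta=[2.0, -2.0, 1.0, -1.0],
    )
    print(value)
```

In this toy setting the first document is classified with sufficient margin and contributes no loss, while the second violates the margin and raises the objective. Minimizing such a combined objective is what couples the topic assignments to the classifier: topic configurations that are easy to separate are preferred.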


Keywords: Social event classification; Multi-modal; Max-margin; Social media; Topic model



This work is supported by the National Key Research and Development Program of China (No. 2017YFB0803301) and by the National Natural Science Foundation of China (Nos. 61772170, 61472115, 61572498, 61532009, 61472379, 61572296).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Feng Xue (1), email author
  • Jianwei Wang (1)
  • Shengsheng Qian (2)
  • Tianzhu Zhang (2)
  • Xueliang Liu (1)
  • Changsheng Xu (1, 2)
  1. Hefei University of Technology, Hefei, China
  2. National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
