Predicting Relative Popularity via an End-to-End Multi-modality Model

  • Conference paper
Digital TV and Wireless Multimedia Communication (IFTC 2017)

Abstract

Popularity prediction is important for many applications, such as service design and network management. Among the factors that affect popularity, content plays a key role, especially when time-series data on historical consumption is unavailable. However, exploring the influence of content factors on popularity is difficult because content comes in an increasing number of heterogeneous modalities that interact in sophisticated ways. In this paper, we use several modalities to predict popularity. Moreover, because predicting the exact popularity count is difficult and of limited practical value, we instead aim to rank pairs of content items, a task called relative popularity prediction. We cast relative popularity prediction as a classification task and propose an end-to-end multi-modality model based on deep neural networks. The model combines visual and textual information, maps both to a common feature space, and implicitly models the interaction between them. Experimental results on real-world data demonstrate the effectiveness of our model.
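The pairwise formulation described in the abstract can be illustrated with a small sketch. It is not the paper's architecture: the linear projections, the element-wise product used for fusion, and all function names (`make_pairs`, `project`, `fuse`, `prob_first_more_popular`) are assumptions chosen for illustration, standing in for the deep networks the abstract mentions. The sketch shows how popularity counts can be turned into binary pair labels and how two modality vectors might be mapped to a common space before a pair is scored.

```python
import math


def make_pairs(items):
    """Turn (features, popularity) items into labelled pairs.

    The label is 1 if the first item is more popular, else 0.
    Ties are skipped because they carry no ranking signal.
    """
    pairs = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            (feat_a, pop_a), (feat_b, pop_b) = items[i], items[j]
            if pop_a == pop_b:
                continue
            pairs.append((feat_a, feat_b, 1 if pop_a > pop_b else 0))
    return pairs


def project(weights, vector):
    """Linear map of a modality-specific vector into the common space."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def fuse(visual, textual, w_visual, w_textual):
    """Project both modalities and combine them element-wise.

    The product is one simple (assumed) way to let the modalities interact.
    """
    z_visual = project(w_visual, visual)
    z_textual = project(w_textual, textual)
    return [a * b for a, b in zip(z_visual, z_textual)]


def prob_first_more_popular(item_a, item_b, w_visual, w_textual, w_out):
    """Probability that item_a outranks item_b.

    Each item is a (visual, textual) tuple. Both are scored from their
    fused representation; the sigmoid of the score difference is returned.
    """
    score_a = sum(w * h for w, h in zip(w_out, fuse(*item_a, w_visual, w_textual)))
    score_b = sum(w * h for w, h in zip(w_out, fuse(*item_b, w_visual, w_textual)))
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))
```

Scoring each item separately and comparing the scores keeps the prediction antisymmetric: swapping the two items in a pair yields the complementary probability. In a trained model the weights would be learned from the labelled pairs, for example by minimizing a cross-entropy loss.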

Corresponding author

Correspondence to Hongxiang Cai.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Cai, H., Zhang, Y., Wang, Y., Wang, X., Mei, J., Huang, Z. (2018). Predicting Relative Popularity via an End-to-End Multi-modality Model. In: Zhai, G., Zhou, J., Yang, X. (eds) Digital TV and Wireless Multimedia Communication. IFTC 2017. Communications in Computer and Information Science, vol 815. Springer, Singapore. https://doi.org/10.1007/978-981-10-8108-8_32

  • DOI: https://doi.org/10.1007/978-981-10-8108-8_32

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8107-1

  • Online ISBN: 978-981-10-8108-8

  • eBook Packages: Computer Science, Computer Science (R0)
