World Wide Web, Volume 22, Issue 4, pp 1639–1655

A layer-wise deep stacking model for social image popularity prediction

  • Zehang Lin
  • Feitao Huang
  • Yukun Li
  • Zhenguo Yang
  • Wenyin Liu


In this paper, we present a Layer-wise Deep Stacking (LDS) model to predict the popularity of Flickr-like social posts. LDS stacks multiple regression models in multiple layers, so that the different models complement and reinforce each other. To avoid overfitting, a dropout module randomly activates the data fed into the regression models in each layer. In addition, a detector determines the depth of LDS automatically by monitoring the performance achieved with the features produced by the LDS layers. Extensive experiments on a public dataset of 432K Flickr image posts demonstrate the effectiveness and significance of the LDS model and its components. LDS achieves a Spearman's Rho of 83.50%, an MAE of 1.038, and an MSE of 2.011, outperforming state-of-the-art approaches for social image popularity prediction.
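The three metrics reported above can be computed as in the following minimal sketch. It is not the authors' evaluation code: the function names and the toy scores are assumptions made for illustration, and Spearman's Rho is computed here as the Pearson correlation of average ranks, which is the standard definition when ties are present.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Spearman's rank correlation between true and predicted scores."""
    return pearson(average_ranks(y_true), average_ranks(y_pred))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical popularity scores, for illustration only (not from the paper).
toy_true = [1.0, 2.0, 3.0, 4.0]
toy_pred = [1.5, 1.0, 3.0, 5.0]
toy_rho = spearman_rho(toy_true, toy_pred)  # 0.8
toy_mae = mae(toy_true, toy_pred)           # 0.625
toy_mse = mse(toy_true, toy_pred)           # 0.5625
```

Spearman's Rho depends only on the ordering of the predictions, so it rewards ranking posts correctly even when the predicted scores are off in scale, while MAE and MSE penalise the size of the errors directly. A value such as 83.50% corresponds to 0.8350 on the scale returned by `spearman_rho`.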


  • Social media analysis
  • Social image popularity prediction
  • Stacking model
  • Regression



This work is supported by the National Natural Science Foundation of China (No. 61703109, No. 91748107), and the Guangdong Innovative Research Team Program (No. 2014ZT05G157).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
