A Cross-Modal CCA-Based Astroturfing Detection Approach

  • Xiaoxuan Bai
  • Yingxiao Xiang
  • Wenjia Niu
  • Jiqiang Liu
  • Tong Chen
  • Jingjing Liu
  • Tong Wu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10631)


In recent years, astroturfing has generated abnormal, damaging, and even illegal behavior in cyberspace, which may mislead public perception and harm both Internet users and society. This paper aims to design an algorithm that detects astroturfing in online shopping effectively and helps users identify potential online astroturfers quickly. Previous work relied on a single modality, comparing either text with text or image with image, to detect astroturfing; in this paper we propose, for the first time, a cross-modal canonical correlation analysis model (CCCA) that combines text and images. First, we identify several features of astroturfing and analyze them. Then, we use a feature extraction algorithm, an image similarity algorithm, and the CCA algorithm to build a cross-modal method for detecting astroturfers who post comments with pictures. We also conduct an experiment on a Taobao dataset to verify our method. The experimental results show that the proposed supervised method is effective.
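The abstract does not describe how the CCA step is implemented, so the following is only a minimal sketch of linear CCA applied to paired text and image feature matrices, not the authors' CCCA model. The function name `fit_cca`, the regularization parameter `reg`, and the synthetic `text_feats` and `image_feats` arrays are all assumptions introduced for illustration; real inputs would come from the feature extraction stage.

```python
import numpy as np


def fit_cca(X, Y, n_components=1, reg=1e-6):
    """Fit linear CCA on paired samples (rows of X and Y correspond).

    Returns projection matrices Wx, Wy and the canonical correlations.
    `reg` is a small ridge term (an assumption) that keeps the
    covariance matrices invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    Wx = Kx @ U[:, :n_components]
    Wy = Ky @ Vt.T[:, :n_components]
    return Wx, Wy, s[:n_components]


# Synthetic stand-ins for text and image features that share one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
text_feats = latent @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(200, 5))
image_feats = latent @ rng.normal(size=(1, 4)) + 0.1 * rng.normal(size=(200, 4))

Wx, Wy, corr = fit_cca(text_feats, image_feats)

# Project both modalities into the shared space and compare them there.
text_proj = (text_feats - text_feats.mean(axis=0)) @ Wx
image_proj = (image_feats - image_feats.mean(axis=0)) @ Wy
projected_corr = np.corrcoef(text_proj[:, 0], image_proj[:, 0])[0, 1]
```

In a sketch like this, a comment whose text and picture project to strongly correlated coordinates would be treated as consistent across modalities; how such a score feeds into a supervised astroturfing classifier is left open here, since the abstract does not specify it.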


Keywords: Astroturfing detection; Canonical correlation analysis algorithm; Cross-modal method; CCCA model



This material is based upon work supported by the National Natural Science Foundation of China (Grant Nos. 61672092, 61502030, 61672091), the Science and Technology on Information Assurance Laboratory (No. 614200103011711), the BM-IIE Project (No. BMK2017B02-2), the Fundamental Research Funds for the Central Universities (No. 2017RC016), and the National High Technology Research and Development Program of China (863 Program) (No. 2015AA016003).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Beijing Key Laboratory of Security and Privacy in Intelligent Transportation, Beijing Jiaotong University, Beijing, China
  2. Tsinghua University, Beijing, China
