Tensor Embeddings for Content-Based Misinformation Detection with Limited Supervision

  • Sara Abdali
  • Gisel G. Bastidas
  • Neil Shah
  • Evangelos E. Papalexakis
Chapter
Part of the Lecture Notes in Social Networks book series (LNSN)

Abstract

Web-based technologies like social media have become primary news outlets for many people in recent years. Because these digital outlets are extremely vulnerable to misinformation and fake news, which may sway a user’s opinion on social, political, and economic issues, robust and efficient approaches to misinformation detection are needed more than ever. Most previously proposed approaches rely on manually extracted features and supervised classifiers, which require a large amount of labeled data that is often infeasible to collect in practice. To meet this challenge, we propose a novel strategy that combines tensor-based modeling of article content with semi-supervised learning on article embeddings, and that requires very few labels to achieve state-of-the-art results. We propose and experiment with three article content modeling variations, which target the article body text or title and yield meaningful representations of word co-occurrences that are discriminative in the downstream news categorization task. We tested our approach on real-world data. The evaluation shows that we achieve 75% accuracy using only 30% of the labeled data of a public dataset, whereas a previously published SVM-based classifier achieves 67% accuracy. Moreover, our approach achieves 71% accuracy on a large dataset using only 2% of the labels. Additionally, using only the article titles and 30% of the labeled data, our approach classifies articles into different fake news categories (clickbait, bias, rumor, hate, and junk science) with roughly 70% accuracy.
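The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, and everything specific in it is an assumption made for illustration: the toy articles, the co-occurrence window, the rank, a plain alternating-least-squares CP decomposition, and a simple clamped label propagation over a cosine k-nearest-neighbor graph standing in for the semi-supervised step. All function names are hypothetical.

```python
import numpy as np

def cooccurrence_tensor(articles, window=2):
    """Build a (word, word, article) tensor of windowed co-occurrence counts."""
    vocab = sorted({w for doc in articles for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(vocab), len(articles)))
    for k, doc in enumerate(articles):
        for i, w in enumerate(doc):
            for v in doc[max(0, i - window): i + window + 1]:
                if v != w:  # skip the word itself and identical neighbors
                    X[index[w], index[v], k] += 1
    return X, vocab

def khatri_rao(A, B):
    """Column-wise Kronecker product, shape (A.rows * B.rows, rank)."""
    return np.einsum("ir,jr->ijr", A, B).reshape(-1, A.shape[1])

def cp_als(X, rank, n_iter=50, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor via alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            a, b = [factors[m] for m in range(3) if m != mode]
            # Mode-wise unfolding: rows index this mode, columns the other two.
            unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
            gram = (a.T @ a) * (b.T @ b)
            factors[mode] = unfolded @ khatri_rao(a, b) @ np.linalg.pinv(gram)
    return factors

def propagate_labels(emb, labels, k=2, n_iter=20):
    """Spread +1 / -1 labels (0 = unknown) over a cosine k-NN graph."""
    unit = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)          # an article is not its own neighbor
    nearest = np.argsort(-sim, axis=1)[:, :k]
    W = np.zeros_like(sim)
    W[np.repeat(np.arange(len(emb)), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)                  # make the graph undirected
    known = labels != 0
    scores = labels.astype(float)
    for _ in range(n_iter):
        scores = W @ scores / W.sum(axis=1)  # average over neighbors
        scores[known] = labels[known]        # clamp the labeled articles
    return np.where(scores >= 0, 1, -1)

# Toy corpus (made up for illustration): three sensational, three sober titles.
articles = [
    "shocking secret cure doctors hate".split(),
    "shocking secret trick doctors hate".split(),
    "secret cure shocking doctors hate trick".split(),
    "senate passes budget bill after debate".split(),
    "senate debate budget bill passes vote".split(),
    "budget bill vote senate passes debate".split(),
]
X, vocab = cooccurrence_tensor(articles)
embeddings = cp_als(X, rank=2)[2]            # article-mode factor = article embeddings
labels = np.array([1, 0, 0, -1, 0, 0])       # only two of six articles are labeled
predicted = propagate_labels(embeddings, labels)
print(predicted)
```

Each row of `embeddings` represents one article, so articles with similar word co-occurrence patterns end up close together and the two known labels can spread to their unlabeled neighbors. In practice a maintained tensor library would likely replace the hand-written decomposition.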

Keywords

  • Misinformation
  • Fake news detection
  • Tensor decomposition
  • Semi-supervised learning

Notes

Acknowledgements

This research was supported by a gift from Snap Inc., an Adobe Data Science Faculty Award, the Department of the Navy, Naval Engineering Education Consortium under award no. N00174-17-1-0005, and the National Science Foundation CDS&E Grant no. OAC-1808591. Any opinions, findings, and conclusions or recommendations expressed here are those of the author(s) and do not necessarily reflect the views of the funding parties. We would also like to thank Daniel Fonseca for proofreading the book chapter.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Sara Abdali (1)
  • Gisel G. Bastidas (1)
  • Neil Shah (2)
  • Evangelos E. Papalexakis (1)
  1. University of California, Riverside, USA
  2. Snap Inc., Los Angeles, USA
