Abstract
Web-based technologies such as social media have become primary news outlets for many people in recent years. Because these digital outlets are extremely vulnerable to misinformation and fake news, which may sway a user’s opinion on social, political, and economic issues, robust and efficient approaches to misinformation detection are needed more than ever. Most previously proposed misinformation detection approaches rely on manually extracted features and supervised classifiers, which require large amounts of labeled data that are often infeasible to collect in practice. To meet this challenge, we propose a novel strategy that combines tensor-based modeling of article content with semi-supervised learning on article embeddings, and that requires very few labels to achieve state-of-the-art results. We propose and experiment with three article content modeling variations, which target either the article body text or the title and yield meaningful representations of word co-occurrences that are discriminative in the downstream news categorization task. We tested our approach on real-world data. The evaluation shows that we achieve 75% accuracy using only 30% of the labeled data of a public dataset, whereas the previously published SVM-based classifier achieves 67% accuracy. Moreover, our approach achieves 71% accuracy on a large dataset using only 2% of the labels. Additionally, using only the titles of the articles and 30% of the labeled data, our approach classifies articles into different fake news categories (clickbait, bias, rumor, hate, and junk science) with roughly 70% accuracy.
The authors contributed equally to this paper.
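The pipeline outlined in the abstract can be illustrated with a minimal sketch. It reflects one plausible reading of the abstract, not the chapter's implementation, and it rests on several assumptions: a (term x term x article) tensor of windowed co-occurrence counts, a plain CP/PARAFAC decomposition computed by alternating least squares, a k-nearest-neighbor graph over the article factor, and a simple neighbor-averaging label propagation standing in for whatever semi-supervised step the chapter actually uses. All function names, parameters, and the toy headlines are hypothetical.

```python
import numpy as np

def cooccurrence_tensor(docs, window=2):
    """Build a (term x term x article) tensor of windowed co-occurrence counts."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(vocab), len(docs)))
    for k, d in enumerate(docs):
        toks = d.split()
        for i, w in enumerate(toks):
            for v in toks[i + 1 : i + 1 + window]:
                X[index[w], index[v], k] += 1
                X[index[v], index[w], k] += 1  # keep each slice symmetric
    return X, vocab

def khatri_rao(A, B):
    """Column-wise Kronecker product; row i * B.shape[0] + j equals A[i] * B[j]."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

def unfold(X, mode):
    """Mode-n matricization, ordered to match khatri_rao above."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=50, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor via alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((n, rank)) for n in X.shape]
    for _ in range(n_iter):
        for m in range(3):
            a, b = [factors[i] for i in range(3) if i != m]
            gram = (a.T @ a) * (b.T @ b)
            factors[m] = unfold(X, m) @ khatri_rao(a, b) @ np.linalg.pinv(gram)
    return factors

def knn_graph(E, k):
    """Symmetric k-nearest-neighbor adjacency matrix over embedding rows."""
    d = np.linalg.norm(E[:, None, :] - E[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a node is never its own neighbor
    A = np.zeros_like(d)
    for i, nbrs in enumerate(np.argsort(d, axis=1)[:, :k]):
        A[i, nbrs] = 1
    return np.maximum(A, A.T)

def propagate(A, labels, n_iter=100):
    """Neighbor-averaging propagation; labels are +1 / -1 if known, 0 if unknown."""
    known = labels != 0
    beliefs = labels.astype(float)
    degree = A.sum(axis=1)
    for _ in range(n_iter):
        beliefs = A @ beliefs / degree
        beliefs[known] = labels[known]  # clamp the labeled articles
    return np.sign(beliefs)

if __name__ == "__main__":
    # Toy headlines, invented for illustration only.
    docs = [
        "shocking secret cure doctors hate",
        "shocking secret trick doctors hate",
        "senate passes budget bill today",
        "senate passes tax bill today",
    ]
    X, vocab = cooccurrence_tensor(docs, window=2)
    article_embeddings = cp_als(X, rank=2)[2]  # third factor: one row per article
    graph = knn_graph(article_embeddings, k=1)
    print(propagate(graph, np.array([1, 0, -1, 0])))
```

Only two of the four toy articles carry a label here; the other two receive a label through the graph. On real data, the rank, the window size, and the number of neighbors would all need tuning, and a dedicated tensor library would be preferable to this hand-rolled decomposition.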
Notes
- 1. We experimented with small values of that window and results were qualitatively similar.
Acknowledgements
This research was supported by a gift from Snap Inc., an Adobe Data Science Faculty Award, the Department of the Navy, Naval Engineering Education Consortium under award no. N00174-17-1-0005, and the National Science Foundation CDS&E Grant no. OAC-1808591. Any opinions, findings, and conclusions or recommendations expressed here are those of the author(s) and do not necessarily reflect the views of the funding parties. We would also like to thank Daniel Fonseca for proofreading this book chapter.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Abdali, S., Bastidas, G.G., Shah, N., Papalexakis, E.E. (2020). Tensor Embeddings for Content-Based Misinformation Detection with Limited Supervision. In: Shu, K., Wang, S., Lee, D., Liu, H. (eds) Disinformation, Misinformation, and Fake News in Social Media. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-42699-6_7
DOI: https://doi.org/10.1007/978-3-030-42699-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-42698-9
Online ISBN: 978-3-030-42699-6
eBook Packages: Computer Science, Computer Science (R0)