Tensor Embeddings for Content-Based Misinformation Detection with Limited Supervision

Abdali, Sara; Bastidas, Gisel G.; Shah, Neil; Papalexakis, Evangelos E.

doi:10.1007/978-3-030-42699-6_7

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

Web-based technologies such as social media have become primary news outlets for many people in recent years. Because these digital outlets are extremely vulnerable to misinformation and fake news, which may sway a user’s opinion on social, political, and economic issues, robust and efficient approaches to misinformation detection are needed more than ever. Most previously proposed misinformation detection approaches rely on manually extracted features and supervised classifiers, which require a large amount of labeled data that is often infeasible to collect in practice. To meet this challenge, we propose a novel strategy that combines tensor-based modeling of article content with semi-supervised learning on article embeddings, and that requires very few labels to achieve state-of-the-art results. We propose and experiment with three article content modeling variations, which target the article body text or title and yield meaningful representations of word co-occurrences that are discriminative in the downstream news categorization task. We tested our approach on real-world data. The evaluation shows that we achieve 75% accuracy using only 30% of the labeled data of a public dataset, whereas a previously published SVM-based classifier reaches 67% accuracy. Moreover, our approach achieves 71% accuracy on a large dataset using only 2% of the labels. Additionally, our approach can classify articles into different fake news categories (clickbait, bias, rumor, hate, and junk science) using only the article titles, with roughly 70% accuracy and 30% of the labeled data.
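The pipeline outlined in the abstract (a word co-occurrence tensor, article embeddings from a tensor decomposition, then semi-supervised labeling) can be illustrated with a minimal NumPy sketch. The abstract does not specify the decomposition or the propagation algorithm, so the choices here are assumptions made for illustration only: a CP decomposition fitted by alternating least squares, and a simple k-nearest-neighbor label propagation. All function names, the window size, the rank, and the toy articles are hypothetical.

```python
import numpy as np

def cooccurrence_tensor(articles, window=2):
    """Build a dense (word, word, article) tensor of windowed co-occurrence counts."""
    vocab = {w: i for i, w in enumerate(sorted({w for a in articles for w in a}))}
    X = np.zeros((len(vocab), len(vocab), len(articles)))
    for k, words in enumerate(articles):
        ids = [vocab[w] for w in words]
        for p, i in enumerate(ids):
            for j in ids[p + 1 : p + 1 + window]:
                X[i, j, k] += 1
                X[j, i, k] += 1  # keep each frontal slice symmetric
    return X

def khatri_rao(U, V):
    """Column-wise Kronecker product; U's row index varies slowest."""
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, U.shape[1])

def cp_als(X, rank, n_iter=50, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            u, v = [factors[m] for m in range(3) if m != mode]
            unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
            gram = (u.T @ u) * (v.T @ v)
            factors[mode] = unfolded @ khatri_rao(u, v) @ np.linalg.pinv(gram)
    return factors

def propagate_labels(emb, labels, k=1, n_iter=20):
    """Spread +1/-1 labels (0 = unknown) over a symmetrized k-NN cosine graph."""
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    sim = emb @ emb.T
    np.fill_diagonal(sim, -np.inf)  # an article is not its own neighbor
    n = len(emb)
    W = np.zeros((n, n))
    nbrs = np.argsort(-sim, axis=1)[:, :k]
    for i in range(n):
        W[i, nbrs[i]] = 1.0
    W = np.maximum(W, W.T)
    known = labels != 0
    scores = labels.astype(float)
    for _ in range(n_iter):
        scores = W @ scores / W.sum(axis=1)
        scores[known] = labels[known]  # clamp the labeled articles
    return np.sign(scores)

# Toy usage: four tokenized titles, one label per class (hypothetical data).
articles = [
    "shocking secret trick doctors hate".split(),
    "shocking secret trick doctors love".split(),
    "senate passes budget bill today".split(),
    "senate passes budget bill again".split(),
]
X = cooccurrence_tensor(articles, window=2)
_, _, article_emb = cp_als(X, rank=2)  # third-mode factor: one row per article
predicted = propagate_labels(article_emb, np.array([1, 0, -1, 0]), k=1)
print(predicted)
```

The third-mode factor matrix plays the role of the article embedding: each row summarizes how strongly an article expresses each latent co-occurrence pattern. A dense tensor is used only to keep the sketch short; a realistic vocabulary would call for a sparse representation.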

The authors contributed equally to this paper.

Notes

1. We experimented with small values of that window and results were qualitatively similar.
2. http://boilerpipe-web.appspot.com/
3. http://newspaper.readthedocs.io/en/latest/
4. https://www.diffbot.com/dev/docs/article/
5. https://www.alexa.com/

Acknowledgements

This research was supported by a gift from Snap Inc, an Adobe Data Science Faculty Award, by the Department of the Navy, Naval Engineering Education Consortium under award no. N00174-17-1-0005, and by the National Science Foundation CDS&E Grant no. OAC-1808591. Any opinions, findings, and conclusions or recommendations expressed here are those of the author(s) and do not necessarily reflect the views of the funding parties. We would also like to thank Daniel Fonseca for proofreading of the book chapter.

Author information

Authors and Affiliations

University of California, Riverside, CA, USA
Sara Abdali, Gisel G. Bastidas & Evangelos E. Papalexakis
Snap Inc., Los Angeles, CA, USA
Neil Shah

Corresponding author

Correspondence to Sara Abdali.

Editor information

Editors and Affiliations

Computer Science & Engineering, Arizona State University, Tempe, AZ, USA
Kai Shu
College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA, USA
Suhang Wang
College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA, USA
Dongwon Lee
Computer Science & Engineering, Arizona State University, Tempe, AZ, USA
Huan Liu

About this chapter

Cite this chapter

Abdali, S., Bastidas, G.G., Shah, N., Papalexakis, E.E. (2020). Tensor Embeddings for Content-Based Misinformation Detection with Limited Supervision. In: Shu, K., Wang, S., Lee, D., Liu, H. (eds) Disinformation, Misinformation, and Fake News in Social Media. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-42699-6_7

DOI: https://doi.org/10.1007/978-3-030-42699-6_7
Published: 18 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-42698-9
Online ISBN: 978-3-030-42699-6
eBook Packages: Computer Science, Computer Science (R0)
