Unsupervised Explainable Controversy Detection from Online News

  • Youngwoo Kim
  • James Allan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11437)


Alerting users that a web page is controversial has been proposed as one method to support critical thinking about text and discourse. We propose an approach to discover controversial topics in a generic document using unsupervised training. Our approach comprises iterative training of a controversy classifier using a disagreement signal within comments and explaining the controversy of the document by generating a topic phrase describing it. Experiments show the effectiveness of our proposed training method using an EM algorithm. When controversial topic extraction is restricted to quality phrases and incorporates TextRank signals, it outperforms several baseline approaches.
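The abstract names TextRank as one signal for topic phrase extraction but does not say how it is applied. As a rough illustration only, the sketch below implements plain, unweighted TextRank word scoring over a co-occurrence graph. The function name `textrank_scores`, the whitespace tokenization, the window of 2, the damping factor of 0.85, and the example sentence are all assumptions made for this sketch, not the authors' implementation.

```python
from collections import defaultdict


def textrank_scores(tokens, window=2, damping=0.85, iterations=50):
    """Score words with unweighted TextRank over a co-occurrence graph.

    Hypothetical helper for illustration; parameters are assumed defaults.
    """
    # Undirected graph: two distinct words are linked if they occur
    # within `window` tokens of each other (window=2 means adjacent).
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Start every word at 1.0 and apply the PageRank-style update.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        updated = {}
        for word in neighbors:
            # Each neighbor splits its score equally among its own edges.
            incoming = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            updated[word] = (1.0 - damping) + damping * incoming
        scores = updated
    return scores


# Toy input (assumed example text, not taken from the paper's data).
tokens = "gun control debate divides voters as gun control bill stalls".split()
scores = textrank_scores(tokens)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, `control` has the most distinct neighbors and ranks first, while `stalls`, which has a single neighbor, ranks last. A phrase extractor could combine such word scores over candidate phrases, but how the paper does this is not stated here.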


Controversy · Topic extraction · Controversy detection



This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant #IIS-1813662. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor. We thank Kaspar Beelen for sharing the labeled data.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Center for Intelligent Information Retrieval, College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, USA
