Detection of Hate Speech and Offensive Language in Twitter Data Using LSTM Model

  • Akanksha Bisht
  • Annapurna Singh
  • H. S. Bhadauria
  • Jitendra Virmani
  • Kriti
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1124)

Abstract

The internet continues to see rapid growth in users, and with it a rise in toxic online content posted by people of different backgrounds. As deep learning has matured, much research has turned to deep neural networks across many disciplines. For natural language processing (NLP) tasks too, deep networks, specifically recurrent neural networks (RNNs) and their variants, are increasingly preferred over traditional shallow networks. This paper addresses the problem of hate speech on social media. We propose an LSTM-based classification system that differentiates between hate speech and offensive language. The system employs word embeddings with LSTM and Bi-LSTM neural networks to identify hate speech on Twitter. The best-performing LSTM classifier achieved an accuracy of 86%, using an early stopping criterion based on the loss function during training.
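The early stopping criterion mentioned in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not say which loss is monitored or how long training waits for an improvement, so the function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the sample loss values are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def early_stopping_epoch(
    losses: Iterable[float],
    patience: int = 2,       # hypothetical: epochs to wait without improvement
    min_delta: float = 0.0,  # hypothetical: smallest decrease counted as improvement
) -> Tuple[int, int, float]:
    """Scan per-epoch loss values and report where training would stop.

    Returns (stop_epoch, best_epoch, best_loss), with epochs numbered from 0.
    If the loss keeps improving, stop_epoch is simply the last epoch seen.
    """
    best_loss = float("inf")
    best_epoch = -1
    stop_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(losses):
        stop_epoch = epoch
        if loss < best_loss - min_delta:
            # The monitored loss improved: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            # No improvement: stop once the patience budget is used up.
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break

    return stop_epoch, best_epoch, best_loss


# Made-up loss values, for illustration only.
example_losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.5]
stop, best, loss = early_stopping_epoch(example_losses, patience=2)
print(f"stopped at epoch {stop}; best epoch {best} with loss {loss}")
```

In practice the weights from the best epoch are usually restored after stopping. Deep learning frameworks provide this behaviour as a ready-made training callback.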

Keywords

Sentiment analysis; NLP; Deep learning; Hate speech; Offensive language; Bi-LSTM; LSTM; Twitter

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Akanksha Bisht (1)
  • Annapurna Singh (1)
  • H. S. Bhadauria (1)
  • Jitendra Virmani (2), corresponding author
  • Kriti (3)
  1. G B Pant Institute of Engineering and Technology, Pauri Garhwal, India
  2. CSIR-Central Scientific Instruments Organization, Chandigarh, India
  3. Thapar Institute of Engineering and Technology, Patiala, India
