A Knowledge-Based Weighted KNN for Detecting Irony in Twitter

  • Conference paper
  • In: Advances in Computational Intelligence (MICAI 2018)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11289)

Abstract

In this work, we propose a variant of a well-known instance-based algorithm: WKNN. Our idea is to exploit task-dependent features to calculate the weight of the instances according to a novel paradigm, the Textual Attraction Force, which quantifies the degree of relatedness between documents. The proposed method was applied to a challenging text classification task: irony detection. We experimented with corpora used in the state of the art. The results show that, despite being a simple approach, our method is competitive with more advanced techniques.
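
As a rough illustration of the weighted voting step that a WKNN classifier relies on, the following Python sketch sums the weights of the retrieved neighbours per class, using the indicator described in Note 1 (1 if the neighbour's class equals the candidate class, 0 otherwise). It is only a sketch under assumptions: the function name `weighted_vote`, the neighbour format, and the example weights are hypothetical. The weights would come from whatever instance-weighting scheme is in use; the Textual Attraction Force itself is not reproduced here.

```python
from typing import Dict, Hashable, Iterable, Sequence, Tuple


def weighted_vote(
    neighbours: Iterable[Tuple[Hashable, float]],
    classes: Sequence[Hashable],
) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Pick the class with the largest sum of neighbour weights.

    `neighbours` holds (class_label, weight) pairs for the k retrieved
    instances. How the weights are computed is left open (assumption).
    """
    neighbours = list(neighbours)
    scores = {
        # Indicator: a neighbour contributes its weight only to its own class.
        c: sum(w * (1 if c == label else 0) for label, w in neighbours)
        for c in classes
    }
    # Ties are resolved in favour of the class listed first in `classes`.
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical weights for four neighbours of one tweet.
example = [("Irony", 0.9), ("Nonirony", 0.4), ("Nonirony", 0.3), ("Irony", 0.2)]
predicted, per_class = weighted_vote(example, ["Irony", "Nonirony"])
```

With uniform weights this reduces to a plain majority vote among the k neighbours.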

Notes

  1. Where \(\partial (c, f(knn_i))\) returns 1 if \(c=f(knn_i)\) and 0 otherwise.

  2. This example is part of the results obtained when the aforementioned method was applied with k = 7. All the tweets were extracted from the dataset developed by [22].

  3. This tweet was labeled with the class “Irony”.

  4. The real classes of these tweets are denoted as follows: “*” denotes the class “Irony”, while “**” denotes the class “Nonirony”.

  5. Hashtags, mentions, emoji, and URLs were not considered in the bag-of-words model.

  6. We consider five different punctuation marks: “.”, “,”, “:”, “!”, and “?”.

  7. We used a list of terms defined in https://en.wiktionary.org/wiki/Appendix:English_internet_slang.

  8. We used the embeddings pre-trained on the Google News corpus: https://code.google.com/archive/p/word2vec/.

  9. Satire is strongly related to verbal irony; providing a detailed definition of this concept is beyond the scope of this work.

  10. In computational linguistics, irony is often considered an umbrella term that also covers sarcasm.

  11. We performed three different binary classifications by combining each of the nonironic classes with the ironic one. From now on, these experiments will be referred to as TwReyes2013-Edu, TwReyes2013-Hum, and TwReyes2013-Pol.

  12. https://competitions.codalab.org/competitions/17468.

  13. Where massFunction can be any of the functions defined in Sect. 3.1.

  14. The default configuration of parameters in the classifiers was applied.

  15. The authors reported the performance of their model in terms of F-measure considering both classes together. To compare our results, we carried out experiments with the aforementioned model, but instead of an overall performance we report only the performance on the ironic class.

  16. For further details on the shared task see [22].
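
Notes 5 and 6 describe two preprocessing choices: the bag-of-words model ignores hashtags, mentions, emoji, and URLs, and five punctuation marks are taken into account. A minimal Python sketch of how such features could be extracted is given below. The regular expressions, the function names, the ASCII-only word pattern, and the decision to strip URLs before counting punctuation are all assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import Counter

# The five punctuation marks listed in Note 6.
PUNCTUATION = (".", ",", ":", "!", "?")

# Illustrative patterns (assumptions): URLs, then hashtags and mentions.
URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"[#@]\w+")
# ASCII letters and apostrophes only, so emoji and digits are dropped.
WORD_RE = re.compile(r"[a-z']+")


def strip_urls(tweet: str) -> str:
    """Replace every URL with a space."""
    return URL_RE.sub(" ", tweet)


def punctuation_counts(tweet: str) -> dict:
    """Count each of the five marks, ignoring characters inside URLs."""
    text = strip_urls(tweet)
    return {mark: text.count(mark) for mark in PUNCTUATION}


def bag_of_words(tweet: str) -> Counter:
    """Lowercased word counts without URLs, hashtags, mentions, or emoji."""
    text = TAG_RE.sub(" ", strip_urls(tweet))
    return Counter(WORD_RE.findall(text.lower()))


# Made-up example tweet, not taken from any of the corpora.
tweet = "Great, another Monday!!! #blessed @boss https://t.co/xyz :)"
marks = punctuation_counts(tweet)
words = bag_of_words(tweet)
```

In this sketch the colon of an emoticon such as ":)" still counts as a punctuation mark; whether that is desirable depends on how emoticons are handled elsewhere in the pipeline.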

References

  1. Barbieri, F., Basile, V., Croce, D., Nissim, M., Novielli, N., Patti, V.: Overview of the Evalita 2016 sentiment polarity classification task. In: Proceedings of Third Italian Conference on Computational Linguistics, vol. 1749. CEUR-WS.org (2016)

  2. Basile, V., Bolioli, A., Nissim, M., Patti, V., Rosso, P.: Overview of the Evalita 2014 sentiment polarity classification task. In: Proceedings of the First Italian Conference on Computational Linguistics, pp. 50–57 (2014)

  3. Brysbaert, M., Warriner, A.B., Kuperman, V.: Concreteness ratings for 40 thousand generally known English word lemmas. Behav. Res. Methods 46(3), 904–911 (2014)

  4. Cambria, E., Hussain, A.: Sentic Computing, vol. 1. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23654-4

  5. Cambria, E., Olsher, D., Rajagopal, D.: SenticNet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis. In: Proceedings of AAAI Conference on Artificial Intelligence, pp. 1515–1521 (2014)

  6. Dudani, S.A.: The distance-weighted k-nearest-neighbor rule. IEEE Trans. Syst., Man, Cybern. SMC 6(4), 325–327 (1976)

  7. Ghosh, A., et al.: SemEval-2015 task 11: sentiment analysis of figurative language in Twitter. In: Proceedings of the 9th International Workshop on Semantic Evaluation, pp. 470–478 (2015)

  8. Giora, R., Fein, O.: Irony: context and salience. Metaphor. Symb. 14(4), 241–257 (1999)

  9. Gou, J., Du, L., Zhang, Y., Xiong, T.: A new distance-weighted k-nearest neighbor classifier. J. Inform. Comp. Sci. 9(6), 1429–1436 (2012)

  10. Grice, H.P.: Logic and conversation. In: Cole, P., Morgan, J.L. (eds.) Syntax and Semantics: Volume 3: Speech Acts, pp. 41–58. Academic Press, San Diego (1975)

  11. Hernández Farías, D.I., Patti, V., Rosso, P.: Irony detection in Twitter: the role of affective content. ACM Trans. Internet Technol. 16(3), 19:1–19:24 (2016)

  12. Hernández Farías, D.I., Rosso, P.: Irony, sarcasm, and sentiment analysis. In: Pozzi, F.A., Fersini, E., Messina, E., Liu, B. (eds.) Sentiment Analysis in Social Networks, chap. 7, pp. 113–127. Morgan Kaufmann (2016)

  13. Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the 10th SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 168–177 (2004)

  14. Joshi, A., Bhattacharyya, P., Carman, M.J.: Automatic sarcasm detection: a survey. ACM Comput. Surv. 50(5), 73:1–73:22 (2017)

  15. Mitchell, T.M.: Machine learning and data mining. Commun. ACM 42(11), 30–36 (1999)

  16. Mohammad, S.M., Turney, P.D.: Crowdsourcing a word-emotion association lexicon. Comput. Intell. 29(3), 436–465 (2013)

  17. Mohammad, S.M., Zhu, X., Kiritchenko, S., Martin, J.: Sentiment, emotion, purpose, and style in electoral tweets. Inf. Process. Manag. 51(4), 480–499 (2015)

  18. Plutchik, R.: The nature of emotions. Am. Sci. 89(4), 344–350 (2001)

  19. Reyes, A., Rosso, P., Veale, T.: A multidimensional approach for detecting irony in Twitter. Lang. Resour. Eval. 47(1), 239–268 (2013)

  20. Riloff, E., Qadir, A., Surve, P., Silva, L.D., Gilbert, N., Huang, R.: Sarcasm as contrast between a positive sentiment and negative situation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 704–714. ACL (2013)

  21. Skalicky, S., Crossley, S.: A statistical analysis of satirical Amazon.com product reviews. Eur. J. Humour Res. 2, 66–85 (2015)

  22. Van Hee, C., Lefever, E., Hoste, V.: SemEval-2018 task 3: irony detection in English tweets. In: Proceedings of the 12th International Workshop on Semantic Evaluation, SemEval-2018. ACL (2018)

Acknowledgments

This research was funded by CONACYT project FC 2016-2410. The work of P. Rosso has been funded by the SomEMBED TIN2015-71147-C2-1-P MINECO research project. The work of V. Patti was partially funded by Progetto di Ateneo/CSP 2016 (IhatePrejudice, S1618_L2_BOSC_01).

Corresponding author

Correspondence to Delia Irazú Hernández Farías.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Hernández Farías, D.I., Montes-y-Gómez, M., Escalante, H.J., Rosso, P., Patti, V. (2018). A Knowledge-Based Weighted KNN for Detecting Irony in Twitter. In: Batyrshin, I., Martínez-Villaseñor, M., Ponce Espinosa, H. (eds) Advances in Computational Intelligence. MICAI 2018. Lecture Notes in Computer Science, vol 11289. Springer, Cham. https://doi.org/10.1007/978-3-030-04497-8_16

  • DOI: https://doi.org/10.1007/978-3-030-04497-8_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04496-1

  • Online ISBN: 978-3-030-04497-8

  • eBook Packages: Computer Science, Computer Science (R0)
