Abstract
In this work, we propose a variant of a well-known instance-based algorithm: weighted k-nearest neighbors (WKNN). Our idea is to exploit task-dependent features to calculate the weight of the instances according to a novel paradigm, the Textual Attraction Force, which quantifies the degree of relatedness between documents. The proposed method was applied to a challenging text classification task: irony detection. We experimented with corpora used in the state of the art. The obtained results show that, despite being a simple approach, our method is competitive with more advanced techniques.
Notes
- 1.
Where \(\partial (c, f(knn_i))\) returns 1 if \(c=f(knn_i)\) and 0 otherwise.
- 2.
This example is part of the results obtained when the aforementioned method was applied with k = 7. All the tweets were extracted from the dataset developed by [22].
- 3.
This tweet was labeled with the class “Irony”.
- 4.
The real classes of these tweets are denoted as follows: “*” denotes the class “Irony”, while “**” denotes the class “Nonirony”.
- 5.
Hashtags, mentions, emoji, and URLs were not considered in the bag-of-words model.
- 6.
We consider five different punctuation marks: “.”, “,”, “:”, “!”, and “?”.
- 7.
We used a list of terms defined in https://en.wiktionary.org/wiki/Appendix:English_internet_slang.
- 8.
We used the embeddings pre-trained on the Google News corpus.
- 9.
Satire is strongly related to verbal irony; providing a detailed definition of this concept is beyond the scope of this work.
- 10.
In computational linguistics, irony is often considered an umbrella term that also covers sarcasm.
- 11.
We performed three different binary classifications by combining each of the nonironic classes with the ironic one. From now on, these experiments will be referred to as TwReyes2013-Edu, TwReyes2013-Hum, and TwReyes2013-Pol.
- 12.
- 13.
Where massFunction can be any of the functions defined in Sect. 3.1.
- 14.
The default configuration of parameters in the classifiers was applied.
- 15.
The authors reported the performance of their model in terms of F-measure considering both classes together. To make our results comparable, we carried out experiments with the aforementioned model but, instead of the overall performance, we report only the performance on the ironic class.
- 16.
For further details on the shared task see [22].
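As an illustration of the voting rule that note 1 refers to, the following is a minimal sketch of a generic weighted k-nearest-neighbor vote, in which each neighbor adds its weight to the score of its own class through the indicator function \(\partial\). The function names (`indicator`, `weighted_knn_vote`), the tie-breaking rule, and the example weights are assumptions made for illustration only. The weights are treated as given inputs, so the Textual Attraction Force itself is not implemented here.

```python
from typing import Hashable, Sequence


def indicator(c: Hashable, label: Hashable) -> int:
    """Indicator from note 1: 1 if c equals the neighbor's label, 0 otherwise."""
    return 1 if c == label else 0


def weighted_knn_vote(
    neighbor_labels: Sequence[Hashable],
    neighbor_weights: Sequence[float],
    classes: Sequence[Hashable],
) -> Hashable:
    """Return the class whose neighbors carry the largest total weight.

    neighbor_weights is a placeholder for any per-neighbor score
    (assumption: non-negative values; not the paper's weighting scheme).
    """
    if len(neighbor_labels) != len(neighbor_weights):
        raise ValueError("one weight per neighbor is required")
    scores = {}
    for c in classes:
        # Sum the weights of the neighbors whose label matches class c.
        scores[c] = sum(
            w * indicator(c, label)
            for label, w in zip(neighbor_labels, neighbor_weights)
        )
    # Assumption: ties go to the class listed first in `classes`.
    return max(scores, key=scores.get)


# Hypothetical neighborhood with k = 7; labels and weights are made up.
labels = ["Irony", "Nonirony", "Nonirony", "Irony", "Nonirony", "Irony", "Nonirony"]
weights = [0.9, 0.2, 0.1, 0.8, 0.3, 0.7, 0.2]
print(weighted_knn_vote(labels, weights, ["Irony", "Nonirony"]))  # Irony
```

With uniform weights the same neighborhood would be labeled “Nonirony” by simple majority (four neighbors against three), which shows how instance weights can change the decision.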
References
Barbieri, F., Basile, V., Croce, D., Nissim, M., Novielli, N., Patti, V.: Overview of the Evalita 2016 sentiment polarity classification task. In: Proceedings of the Third Italian Conference on Computational Linguistics, vol. 1749. CEUR-WS.org (2016)
Basile, V., Bolioli, A., Nissim, M., Patti, V., Rosso, P.: Overview of the Evalita 2014 sentiment polarity classification task. In: Proceedings of the First Italian Conference on Computational Linguistics, pp. 50–57 (2014)
Brysbaert, M., Warriner, A.B., Kuperman, V.: Concreteness ratings for 40 thousand generally known English word lemmas. Behav. Res. Methods 46(3), 904–911 (2014)
Cambria, E., Hussain, A.: Sentic Computing, vol. 1. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23654-4
Cambria, E., Olsher, D., Rajagopal, D.: SenticNet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis. In: Proceedings of AAAI Conference on Artificial Intelligence, pp. 1515–1521 (2014)
Dudani, S.A.: The distance-weighted k-nearest-neighbor rule. IEEE Trans. Syst. Man Cybern. SMC-6(4), 325–327 (1976)
Ghosh, A., et al.: SemEval-2015 task 11: sentiment analysis of figurative language in Twitter. In: Proceedings of the 9th International Workshop on Semantic Evaluation, pp. 470–478 (2015)
Giora, R., Fein, O.: Irony: context and salience. Metaphor. Symb. 14(4), 241–257 (1999)
Gou, J., Du, L., Zhang, Y., Xiong, T.: A new distance-weighted k-nearest neighbor classifier. J. Inform. Comp. Sci. 9(6), 1429–1436 (2012)
Grice, H.P.: Logic and conversation. In: Cole, P., Morgan, J.L. (eds.) Syntax and Semantics: Volume 3: Speech Acts, pp. 41–58. Academic Press, San Diego (1975)
Hernández Farías, D.I., Patti, V., Rosso, P.: Irony detection in Twitter: the role of affective content. ACM Trans. Internet Technol. 16(3), 19:1–19:24 (2016)
Hernández Farías, D.I., Rosso, P.: Irony, sarcasm, and sentiment analysis. In: Pozzi, F.A., Fersini, E., Messina, E., Liu, B. (eds.) Sentiment Analysis in Social Networks, chap. 7, pp. 113–127. Morgan Kaufmann (2016)
Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the 10th SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 168–177 (2004)
Joshi, A., Bhattacharyya, P., Carman, M.J.: Automatic sarcasm detection: a survey. ACM Comput. Surv. 50(5), 73:1–73:22 (2017)
Mitchell, T.M.: Machine learning and data mining. Commun. ACM 42(11), 30–36 (1999)
Mohammad, S.M., Turney, P.D.: Crowdsourcing a word-emotion association lexicon. Comput. Intell. 29(3), 436–465 (2013)
Mohammad, S.M., Zhu, X., Kiritchenko, S., Martin, J.: Sentiment, emotion, purpose, and style in electoral tweets. Inf. Process. Manag. 51(4), 480–499 (2015)
Plutchik, R.: The nature of emotions. Am. Sci. 89(4), 344–350 (2001)
Reyes, A., Rosso, P., Veale, T.: A multidimensional approach for detecting irony in Twitter. Lang. Resour. Eval. 47(1), 239–268 (2013)
Riloff, E., Qadir, A., Surve, P., Silva, L.D., Gilbert, N., Huang, R.: Sarcasm as contrast between a positive sentiment and negative situation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 704–714. ACL (2013)
Skalicky, S., Crossley, S.: A statistical analysis of satirical Amazon.com product reviews. Eur. J. Humour Res. 2, 66–85 (2015)
Van Hee, C., Lefever, E., Hoste, V.: SemEval-2018 task 3: irony detection in English tweets. In: Proceedings of the 12th International Workshop on Semantic Evaluation, SemEval-2018. ACL (2018)
Acknowledgments
This research was funded by CONACYT project FC 2016-2410. The work of P. Rosso has been funded by the SomEMBED TIN2015-71147-C2-1-P MINECO research project. The work of V. Patti was partially funded by Progetto di Ateneo/CSP 2016 (IhatePrejudice, S1618_L2_BOSC_01).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Hernández Farías, D.I., Montes-y-Gómez, M., Escalante, H.J., Rosso, P., Patti, V. (2018). A Knowledge-Based Weighted KNN for Detecting Irony in Twitter. In: Batyrshin, I., Martínez-Villaseñor, M., Ponce Espinosa, H. (eds) Advances in Computational Intelligence. MICAI 2018. Lecture Notes in Computer Science, vol 11289. Springer, Cham. https://doi.org/10.1007/978-3-030-04497-8_16
DOI: https://doi.org/10.1007/978-3-030-04497-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04496-1
Online ISBN: 978-3-030-04497-8
eBook Packages: Computer Science, Computer Science (R0)