Recognizing Textual Entailment and Paraphrases in Portuguese

  • Gil Rocha
  • Henrique Lopes Cardoso
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10423)

Abstract

The aim of textual entailment and paraphrase recognition is to determine whether the meaning of a text fragment can be inferred (is entailed) from the meaning of another text fragment. In this paper, we address the task of automatically recognizing textual entailment (RTE) and paraphrases in text written in Portuguese, employing supervised machine learning techniques. First, we formulate the task as a multi-class classification problem. We conclude that semantic-based approaches are very promising for recognizing textual entailment and that combining data from European and Brazilian Portuguese brings several challenges typical of cross-language learning. Then, we formulate the task as a binary classification problem and demonstrate the capability of the proposed classifier for RTE and paraphrases. The results reported in this work are promising, reaching an accuracy of 0.83 on the test data.
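The two formulations mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' classifier: the label names, the rule that collapses entailment and paraphrase into one positive class, the token-overlap feature, and the 0.5 threshold are all assumptions made for illustration. It only shows how a multi-class label set might map to a binary one and how accuracy is computed.

```python
from typing import Sequence

# Hypothetical label names; the abstract does not list the classes.
MULTI_CLASS_LABELS = ("none", "entailment", "paraphrase")


def to_binary(label: str) -> int:
    """Collapse a multi-class label to 1 (entailment or paraphrase) or 0 (none).

    Assumption: the binary task treats both relations as the positive class.
    """
    if label not in MULTI_CLASS_LABELS:
        raise ValueError(f"unknown label: {label}")
    return 0 if label == "none" else 1


def jaccard(text: str, hypothesis: str) -> float:
    """Token-overlap feature: |A & B| / |A | B| over lowercased tokens."""
    a = set(text.lower().split())
    b = set(hypothesis.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(text: str, hypothesis: str, threshold: float = 0.5) -> int:
    """Toy binary decision: positive when the overlap reaches the threshold."""
    return 1 if jaccard(text, hypothesis) >= threshold else 0


def accuracy(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal in length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)
```

For example, `accuracy([to_binary(l) for l in gold_labels], predictions)` would score a binary run against multi-class gold labels, where `gold_labels` and `predictions` are placeholders for real data. A supervised model trained on richer lexical and semantic features would replace the fixed threshold in practice.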

Notes

Acknowledgments

The first author is partially supported by a doctoral grant from the Doctoral Program in Informatics Engineering (ProDEI) of the Faculty of Engineering of the University of Porto (FEUP).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. LIACC/DEI, Faculdade de Engenharia, Universidade do Porto, Porto, Portugal