Combining Rules and CRF Learning for Opinion Source Identification in Spanish Texts

Rosá, Aiala; Wonsever, Dina; Minel, Jean-Luc

doi:10.1007/978-3-642-34654-5_46

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7637)

Included in the following conference series:

Ibero-American Conference on Artificial Intelligence

Abstract

In this work we present a system for the automatic annotation of opinions in Spanish texts. We focus mainly on the definition of a TFS-style model for opinion predicates and their arguments, on the creation of a lexicon of opinion predicates, and on two additional variants for identifying the source of opinions. The original system extracts opinions and all their elements (predicate, source, topic and message) using hand-coded rules. The first variant uses a CRF model to learn the source, assuming that the predicate is already tagged. The second variant is a combined version, in which the output of the rule-based source recognizer is added as an additional attribute for training the CRF model. We found that this hybrid system performs better than either system evaluated separately. This work involved the construction of several resources for Spanish: a lexicon of opinion predicates, a 13,000-word corpus with complete opinion annotations and a 40,000-word corpus with annotations of opinion predicates and sources.
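
The combined variant can be illustrated with a minimal sketch of the feature-construction step. This is not the authors' implementation: the feature names, the tag labels (`PRED`, `B-SRC`, `I-SRC`, `O`), the helper functions and the example sentence are all assumptions made for illustration. The sketch only shows how the output of a rule-based source tagger could be attached to each token as one more attribute before a CRF is trained; it does not train a model.

```python
from typing import Dict, List

# Hypothetical toy sentence; the tags below are invented for illustration.
TOKENS = ["El", "ministro", "dijo", "que", "la", "economía", "crecerá", "."]
# Assumed input: the opinion predicate is already tagged.
PRED_TAGS = ["O", "O", "PRED", "O", "O", "O", "O", "O"]
# Assumed output of a rule-based source tagger, in BIO notation.
RULE_TAGS = ["B-SRC", "I-SRC", "O", "O", "O", "O", "O", "O"]


def token_features(tokens: List[str], pred_tags: List[str],
                   rule_tags: List[str], i: int,
                   use_rule_feature: bool = True) -> Dict[str, object]:
    """Build the attribute dictionary for the token at position i."""
    feats: Dict[str, object] = {
        "word.lower": tokens[i].lower(),
        "is_capitalized": tokens[i][:1].isupper(),
        "is_predicate": pred_tags[i] == "PRED",
    }
    if use_rule_feature:
        # Hybrid variant: the rule-based decision becomes an extra attribute.
        feats["rule_source"] = rule_tags[i]
    if i > 0:
        feats["prev.word.lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next.word.lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(tokens: List[str], pred_tags: List[str],
                      rule_tags: List[str],
                      use_rule_feature: bool = True) -> List[Dict[str, object]]:
    """Return one attribute dictionary per token of the sentence."""
    return [token_features(tokens, pred_tags, rule_tags, i, use_rule_feature)
            for i in range(len(tokens))]


# Attributes for the hybrid variant and for a CRF-only variant.
hybrid = sentence_features(TOKENS, PRED_TAGS, RULE_TAGS)
crf_only = sentence_features(TOKENS, PRED_TAGS, RULE_TAGS,
                             use_rule_feature=False)
```

Switching `use_rule_feature` off yields the attribute set of a CRF-only variant, so the two configurations can be compared on the same data. A list of per-token dictionaries like this is the input format accepted by toolkits such as sklearn-crfsuite, although the abstract does not say which CRF implementation was used.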

Author information

Authors and Affiliations

Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
Aiala Rosá & Dina Wonsever
Université Paris Ouest Nanterre la Défense, Nanterre, France
Aiala Rosá & Jean-Luc Minel

Editor information

Editors and Affiliations

Facultad de Informática, Universidad Complutense de Madrid, c\ Profesor José García Santesmases, 28040, Madrid, Spain
Juan Pavón & Rubén Fuentes-Fernández
Universidad Nacional de Colombia, Carrera 30 No 45-03, Edificio 477, Bogotá, DC, Colombia
Néstor D. Duque-Méndez

About this paper

Cite this paper

Rosá, A., Wonsever, D., Minel, J.-L. (2012). Combining Rules and CRF Learning for Opinion Source Identification in Spanish Texts. In: Pavón, J., Duque-Méndez, N.D., Fuentes-Fernández, R. (eds) Advances in Artificial Intelligence – IBERAMIA 2012. IBERAMIA 2012. Lecture Notes in Computer Science, vol 7637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34654-5_46

DOI: https://doi.org/10.1007/978-3-642-34654-5_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34653-8
Online ISBN: 978-3-642-34654-5
eBook Packages: Computer Science, Computer Science (R0)
