Abstract
We describe a part-of-speech tagging system designed specifically to tag Spanish texts using small linguistic resources. Despite these limited resources, the tagger obtains encouraging results. We have found and exploited useful contextual parameters to tag ambiguous and unknown words. Our tagger relies mainly on word lists and one corpus of around 10^4 words. The system has been tested on texts of the so-called “news” genre and is still under continuous development.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiménez, H., Morales, G. (2002). Sepe: A POS Tagger for Spanish. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2002. Lecture Notes in Computer Science, vol 2276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45715-1_23
DOI: https://doi.org/10.1007/3-540-45715-1_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43219-7
Online ISBN: 978-3-540-45715-2