Sepe: A POS Tagger for Spanish

Jiménez, Héctor; Morales, Guillermo

doi:10.1007/3-540-45715-1_23

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2276)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

We describe a part-of-speech tagging system designed specifically to tag Spanish texts using only small linguistic resources. Despite this constraint, the tagger obtains encouraging results. We have found and exploited useful contextual parameters for tagging ambiguous and unknown words. The tagger relies mainly on word lists and a single corpus of around 10⁴ words. The system has been tested on texts of the so-called “news” genre and is still under continuous development.

Author information

Authors and Affiliations

Faculty of Computer Science, Autonomous University of Puebla, 72570, Puebla, C.U., Mexico
Héctor Jiménez
Computer Science Section, CINVESTAV, Molecular Engineering Program, Mexican Institute of Petroleum, Mexico
Guillermo Morales

Editor information

Editors and Affiliations

CIC Centro de Investigacion en Computacion, IPN Instituto Politecnico Nacional, Col Zacateno, CP 07738, Mexico DF, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Jiménez, H., Morales, G. (2002). Sepe: A POS Tagger for Spanish. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2002. Lecture Notes in Computer Science, vol 2276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45715-1_23

DOI: https://doi.org/10.1007/3-540-45715-1_23
Published: 05 February 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43219-7
Online ISBN: 978-3-540-45715-2
eBook Packages: Springer Book Archive
