Sepe: A POS Tagger for Spanish

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2276)

Abstract

We describe a part-of-speech tagging system designed specifically to tag Spanish texts using only small linguistic resources. Despite these limited resources, the tagger obtains encouraging results. We have identified and exploited useful contextual parameters for tagging ambiguous and unknown words. The tagger relies mainly on word lists and a single corpus of around 10^4 words. The system has been tested on texts of the so-called “news” genre and is still under continuous development.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jiménez, H., Morales, G. (2002). Sepe: A POS Tagger for Spanish. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2002. Lecture Notes in Computer Science, vol 2276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45715-1_23

  • DOI: https://doi.org/10.1007/3-540-45715-1_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43219-7

  • Online ISBN: 978-3-540-45715-2

  • eBook Packages: Springer Book Archive
