Part-of-Speech Tagging Based on Machine Translation Techniques

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4477)

Abstract

In this paper, a new approach to the Part-of-Speech (PoS) tagging problem is proposed. PoS tagging can be viewed as a special translation process in which the source language is the set of strings being considered and the target language is the set of PoS tag sequences. In this work, we use phrase-based machine translation technology to tackle the PoS tagging problem. Experiments were carried out on the Penn Treebank WSJ task, and very good results were obtained.
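
To make the "tagging as translation" view concrete, here is a minimal sketch of how a tagged corpus could be turned into a parallel corpus for a phrase-based translation system: one file of source sentences (words) and one file of target sentences (tags), aligned line by line. This is an illustration only, not the authors' preprocessing. The `word/TAG` input format and the function names `split_tagged_sentence` and `build_parallel_corpus` are assumptions made for this example.

```python
from typing import List, Tuple


def split_tagged_sentence(tagged: str, sep: str = "/") -> Tuple[List[str], List[str]]:
    """Split 'word/TAG' tokens into parallel word and tag sequences.

    Assumes (hypothetically) that each token is written as word<sep>TAG.
    """
    words: List[str] = []
    tags: List[str] = []
    for token in tagged.split():
        # rpartition splits on the LAST separator, so a word that itself
        # contains a slash (e.g. '1/2/CD') keeps its slash.
        word, _, tag = token.rpartition(sep)
        if not word or not tag:
            raise ValueError(f"malformed token: {token!r}")
        words.append(word)
        tags.append(tag)
    return words, tags


def build_parallel_corpus(tagged_sentences: List[str]) -> Tuple[List[str], List[str]]:
    """Return (source_lines, target_lines), aligned sentence by sentence.

    Source side: the words. Target side: the PoS tags. Both sides have the
    same number of tokens per line, so the 'translation' is monotone and
    one-to-one at the token level.
    """
    source_lines: List[str] = []
    target_lines: List[str] = []
    for sentence in tagged_sentences:
        words, tags = split_tagged_sentence(sentence)
        source_lines.append(" ".join(words))
        target_lines.append(" ".join(tags))
    return source_lines, target_lines


if __name__ == "__main__":
    src, tgt = build_parallel_corpus(["The/DT dog/NN barks/VBZ ./."])
    print(src[0])
    print(tgt[0])
```

A parallel corpus in this shape is what a phrase-based system would train on; unlike ordinary translation, no reordering is needed and source and target lengths always match.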

Editor information

Joan Martí, José Miguel Benedí, Ana Maria Mendonça, Joan Serrat

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Gascó i Mora, G., Sánchez Peiró, J.A. (2007). Part-of-Speech Tagging Based on Machine Translation Techniques. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72847-4_34

  • DOI: https://doi.org/10.1007/978-3-540-72847-4_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72846-7

  • Online ISBN: 978-3-540-72847-4

  • eBook Packages: Computer Science, Computer Science (R0)
