Part-of-Speech Tagging Based on Machine Translation Techniques

Gascó i Mora, Guillem; Sánchez Peiró, Joan Andreu

doi:10.1007/978-3-540-72847-4_34

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4477)

Abstract

This paper proposes a new approach to the Part-of-Speech (PoS) tagging problem. PoS tagging can be viewed as a special translation process in which the source language is the set of strings being considered and the target language is the set of corresponding PoS tag sequences. In this work, we use phrase-based machine translation technology to tackle the PoS tagging problem. Experiments on the Penn Treebank WSJ task were carried out, and very good results were obtained.
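
To make the translation view concrete, the following is a minimal Python sketch, not the authors' system. It assumes a toy hand-tagged corpus and uses hypothetical helper names (`extract_phrases`, `tag`). Because each word aligns to exactly one tag, every contiguous word span paired with its tag span can serve as a phrase pair. Decoding is monotone and scores a segmentation by the sum of log relative frequencies. A real phrase-based decoder would also combine a language model over tag sequences and other features, which are omitted here.

```python
from collections import Counter, defaultdict
import math


def extract_phrases(tagged_sentences, max_len=3):
    """Count (word n-gram -> tag n-gram) phrase pairs.

    The word/tag alignment is one-to-one and monotone, so every
    contiguous span of up to max_len words yields a phrase pair.
    """
    counts = defaultdict(Counter)
    for sent in tagged_sentences:
        words = tuple(w for w, _ in sent)
        tags = tuple(t for _, t in sent)
        for i in range(len(sent)):
            for j in range(i + 1, min(i + max_len, len(sent)) + 1):
                counts[words[i:j]][tags[i:j]] += 1
    return counts


def tag(words, counts, max_len=3, unknown_tag="NN"):
    """Monotone phrase-based decoding by dynamic programming.

    best[j] holds the best (log score, tag list) covering words[:j].
    Each known source phrase is translated into its most frequent tag
    phrase; unknown single words get unknown_tag (an assumption).
    """
    n = len(words)
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            src = tuple(words[i:j])
            prev_score, prev_tags = best[i]
            if src in counts:
                total = sum(counts[src].values())
                tgt, c = counts[src].most_common(1)[0]
                cand = (prev_score + math.log(c / total), prev_tags + list(tgt))
            elif j - i == 1:
                # Back off for out-of-vocabulary words with a small penalty.
                cand = (prev_score + math.log(1e-6), prev_tags + [unknown_tag])
            else:
                continue
            if cand[0] > best[j][0]:
                best[j] = cand
    return best[n][1]


# Toy training data (illustrative only, not from the Penn Treebank).
corpus = [
    [("the", "DT"), ("can", "NN"), ("is", "VBZ"), ("red", "JJ")],
    [("we", "PRP"), ("can", "MD"), ("run", "VB")],
    [("they", "PRP"), ("can", "MD"), ("swim", "VB")],
]
counts = extract_phrases(corpus)
print(tag(["we", "can", "swim"], counts))  # ['PRP', 'MD', 'VB']
print(tag(["the", "can"], counts))         # ['DT', 'NN']
```

In this toy setting the ambiguous word "can" is resolved by the multi-word phrase it occurs in: "we can" maps to a modal reading, while "the can" maps to a noun reading, which is the kind of local context a phrase table can capture.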

Author information

Authors and Affiliations

Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Camí de Vera s/n, 46022 València, Spain
Guillem Gascó i Mora & Joan Andreu Sánchez Peiró

Editor information

Joan Martí, José Miguel Benedí, Ana Maria Mendonça, Joan Serrat

About this paper

Cite this paper

Gascó i Mora, G., Sánchez Peiró, J.A. (2007). Part-of-Speech Tagging Based on Machine Translation Techniques. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72847-4_34

DOI: https://doi.org/10.1007/978-3-540-72847-4_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72846-7
Online ISBN: 978-3-540-72847-4
eBook Packages: Computer Science, Computer Science (R0)
