Part-of-speech tagging using decision trees

  • Lluís Màrquez
  • Horacio Rodríguez
Regular Papers: Applications of ML
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1398)

Abstract

We have applied inductive learning of statistical decision trees to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (part-of-speech tagging). Previous work showed that the acquired language models are independent enough to be easily incorporated, as a statistical core of rules, into any flexible tagger. They are also complete enough to be used directly as sets of POS disambiguation rules. We have implemented a simple and fast tagger, which has been tested and evaluated on the Wall Street Journal (WSJ) corpus with remarkable accuracy. In this paper we address the problem of tagging when only a small amount of training material is available, which is crucial in any process of constructing an annotated corpus from scratch. We show that quite high accuracy can be achieved with our system in this situation. In addition, we address the problem of dealing with unknown words under the same conditions of scarce training examples. For this case, we report some comparative results and comments on closely related work.
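
To make the idea of a statistical decision tree for POS disambiguation concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' system: it uses plain ID3-style information gain (the abstract does not say which attribute selection measure or pruning the paper uses), the feature names `prev` and `next` are hypothetical, and the six training examples are invented toy data for a single noun/verb ambiguity class with Penn Treebank style tags.

```python
import math
from collections import Counter

# Toy, invented examples for one ambiguity class (NN vs. VB).
# "prev"/"next" are hypothetical context features: the tags of neighbouring words.
EXAMPLES = [
    ({"prev": "DT", "next": "IN"}, "NN"),
    ({"prev": "DT", "next": "VBZ"}, "NN"),
    ({"prev": "JJ", "next": "IN"}, "NN"),
    ({"prev": "TO", "next": "DT"}, "VB"),
    ({"prev": "MD", "next": "DT"}, "VB"),
    ({"prev": "TO", "next": "IN"}, "VB"),
]

def entropy(tags):
    """Shannon entropy (in bits) of a list of tags."""
    total = len(tags)
    return -sum((n / total) * math.log2(n / total) for n in Counter(tags).values())

def best_attribute(examples, attributes):
    """Pick the attribute with the highest information gain (ID3 criterion)."""
    base = entropy([tag for _, tag in examples])

    def gain(attr):
        groups = {}
        for feats, tag in examples:
            groups.setdefault(feats[attr], []).append(tag)
        remainder = sum(len(g) / len(examples) * entropy(g) for g in groups.values())
        return base - remainder

    return max(attributes, key=gain)

def build_tree(examples, attributes):
    """Grow an unpruned tree; every node stores a tag probability distribution."""
    counts = Counter(tag for _, tag in examples)
    total = sum(counts.values())
    node = {"dist": {tag: n / total for tag, n in counts.items()}}
    if len(counts) == 1 or not attributes:
        return node  # pure node or no attributes left: make it a leaf
    attr = best_attribute(examples, attributes)
    remaining = [a for a in attributes if a != attr]
    branches = {}
    for feats, tag in examples:
        branches.setdefault(feats[attr], []).append((feats, tag))
    node["attr"] = attr
    node["children"] = {v: build_tree(sub, remaining) for v, sub in branches.items()}
    return node

def tag_distribution(tree, feats):
    """Walk the tree; on an unseen value, back off to the current node's distribution."""
    node = tree
    while "attr" in node:
        child = node["children"].get(feats.get(node["attr"]))
        if child is None:
            break
        node = child
    return node["dist"]

TREE = build_tree(EXAMPLES, ["prev", "next"])
```

Calling `tag_distribution(TREE, {"prev": "DT", "next": "IN"})` returns a probability distribution over the candidate tags rather than a single tag, which is what lets such trees serve as a statistical component inside a larger tagger. Backing off to an internal node's distribution for unseen contexts is one simple way to cope with sparse training data; it is a design choice of this sketch only.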

Keywords

Natural Language Processing, Wall Street Journal, Ambiguous Word, Training Corpus, Word Sense Disambiguation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Lluís Màrquez (1)
  • Horacio Rodríguez (1)
  1. Dep. Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Catalonia
