Prepositional Phrase Attachment Through a Backed-off Model
Recent work has considered corpus-based or statistical approaches to the problem of prepositional phrase attachment ambiguity. Typically, ambiguous verb phrases of the form v np1 p np2 are resolved through a model which considers the values of the four head words (v, n1, p and n2). This paper shows that the problem is analogous to n-gram language modeling in speech recognition, and that one of the most common methods for language modeling, the backed-off estimate, is applicable. This method achieves 84.5% accuracy on Wall Street Journal data. A surprising result is the importance of low-count events: ignoring events which occur fewer than 5 times in the training data reduces accuracy to 81.6%.
Keywords: Wall Street Journal, Head Noun, Maximum Entropy Model, Sparse Data Problem, Head Word
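The backed-off estimate mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the procedure, so the details here are assumptions: the back-off order (full 4-tuple, then triples, pairs, and finally the preposition alone, always keeping the preposition), the default of noun attachment when nothing has been seen, and the way the `min_count` cutoff is applied per tuple. The class name `BackedOffPP` and the labels `"N"` and `"V"` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def sub_tuples(v, n1, p, n2):
    """Group sub-tuples of (v, n1, p, n2) by back-off level.

    Every sub-tuple keeps the preposition (an assumption of this sketch).
    Level 0 is the full 4-tuple, level 3 is the preposition alone.
    """
    others = [("v", v), ("n1", n1), ("n2", n2)]
    levels = []
    for k in (3, 2, 1, 0):
        levels.append([c + (("p", p),) for c in combinations(others, k)])
    return levels


class BackedOffPP:
    """Hypothetical backed-off estimator of P(noun attachment | v, n1, p, n2)."""

    def __init__(self, min_count=1):
        self.total = Counter()  # how often each sub-tuple was seen
        self.noun = Counter()   # how often it was seen with noun attachment
        self.min_count = min_count  # tuples seen fewer times are ignored

    def train(self, examples):
        # Each example is (v, n1, p, n2, label) with label "N" or "V".
        for v, n1, p, n2, label in examples:
            for level in sub_tuples(v, n1, p, n2):
                for key in level:
                    self.total[key] += 1
                    if label == "N":
                        self.noun[key] += 1

    def p_noun(self, v, n1, p, n2):
        # Use the most specific level that has any usable counts.
        for level in sub_tuples(v, n1, p, n2):
            usable = [k for k in level if self.total[k] >= self.min_count]
            denom = sum(self.total[k] for k in usable)
            if denom > 0:
                return sum(self.noun[k] for k in usable) / denom
        return 1.0  # assumed default: noun attachment

    def classify(self, v, n1, p, n2):
        return "N" if self.p_noun(v, n1, p, n2) >= 0.5 else "V"


examples = [
    ("eat", "pizza", "with", "fork", "V"),
    ("eat", "pizza", "with", "anchovies", "N"),
    ("see", "man", "with", "telescope", "V"),
]
model = BackedOffPP()
model.train(examples)
print(model.classify("drink", "pizza", "with", "anchovies"))
```

Raising `min_count` (for example to 5) discards low-count tuples, which is one way to experiment with the kind of cutoff the abstract reports as harmful to accuracy; on toy data like the above it simply forces the default.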