Backoff Parameter Estimation for the DOP Model

  • Khalil Sima’an
  • Luciano Buratto
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2837)


The Data Oriented Parsing (DOP) model currently achieves state-of-the-art parsing on benchmark corpora. However, existing DOP parameter estimation methods are known to be biased, and ad hoc adjustments are needed in order to reduce the effects of these biases on performance. In contrast with earlier work, in this paper we show that the DOP parameters constitute a hierarchically structured space of correlated events (rather than a set of disjoint events). The correlations between the different parameters can be expressed by an asymmetric relation called “backoff”. Subsequently, we present a novel recursive estimation algorithm that exploits this hierarchical structure for parameter estimation through discounting and backoff. Finally, we report on experiments showing error reductions of up to 15% in comparison to earlier estimation methods.
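
To make the terms "discounting" and "backoff" concrete, the following is a minimal sketch of the general technique on word bigrams, in the style of Katz backoff with a fixed absolute discount. It is not the recursive DOP estimator described in the abstract, which operates on a hierarchy of tree fragments. The function name `backoff_bigram_model` and the `discount` value are assumptions made for illustration only.

```python
from collections import Counter


def backoff_bigram_model(tokens, discount=0.5):
    """Return a function prob(word, prev) estimating p(word | prev).

    Illustrative assumption: absolute discounting on bigram counts, with
    the freed probability mass redistributed over unseen continuations
    in proportion to their unigram relative frequency.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = sum(unigrams.values())

    context_counts = Counter()  # how often each word occurs as a history
    followers = {}              # history -> set of words seen after it
    for (prev, word), count in bigrams.items():
        context_counts[prev] += count
        followers.setdefault(prev, set()).add(word)

    def p_unigram(word):
        return unigrams[word] / total

    def prob(word, prev):
        seen = followers.get(prev, set())
        if not seen:
            # History never observed: fall back to the unigram estimate.
            return p_unigram(word)
        if word in seen:
            # Seen event: subtract the discount from its count.
            return (bigrams[(prev, word)] - discount) / context_counts[prev]
        # Mass reserved by discounting every seen continuation of prev.
        reserved = discount * len(seen) / context_counts[prev]
        # Unigram mass of the words NOT seen after prev, for renormalising.
        unseen_mass = 1.0 - sum(p_unigram(w) for w in seen)
        if unseen_mass <= 0.0:
            return 0.0
        return reserved * p_unigram(word) / unseen_mass

    return prob
```

Calling `backoff_bigram_model(tokens)` on a tokenised corpus returns a function whose values, for a fixed history, sum to one over the observed vocabulary: seen bigrams keep their discounted relative frequency, and the reserved mass is shared among unseen continuations via the lower-order (unigram) estimate.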


Keywords: Probability Mass; Computational Linguistics; Discount Probability; Root Label; Disjoint Event



Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Khalil Sima’an (1)
  • Luciano Buratto (1)
  1. Institute for Logic, Language and Computation (ILLC), University of Amsterdam, The Netherlands
