Discriminative Latent Variable Grammars

Abstract

As we saw in the previous chapter, learning a refined latent variable grammar involves estimating a set of grammar parameters θ over latent annotations, even though the original trees lack those annotations. In the previous chapter, we considered generative grammars, where the parameters θ are set to maximize the joint likelihood of the training sentences and their parse trees. In this chapter, we consider discriminative grammars, where the parameters θ are set to maximize the likelihood of the correct parse tree (versus all possible trees) given a sentence.
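
The two estimation criteria can be written side by side. The following is only a sketch in assumed notation, none of which appears in this preview: \((w_i, T_i)\) are the training sentences and their observed trees, \(t : T\) ranges over latent-annotated derivations consistent with tree \(T\), \(t : w\) ranges over all derivations of sentence \(w\), and \(s_\theta(t)\) is a nonnegative, possibly unnormalized score of derivation \(t\).

```latex
\begin{align}
\mathcal{L}_{\text{joint}}(\theta)
  &= \prod_i P_\theta(w_i, T_i)
   = \prod_i \sum_{t : T_i} P_\theta(w_i, t), \\
\mathcal{L}_{\text{cond}}(\theta)
  &= \prod_i P_\theta(T_i \mid w_i)
   = \prod_i \frac{\sum_{t : T_i} s_\theta(t)}{\sum_{t : w_i} s_\theta(t)}.
\end{align}
```

Under these assumptions, the only difference is the denominator: the conditional objective normalizes over all derivations of the given sentence, so probability mass is compared against competing trees for the same sentence rather than against other sentences.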

The material in this chapter was originally presented in Petrov and Klein (2008a,b).

Notes

  1.

    The material in this chapter was originally presented in Petrov and Klein (2008a,b).

  2.

    Although we show only the binary component, of course both binary and unary productions are included.

  3.

    Alternatively, maximum conditional likelihood estimation can be seen as a special case of maximum likelihood estimation in which P(w) is fixed to the empirical distribution rather than learned. The conditional likelihood can therefore be optimized with an EM algorithm similar to the one used in the generative case. The E-step remains the same, but the M-step involves fitting a log-linear model, which requires numerical optimization; in the joint case, the M-step can be done analytically using relative frequency estimators. This EM algorithm typically converges to a local maximum comparable to the one found by direct optimization of the objective function, but it requires 3–4 times more iterations.

  4.

    We consider different regularization penalties in Sect. 3.3.2.2.

  5.

    Even a tighter threshold produced no search errors on a held-out set in Chap. 2. We enforce that the gold parse is always reachable.

  6.

    The other main factor determining the parsing time is the grammar size.

  7.

    Memory limitations prevent us from learning grammars with more subcategories, a problem that could be alleviated by merging back the least useful splits as in Sect. 2.3.2.

  8.

    Conversely, \(\hat{x}\) is a coarser version of x, or, in the language of Sect. 2.4.1.1, \(\hat{x}\) is a projection of x.

  9.

    We define dominating productions and refining productions analogously to subcategories.

  10.

    L1-regularization drives more than 95% of the feature weights to zero in each round.

  11.

    These scores lack any probabilistic interpretation, but they can be normalized to compute the expectations needed for training; see Sect. 3.2.

  12.

    Synthetic constituents are nodes that are introduced during binarization.
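
Notes 3 and 11 mention fitting a log-linear model and normalizing unnormalized scores to obtain the expectations needed for training. The following is a minimal sketch of that idea under strong simplifying assumptions: the candidate parses are enumerated explicitly instead of being summed over with dynamic programming, latent subcategories are ignored, and the function name, feature names, and data layout are all hypothetical rather than taken from the chapter.

```python
import math


def conditional_log_likelihood(weights, candidates, gold_index):
    """Toy log-linear model over an explicit list of candidate parses.

    weights:    dict mapping feature name -> weight (hypothetical features).
    candidates: list of dicts mapping feature name -> count, one per parse.
    gold_index: position of the correct parse in `candidates`.

    Returns (log P(gold | sentence), gradient dict, list of probabilities).
    """
    # Unnormalized log-scores: dot product of weights and feature counts.
    scores = [
        sum(weights.get(name, 0.0) * count for name, count in feats.items())
        for feats in candidates
    ]

    # Normalize with the log-sum-exp trick for numerical stability.
    max_score = max(scores)
    log_z = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    probs = [math.exp(s - log_z) for s in scores]

    # Gradient = observed feature counts minus expected feature counts.
    gradient = {name: float(count) for name, count in candidates[gold_index].items()}
    for prob, feats in zip(probs, candidates):
        for name, count in feats.items():
            gradient[name] = gradient.get(name, 0.0) - prob * count

    return scores[gold_index] - log_z, gradient, probs


# Two hypothetical candidate parses that differ in a single production.
toy_candidates = [{"NP -> DT NN": 1}, {"NP -> NN NN": 1}]
log_prob, grad, probs = conditional_log_likelihood({}, toy_candidates, gold_index=0)
```

With all weights at zero, both candidates are equally likely, so the gradient pushes the weight of the gold production up and the weight of the competing production down. A regularization penalty such as the L1 term mentioned in note 10 would be added to this objective; it is omitted here.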

Author information

Corresponding author

Correspondence to Slav Petrov.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Petrov, S. (2011). Discriminative Latent Variable Grammars. In: Coarse-to-Fine Natural Language Processing. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22743-1_3

  • DOI: https://doi.org/10.1007/978-3-642-22743-1_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22742-4

  • Online ISBN: 978-3-642-22743-1

  • eBook Packages: Computer Science, Computer Science (R0)
