Discriminative Latent Variable Grammars

Petrov, Slav

doi:10.1007/978-3-642-22743-1_3

Chapter
First Online: 01 January 2011

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

As we saw in the previous chapter, learning a refined latent variable grammar means estimating a set of grammar parameters θ over latent annotations, even though the training trees do not contain those annotations. In the previous chapter, we considered generative grammars, where the parameters θ are set to maximize the joint likelihood of the training sentences and their parse trees. In this chapter we consider discriminative grammars, where the parameters θ are set to maximize the likelihood of the correct parse tree (versus all possible trees) given a sentence.
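
The contrast between the two training criteria can be written out as follows. This is a sketch in assumed notation, not the chapter's own equations: w_i is the i-th training sentence, T_i its observed (unannotated) parse tree, and t ranges over the latent-annotated derivations consistent with T_i.

```latex
\begin{align}
  \theta^{*}_{\mathrm{gen}}
    &= \arg\max_{\theta} \prod_{i} P_{\theta}(T_i, w_i)
     = \arg\max_{\theta} \prod_{i} \sum_{t : T_i} P_{\theta}(t, w_i), \\
  \theta^{*}_{\mathrm{disc}}
    &= \arg\max_{\theta} \prod_{i} P_{\theta}(T_i \mid w_i)
     = \arg\max_{\theta} \prod_{i} \sum_{t : T_i} P_{\theta}(t \mid w_i).
\end{align}
```

In both cases the latent annotations are summed out, because only T_i is observed. The two criteria differ only in whether the sentence itself is modeled: since P_θ(T, w) = P_θ(T | w) P_θ(w), the discriminative criterion drops the P_θ(w) factor and spends all parameters on separating the correct tree from its competitors.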

Notes

1. The material in this chapter was originally presented in Petrov and Klein (2008a,b).
2. Although we show only the binary component, both binary and unary productions are of course included.
3. Alternatively, maximum conditional likelihood estimation can be seen as a special case of maximum likelihood estimation in which P(w) is assumed to be the empirical distribution and is not learned. The conditional likelihood optimization can therefore be addressed by an EM algorithm similar to the one used in the generative case. However, while the E-step remains the same, the M-step involves fitting a log-linear model, which requires numerical optimization, unlike the joint case, where the M-step can be done analytically using relative frequency estimators. This EM algorithm typically converges to a local maximum comparable to the one reached by direct optimization of the objective function, but requires 3–4 times more iterations.
4. We consider different regularization penalties in Sect. 3.3.2.2.
5. Even a tighter threshold produced no search errors on a held-out set in Chap. 2. We enforce that the gold parse is always reachable.
6. The other main factor determining the parsing time is the grammar size.
7. Memory limitations prevent us from learning grammars with more subcategories, a problem that could be alleviated by merging back the least useful splits as in Sect. 2.3.2.
8. Conversely, \(\hat{x}\) is a coarser version of x, or, in the language of Sect. 2.4.1.1, \(\hat{x}\) is a projection of x.
9. We define dominating productions and refining productions analogously to the corresponding definitions for subcategories.
10. L₁-regularization drives more than 95% of the feature weights to zero in each round.
11. These scores lack any probabilistic interpretation, but can be normalized to compute the necessary expectations for training; see Sect. 3.2.
12. Synthetic constituents are nodes that are introduced during binarization.
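
Several of the notes above (unnormalized scores that are normalized to obtain expectations, fitting a log-linear model by numerical optimization, and L₁ regularization) can be illustrated with a toy sketch. It is not the chapter's implementation: it assumes the candidate trees for a sentence can be enumerated explicitly rather than packed in a chart, it ignores latent subcategories, and all function names, the feature vectors, the step size and the penalty are hypothetical.

```python
import math

def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def score(theta, feats):
    """Unnormalized log-linear score of one candidate tree."""
    return sum(t * f for t, f in zip(theta, feats))

def conditional_log_likelihood(theta, candidates, gold):
    """log P(gold tree | sentence): gold score minus the log normalizer."""
    scores = [score(theta, f) for f in candidates]
    return scores[gold] - log_sum_exp(scores)

def gradient(theta, candidates, gold):
    """Observed feature counts minus expected counts under the model."""
    scores = [score(theta, f) for f in candidates]
    log_z = log_sum_exp(scores)
    probs = [math.exp(s - log_z) for s in scores]
    expected = [sum(p * f[i] for p, f in zip(probs, candidates))
                for i in range(len(theta))]
    return [candidates[gold][i] - expected[i] for i in range(len(theta))]

def soft_threshold(theta, amount):
    """Proximal step for an L1 penalty: shrink each weight toward zero."""
    return [math.copysign(max(abs(t) - amount, 0.0), t) for t in theta]

def train(candidates, gold, steps=200, lr=0.5, l1=0.01):
    """Gradient ascent on the conditional likelihood with L1 shrinkage."""
    theta = [0.0] * len(candidates[0])
    for _ in range(steps):
        g = gradient(theta, candidates, gold)
        theta = [t + lr * gi for t, gi in zip(theta, g)]
        theta = soft_threshold(theta, lr * l1)
    return theta

# Hypothetical data: three candidate trees for one sentence, each described
# by counts of three productions; candidate 0 is the gold tree.
candidates = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
theta = train(candidates, gold=0)
print(conditional_log_likelihood(theta, candidates, 0))
```

The gradient has the familiar "observed minus expected" form, which is why only normalized expectations, and not probabilistically meaningful scores, are needed during training. In a real parser the expectations would be computed with dynamic programming over all trees rather than by enumeration.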

Author information

Authors and Affiliations

Google, New York, USA
Slav Petrov

Corresponding author

Correspondence to Slav Petrov.

About this chapter

Cite this chapter

Petrov, S. (2011). Discriminative Latent Variable Grammars. In: Coarse-to-Fine Natural Language Processing. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22743-1_3

DOI: https://doi.org/10.1007/978-3-642-22743-1_3
Published: 08 September 2011
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22742-4
Online ISBN: 978-3-642-22743-1
eBook Packages: Computer Science, Computer Science (R0)
