PCFG Learning by Nonterminal Partition Search

Belz, Anja

doi:10.1007/3-540-45790-9_2

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2484)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

PCFG Learning by Partition Search is a general grammatical inference method for constructing, adapting and optimising probabilistic context-free grammars (PCFGs). Given a training corpus of examples from a language, a canonical grammar for the training corpus, and a parsing task, Partition Search PCFG Learning constructs a grammar that maximises performance on the parsing task and minimises grammar size. This paper describes Partition Search in detail and provides theoretical background and a characterisation of the family of inference methods to which it belongs. The paper also reports an example application to building grammars for noun phrase extraction, a task that is crucial in many natural language processing applications. In the experiments, Partition Search improves parsing performance by up to 21.45% compared to a general baseline and by up to 3.48% compared to a task-specific baseline, while reducing grammar size by up to 17.25%.
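As an illustration of what a single candidate in such a search might look like, the sketch below applies one nonterminal partition to a toy grammar: nonterminals in the same block are merged, rules that become identical have their counts summed, and probabilities are re-estimated by relative frequency. This is a minimal sketch under stated assumptions, not the paper's implementation. The function name `apply_partition`, the dictionary representation of rule counts, and the toy labels and counts are all hypothetical.

```python
from collections import defaultdict


def apply_partition(rule_counts, block_of):
    """Merge nonterminals according to a partition and re-estimate a PCFG.

    rule_counts: dict mapping (lhs, rhs_tuple) -> count (assumed representation)
    block_of:    dict mapping a nonterminal -> label of its partition block;
                 symbols missing from the dict are left unchanged
    """
    # Rename symbols on both sides and sum counts of rules that now coincide.
    merged = defaultdict(float)
    for (lhs, rhs), count in rule_counts.items():
        new_lhs = block_of.get(lhs, lhs)
        new_rhs = tuple(block_of.get(sym, sym) for sym in rhs)
        merged[(new_lhs, new_rhs)] += count

    # Relative-frequency estimate: normalise counts per left-hand side.
    lhs_totals = defaultdict(float)
    for (lhs, _), count in merged.items():
        lhs_totals[lhs] += count
    return {rule: count / lhs_totals[rule[0]] for rule, count in merged.items()}


# Toy rule counts (invented for illustration only).
toy_counts = {
    ("S", ("NP-SBJ", "VP")): 4,
    ("NP-SBJ", ("DT", "NN")): 3,
    ("NP-SBJ", ("PRP",)): 1,
    ("NP-OBJ", ("DT", "NN")): 2,
    ("NP-OBJ", ("NP-OBJ", "PP")): 2,
}

# One candidate partition: both NP variants fall into a single block "NP".
partition = {"NP-SBJ": "NP", "NP-OBJ": "NP"}

pcfg = apply_partition(toy_counts, partition)
for (lhs, rhs), prob in sorted(pcfg.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{prob:.3f}]")
```

In this toy case the merge reduces the grammar from five rules to four. A search over partitions would presumably score each merged grammar on the parsing task and on its size, but that scoring step is not shown here.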

Author information

Authors and Affiliations

ITRI, University of Brighton, Lewes Road, Brighton, BN2 4GJ, UK
Anja Belz

Editor information

Editors and Affiliations

Perot Systems Nederland B.V., Hoefseweg 1, 3821 AE, Amersfoort, The Netherlands
Pieter Adriaans (Senior Research Advisor, Professor of Learning and Adaptive Systems)
ILLC/Computation and Complexity Theory, Universiteit van Amsterdam, Plantage Muidergracht 24, 1018 TV, Amsterdam, The Netherlands
Pieter Adriaans (Senior Research Advisor, Professor of Learning and Adaptive Systems)
School of Electrical Engineering and Computer Science, University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
Henning Fernau
Wilhelm-Schickard-Institut für Informatik, Universität Tübingen, Sand 13, 72076, Tübingen, Germany
Henning Fernau
FNWI/ILLC, Cognitive Systems and Information Processing Group, Universiteit van Amsterdam, Room B-5.39, Nieuwe Achtergracht 166, 1018 WV, Amsterdam, The Netherlands
Menno van Zaanen

Cite this paper

Belz, A. (2002). PCFG Learning by Nonterminal Partition Search. In: Adriaans, P., Fernau, H., van Zaanen, M. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2002. Lecture Notes in Computer Science, vol 2484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45790-9_2

DOI: https://doi.org/10.1007/3-540-45790-9_2
Published: 05 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44239-4
Online ISBN: 978-3-540-45790-9
eBook Packages: Springer Book Archive
