Abstract
Estimating the parameters of stochastic context-free grammars (SCFGs) from data is an important, well-studied problem. Almost without exception, existing approaches make repeated passes over the training data. The memory requirements of such algorithms are ill-suited for embedded agents exposed to large amounts of training data over long periods of time. We present a novel algorithm, called HOLA, for estimating the parameters of SCFGs that computes summary statistics for each string as it is observed and then discards the string. The memory used by HOLA is bounded by the size of the grammar, not by the amount of training data. Empirical results show that HOLA performs as well as the Inside-Outside algorithm on a variety of standard problems, despite the fact that it has access to much less information.
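The abstract states HOLA's contract but not its internals: each observed string yields summary statistics that are folded into state whose size depends only on the grammar, after which the string is discarded. The sketch below illustrates that contract only; it is not HOLA. Since the abstract does not define HOLA's statistics, it uses per-string expected rule counts from the standard inside-outside recurrences as a stand-in summary, and every name in it (CNFGrammar, absorb_string, and so on) is hypothetical rather than taken from the paper.

# Illustrative sketch only: HOLA's actual summary statistics are defined in
# the paper, not here. This stand-in absorbs per-string expected rule counts
# computed from the inside-outside recurrences, one string at a time. The
# only persistent state is one accumulator per rule (bounded by the size of
# the grammar); each string is discarded once absorbed.

from collections import defaultdict

class CNFGrammar:
    """SCFG in Chomsky normal form: A -> B C (binary), A -> 'a' (lexical)."""
    def __init__(self, binary, lexical, start):
        self.binary = binary    # {(A, B, C): probability}
        self.lexical = lexical  # {(A, a): probability}
        self.start = start

def inside(g, w):
    # beta[(i, j, A)] = P(A =>* w[i..j])
    n = len(w)
    beta = defaultdict(float)
    for i, a in enumerate(w):
        for (A, sym), p in g.lexical.items():
            if sym == a:
                beta[(i, i, A)] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for (A, B, C), p in g.binary.items():
                for k in range(i, j):
                    beta[(i, j, A)] += p * beta[(i, k, B)] * beta[(k + 1, j, C)]
    return beta

def outside(g, w, beta):
    # alpha[(i, j, A)] = P(S =>* w[0..i-1] A w[j+1..n-1])
    n = len(w)
    alpha = defaultdict(float)
    alpha[(0, n - 1, g.start)] = 1.0
    for span in range(n, 1, -1):          # parents before children
        for i in range(n - span + 1):
            j = i + span - 1
            for (A, B, C), p in g.binary.items():
                out = alpha[(i, j, A)]
                if out == 0.0:
                    continue
                for k in range(i, j):
                    alpha[(i, k, B)] += out * p * beta[(k + 1, j, C)]
                    alpha[(k + 1, j, C)] += out * p * beta[(i, k, B)]
    return alpha

def absorb_string(g, w, bin_counts, lex_counts):
    # Fold this string's expected rule counts into the global accumulators;
    # afterwards the string (and its charts) can be thrown away.
    beta = inside(g, w)
    Z = beta[(0, len(w) - 1, g.start)]
    if Z == 0.0:
        return                            # string not in the language
    alpha = outside(g, w, beta)
    n = len(w)
    for (A, B, C), p in g.binary.items():
        for i in range(n):
            for j in range(i + 1, n):
                for k in range(i, j):
                    bin_counts[(A, B, C)] += (alpha[(i, j, A)] * p *
                        beta[(i, k, B)] * beta[(k + 1, j, C)]) / Z
    for (A, a), p in g.lexical.items():
        for i in range(n):
            if w[i] == a:
                lex_counts[(A, a)] += alpha[(i, i, A)] * p / Z

def reestimate(g, bin_counts, lex_counts):
    # Normalize accumulated expected counts into new rule probabilities.
    totals = defaultdict(float)
    for (A, _, _), c in bin_counts.items():
        totals[A] += c
    for (A, _), c in lex_counts.items():
        totals[A] += c
    for key in g.binary:
        if totals[key[0]] > 0.0:
            g.binary[key] = bin_counts[key] / totals[key[0]]
    for key in g.lexical:
        if totals[key[0]] > 0.0:
            g.lexical[key] = lex_counts[key] / totals[key[0]]

Usage under this reading: initialize bin_counts and lex_counts as defaultdict(float), call absorb_string(g, w, bin_counts, lex_counts) on each arriving string, drop the string, and call reestimate periodically. The persistent state is one accumulator per rule; the O(n^2) chart built for a string is working memory that vanishes with it. Note this amounts to an incremental-EM-style learner; the statistics HOLA actually maintains need not coincide with these.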
© 2002 Springer-Verlag Berlin Heidelberg

Cite this paper
Oates, T., Heeringa, B. (2002). Estimating Grammar Parameters Using Bounded Memory. In: Adriaans, P., Fernau, H., van Zaanen, M. (eds.) Grammatical Inference: Algorithms and Applications. ICGI 2002. Lecture Notes in Computer Science, vol. 2484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45790-9_15
Print ISBN: 978-3-540-44239-4
Online ISBN: 978-3-540-45790-9