Synonyms
Automata induction; Automatic induction; Automatic language induction; Grammar induction; Grammatical induction; Grammatical inference
Definition
Grammar inference is the task of learning grammars or languages from training data. It is a type of inductive inference, the name given to learning techniques that try to guess general rules from examples.
The basic problem is to find a grammar consistent with a training set of positive examples. Usually, the target language is infinite, while the training set is finite. Some work assumes that both positive and negative examples are available, but most real applications provide only positive examples. Sometimes probability information is attached to each example. In this case, it is possible to learn a probability distribution over the strings in the language in addition to the grammar. This is sometimes called stochastic grammar inference.
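As an illustration of the basic problem, the following minimal sketch builds a prefix tree acceptor from positive examples only. This is one simple way to obtain an automaton consistent with the training set, not a method prescribed by this entry. The function names (build_pta, accepts, probability) and the sample strings are assumptions chosen for illustration. Counting how often each example occurs also gives a relative-frequency estimate of a distribution over strings, in the spirit of stochastic grammar inference.

```python
from collections import Counter


def build_pta(samples):
    """Build a prefix tree acceptor from positive example strings.

    States are integers and state 0 is the start state.
    Returns (transitions, accept_counts, total), where total is the
    number of training strings.
    """
    transitions = {}           # (state, symbol) -> next state
    accept_counts = Counter()  # state -> number of samples ending there
    next_state = 1
    total = 0
    for s in samples:
        state = 0
        for symbol in s:
            key = (state, symbol)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accept_counts[state] += 1
        total += 1
    return transitions, accept_counts, total


def _run(transitions, s):
    """Follow s from the start state; return the end state or None."""
    state = 0
    for symbol in s:
        state = transitions.get((state, symbol))
        if state is None:
            return None
    return state


def accepts(pta, s):
    """True if s ends in a state where some training string ended."""
    transitions, accept_counts, _ = pta
    state = _run(transitions, s)
    return state is not None and accept_counts[state] > 0


def probability(pta, s):
    """Relative-frequency estimate of P(s) from the training counts."""
    transitions, accept_counts, total = pta
    state = _run(transitions, s)
    if state is None or total == 0:
        return 0.0
    return accept_counts[state] / total


# Hypothetical training set of positive examples (with repeats).
pta = build_pta(["ab", "ab", "abab", "a"])
print(accepts(pta, "abab"), accepts(pta, "aba"), probability(pta, "ab"))
```

The automaton built this way accepts exactly the training strings, so it is consistent with the data but does not generalize to an infinite target language. State-merging approaches typically start from such a tree and merge states to generalize.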
A grammar inference algorithm must target a particular grammar representation. More expressive...
Recommended Reading
Ahonen H, Mannila H, Nikunen E. Generating grammars for SGML tagged texts lacking DTD. In: Proceedings of the Workshop on Principles of Document Processing; 1994.
Ahonen H, Mannila H, Nikunen E. Forming grammars for structured documents: an application of grammatical inference. In: Carrasco R, Oncina J, editors. Lecture notes in computer science, vol. 862. Berlin/New York: Springer; 1994. p. 153–67.
Angluin D. On the complexity of minimum inference of regular sets. Inf Control. 1978;39(3):337–50.
Angluin D. Inference of reversible languages. J ACM. 1982;29(3):741–85.
Baum LE, Petrie T, Soules G, Weiss N. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann Math Stat. 1970;41(1):164–71.
Fankhauser P, Xu Y. MarkItUp! an incremental approach to document structure recognition. Electron Publ Orig Dissem Des. 1993;6(4):447–56.
Gold EM. Language identification in the limit. Inf Control. 1967;10(5):447–74.
Gold EM. Complexity of automaton identification from finite data. Inf Control. 1978;37(3):302–20.
Goldman R, Widom J. DataGuides: enabling query formulation and optimization in semi-structured databases. In: Proceedings of the 23rd International Conference on Very Large Data Bases; 1997. p. 436–45.
Hopcroft JE, Ullman JD. Introduction to automata theory, languages and computation. Reading: Addison-Wesley; 1979.
Oncina J, García P. Inferring regular languages in polynomial updated time. In: de la Blanca NP, Sanfeliu A, Vidal E, editors. Pattern recognition and image analysis. Singapore: World Scientific; 1992. p. 49–61.
Sánchez JA, Benedí JM. Statistical inductive learning of regular formal languages. In: Carrasco R, Oncina J, editors. Lecture notes in computer science, vol. 862; 1994. p. 130–8.
Shafer K. Creating DTDs via the GB-engine and Fred. Dublin/Ohio: OCLC Online Computer Library Center; 1995.
Stolcke A, Omohundro S. Inducing probabilistic grammars by Bayesian model merging. In: Carrasco R, Oncina J, editors. Lecture notes in computer science, vol. 862; 1994. p. 106–18.
Young-Lai M, Tompa FW. Stochastic grammatical inference of text database structure. Mach Learn. 2000;40(2):111–37.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Young-Lai, M. (2018). Grammar Inference. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_182
DOI: https://doi.org/10.1007/978-1-4614-8265-9_182
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering