Automata induction; Automatic induction; Automatic language induction; Grammar induction; Grammatical induction; Grammatical inference
Grammar inference is the task of learning grammars or languages from training data. It is a type of inductive inference, the name given to learning techniques that try to guess general rules from examples.
The basic problem is to find a grammar consistent with a training set of positive examples. Usually, the target language is infinite, while the training set is finite. Some work assumes that both positive and negative examples are available, but negative examples are rarely available in real applications. Sometimes probability information is attached to each example. In this case, it is possible to learn a probability distribution over the strings in the language in addition to the grammar. This is sometimes called stochastic grammar inference.
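As an illustration of what "consistent with a training set of positive examples" can mean, the sketch below builds a prefix tree acceptor, one of the simplest automata that accepts exactly the training strings. It is a minimal sketch and not a method taken from this entry: the function names (`build_pta`, `accepts`, `probability`) and the sample data are hypothetical, and the counts attached to each example stand in for the probability information mentioned above.

```python
from collections import defaultdict

def build_pta(samples):
    """Build a prefix tree acceptor from (string, count) pairs.

    Hypothetical helper for illustration; state 0 is the root (empty prefix).
    """
    transitions = {}            # (state, symbol) -> next state
    visits = defaultdict(int)   # state -> weighted number of examples passing through
    finals = defaultdict(int)   # state -> weighted number of examples ending here
    next_state = 1
    for string, count in samples:
        state = 0
        visits[state] += count
        for symbol in string:
            key = (state, symbol)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
            visits[state] += count
        finals[state] += count
    return transitions, visits, finals

def accepts(pta, string):
    """Return True if the string is one of the training examples."""
    transitions, _, finals = pta
    state = 0
    for symbol in string:
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return finals.get(state, 0) > 0

def probability(pta, string):
    """Estimate the string's probability from relative frequencies along its path."""
    transitions, visits, finals = pta
    state, p = 0, 1.0
    for symbol in string:
        nxt = transitions.get((state, symbol))
        if nxt is None:
            return 0.0
        p *= visits[nxt] / visits[state]   # P(symbol | state)
        state = nxt
    return p * finals.get(state, 0) / visits[state]   # P(stop | state)

# Assumed toy data: positive examples only, each with an observed count.
samples = [("ab", 3), ("abab", 1)]
pta = build_pta(samples)
```

With this data, `accepts(pta, "ab")` holds but `accepts(pta, "ababab")` does not: the acceptor is consistent with the finite training set yet does not generalize to an infinite target language such as repetitions of `ab`. Inference algorithms typically obtain that generalization by further steps, for example by merging states of such a tree.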
A grammar inference algorithm must target a particular grammar representation. More expressive...