Abstract
We consider the problem of learning a finite automaton from positive evidence with a recurrent neural network. We train an Elman recurrent neural network on a set of sentences from a language and extract a finite automaton by clustering the hidden states of the trained network. We observe that the generalizations beyond the training set in the language recognized by the extracted automaton are due to the training regime: the network performs a "loose" minimization of the prefix DFA of the training set, the automaton that has one state for each prefix of the sentences in the set.
This research is supported by DARPA/AFOSR and DARPA under contracts No. F49620-97-1-0485 and No. N66001-96-C-8504. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation hereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.
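As a concrete illustration of the two constructions the abstract names, the following minimal Python sketch builds the prefix DFA of a training set and then extracts a smaller automaton by clustering the vector reached after each prefix. It is not the authors' code: the encode stand-in for the trained Elman network's hidden state, the toy language, and the cluster count k are assumptions made only for illustration.

    import numpy as np

    def prefix_dfa(sentences):
        # One state per distinct prefix of the training sentences; the empty
        # prefix is the start state and full sentences are accepting.
        prefixes = {()}
        delta = {}                      # (prefix, symbol) -> extended prefix
        accepting = set()
        for s in sentences:
            p = ()
            for sym in s:
                q = p + (sym,)
                prefixes.add(q)
                delta[(p, sym)] = q
                p = q
            accepting.add(p)
        return sorted(prefixes, key=len), delta, accepting

    def kmeans(vectors, k, iters=100, seed=0):
        # Plain k-means; a stand-in for whatever clustering is applied to
        # the network's hidden-state vectors.
        rng = np.random.default_rng(seed)
        centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
        for _ in range(iters):
            dist = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
            labels = dist.argmin(axis=1)
            for j in range(k):
                if (labels == j).any():
                    centroids[j] = vectors[labels == j].mean(axis=0)
        return labels

    def extract_dfa(sentences, encode, k):
        # Clusters of hidden states become the states of the extracted DFA,
        # merging prefix-DFA states and hence generalizing beyond the
        # training set.
        prefixes, delta, accepting = prefix_dfa(sentences)
        labels = kmeans(np.stack([encode(p) for p in prefixes]), k)
        cluster = dict(zip(prefixes, labels))
        merged = {}
        for (p, sym), q in delta.items():
            # Conflicting cluster transitions would make the result
            # nondeterministic; a full extraction must resolve or report them.
            merged.setdefault((int(cluster[p]), sym), int(cluster[q]))
        return int(cluster[()]), merged, {int(cluster[p]) for p in accepting}

    # Toy run with a hypothetical encoder in place of the trained network:
    sentences = [("a", "b"), ("a", "a", "b"), ("a", "a", "a", "b")]
    encode = lambda p: np.array([len(p) % 2, float(p[-1:] == ("b",))])
    print(extract_dfa(sentences, encode, k=3))

In this sketch, choosing k equal to the number of distinct hidden vectors reproduces the prefix DFA itself; shrinking k forces state merges, which is where the generalization beyond the training set described in the abstract comes from.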
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Firoiu, L., Oates, T., Cohen, P.R. (1998). Learning a deterministic finite automaton with a recurrent neural network. In: Honavar, V., Slutzki, G. (eds) Grammatical Inference. ICGI 1998. Lecture Notes in Computer Science, vol 1433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054067
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64776-8
Online ISBN: 978-3-540-68707-8