
Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 154)

Abstract

A higher-order recurrent neural network architecture learns to recognize and generate languages after being “trained” on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries. First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: induction by phase transition. A small weight adjustment causes a “bifurcation” in the limit behavior of the network, and this phase transition corresponds to the onset of the network’s capacity for generalizing to arbitrary-length strings. Second, a study of the automata resulting from the acquisition of previously published training sets indicates that, while the architecture is not guaranteed to find a minimal finite automaton consistent with the given exemplars (an NP-hard problem), it does appear capable of generating non-regular languages by exploiting fractal and chaotic dynamics. I end the paper with a hypothesis relating linguistic generative capacity to the behavioral regimes of non-linear dynamical systems.
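
To make the architecture concrete: a dynamical recognizer iterates a state-transition map once per input symbol and reads an accept/reject decision off the final state. The sketch below is a minimal illustration under assumed details (a second-order formulation with one transition matrix per input symbol, an arbitrary state dimension, untrained random weights, and a single designated decision unit); it is not the exact network studied in the chapter.

```python
# Minimal sketch of a dynamical recognizer (assumed details, not the
# chapter's exact model): a second-order recurrent net with one
# state-transition weight matrix per input symbol.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DynamicalRecognizer:
    def __init__(self, n_symbols, n_state, seed=0):
        rng = np.random.default_rng(seed)
        # One transition matrix per symbol: the "higher-order"
        # (multiplicative) coupling of input and previous state.
        self.W = rng.normal(0.0, 1.0, (n_symbols, n_state, n_state))
        self.b = rng.normal(0.0, 0.1, (n_symbols, n_state))
        self.z0 = np.full(n_state, 0.5)  # fixed initial state

    def final_state(self, symbols):
        z = self.z0
        for s in symbols:  # iterate the map once per input symbol
            z = sigmoid(self.W[s] @ z + self.b[s])
        return z

    def accepts(self, symbols, threshold=0.5):
        # Accept/reject read off one designated decision unit.
        return self.final_state(symbols)[0] > threshold

# Usage: strings are sequences of symbol indices over, e.g., {0, 1}.
r = DynamicalRecognizer(n_symbols=2, n_state=4)
print(r.accepts([0, 1, 1, 0]))  # untrained weights: an arbitrary decision
```

The “induction by phase transition” finding concerns limit behavior. As a purely illustrative one-unit analogue (again an assumption, not the chapter’s network), nudging a gain weight past a critical value changes the attractor of the iterated map from a fixed point to a period-2 oscillation:

```python
# One-unit sigmoid map z <- sigmoid(w * (z - 0.5)): the fixed point at
# z = 0.5 has slope w/4, so its stability is lost once w drops below -4
# and a period-2 oscillation appears.
import numpy as np

def step(z, w):
    return 1.0 / (1.0 + np.exp(-w * (z - 0.5)))

for w in (-3.0, -6.0):
    z = 0.3
    for _ in range(500):  # discard the transient
        z = step(z, w)
    tail = []
    for _ in range(4):
        z = step(z, w)
        tail.append(round(z, 4))
    print(w, tail)  # w=-3.0: settles at 0.5; w=-6.0: alternates (period 2)
```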




Copyright information

© 1991 Springer Science+Business Media New York

About this chapter

Cite this chapter

Pollack, J.B. (1991). The Induction of Dynamical Recognizers. In: Touretzky, D. (Ed.), Connectionist Approaches to Language Learning. The Springer International Series in Engineering and Computer Science, vol 154. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-4008-3_6


  • DOI: https://doi.org/10.1007/978-1-4615-4008-3_6

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-6792-5

  • Online ISBN: 978-1-4615-4008-3

  • eBook Packages: Springer Book Archive
