Cooperativity in Networks of Pattern Recognizing Stochastic Learning Automata

  • Chapter
Adaptive and Learning Systems

Abstract

A class of learning tasks is described that combines aspects of learning automaton tasks and supervised learning pattern-classification tasks. We call these associative reinforcement learning tasks. An algorithm is presented, called the associative reward-penalty, or A_{R-P}, algorithm, for which a form of optimal performance has been proved. This algorithm simultaneously generalizes a class of stochastic learning automata and a class of supervised learning pattern-classification methods. Simulation results are presented that illustrate the associative reinforcement learning task and the performance of the A_{R-P} algorithm. Additional simulation results are presented showing how cooperative activity in networks of interconnected A_{R-P} automata can solve difficult nonlinear associative learning problems.
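
The abstract does not state the update rule, so the following Python sketch is an illustration only. It assumes the form commonly given for an associative reward-penalty rule with binary actions: the automaton picks action 1 with a probability p that depends on the context vector, moves its weights toward the action taken after a reward, and moves them (at a much smaller rate) toward the opposite action after a penalty. The two-context task, the reward probabilities, the logistic action probability, the parameter values `rho` and `lam`, and all function names are assumptions made for this example and are not taken from the chapter.

```python
import math
import random

# Hypothetical task: two context vectors and a reward probability
# for each (context index, action) pair. Not from the chapter.
CONTEXTS = [(1.0, 0.0), (0.0, 1.0)]
REWARD_PROB = {
    (0, 0): 0.2, (0, 1): 0.9,  # in context 0, action 1 is better
    (1, 0): 0.8, (1, 1): 0.1,  # in context 1, action 0 is better
}


def action_probability(theta, x):
    """Probability of choosing action 1 in context x (logistic form assumed)."""
    s = sum(t * xi for t, xi in zip(theta, x))
    s = max(-30.0, min(30.0, s))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-s))


def arp_step(theta, x, context_id, rng, rho=0.1, lam=0.05):
    """One reward-penalty update; returns the new weight vector."""
    p = action_probability(theta, x)
    action = 1 if rng.random() < p else 0
    rewarded = rng.random() < REWARD_PROB[(context_id, action)]
    if rewarded:
        # Reward: move the action probability toward the action taken.
        target, rate = action, rho
    else:
        # Penalty: move it toward the other action, scaled down by lam.
        target, rate = 1 - action, lam * rho
    return [t + rate * (target - p) * xi for t, xi in zip(theta, x)]


def train(steps=5000, seed=0):
    """Run the automaton on randomly presented contexts."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(steps):
        k = rng.randrange(len(CONTEXTS))
        theta = arp_step(theta, CONTEXTS[k], k, rng)
    return theta


if __name__ == "__main__":
    theta = train()
    for k, x in enumerate(CONTEXTS):
        print(k, round(action_probability(theta, x), 3))
```

In this sketch the automaton never sees which action was correct, only whether the action it tried was rewarded, which is what distinguishes the associative reinforcement setting from supervised pattern classification. Setting `lam` to zero would give a reward-only variant of the same rule.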

Copyright information

© 1986 Springer Science+Business Media New York

About this chapter

Cite this chapter

Barto, A.G., Anandan, P., Anderson, C.W. (1986). Cooperativity in Networks of Pattern Recognizing Stochastic Learning Automata. In: Narendra, K.S. (ed.) Adaptive and Learning Systems. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-1895-9_16

  • DOI: https://doi.org/10.1007/978-1-4757-1895-9_16

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4757-1897-3

  • Online ISBN: 978-1-4757-1895-9

  • eBook Packages: Springer Book Archive
