Q-learning and redundancy reduction in classifier systems with internal state

  • Antonella Giani
  • Andrea Sticca
  • Fabrizio Baiardi
  • Antonina Starita
Reinforcement Learning
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1398)


Q-Credit Assignment (QCA) is a Q-learning-based method for allocating credit to rules in Classifier Systems with internal state. It is more powerful than previously proposed methods because it correctly evaluates shared rules, but it incurs a large computational cost due to the Multi-Layer Perceptron (MLP) that stores the evaluation function. We present a method that reduces this cost by using feature extraction to remove redundancy from the input space of the MLP. The experimental results show that QCA with Redundancy Reduction (QCA-RR) preserves the advantages of QCA while significantly reducing both the learning time and the evaluation time after learning.
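The abstract does not specify which feature extractor performs the redundancy reduction; one standard online choice is Sanger's generalized Hebbian algorithm, which learns the top-k principal components of the input stream so the Q-value MLP can receive a k-dimensional projection instead of the full redundant input. The sketch below illustrates that idea only — the dimensions, learning rate, and the choice of PCA as the extractor are illustrative assumptions, not the paper's method.

```python
import numpy as np

def sanger_step(W, x, lr=0.01):
    """One online update of Sanger's generalized Hebbian algorithm.

    W : (k, d) array; rows converge to the top-k principal components.
    x : (d,) centered input vector.
    """
    y = W @ x  # project input onto the current components
    # Hebbian growth term minus a lower-triangular decorrelation term:
    # row i is pushed orthogonal to rows 0..i-1, yielding successive PCs.
    W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W

# Illustrative setup: 16-dimensional observations with only 3 underlying
# degrees of freedom, i.e. a highly redundant input space.
rng = np.random.default_rng(0)
d, k, intrinsic = 16, 3, 3
A = rng.normal(size=(d, intrinsic)) / np.sqrt(intrinsic)

W = rng.normal(scale=0.1, size=(k, d))
for _ in range(5000):
    x = A @ rng.normal(size=intrinsic)  # redundant observation
    W = sanger_step(W, x)

# An MLP storing the evaluation function would now take W @ x (k inputs)
# instead of x (d inputs), shrinking its first layer by a factor of d/k.
z = W @ (A @ rng.normal(size=intrinsic))
print(z.shape)
```

Because the update is online, the projection can be learned from the same stream of messages the classifier system already processes, with no separate batch pass over stored data.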





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Antonella Giani (1)
  • Andrea Sticca (1)
  • Fabrizio Baiardi (1)
  • Antonina Starita (1)
  1. Dip. di Informatica, Università di Pisa, Pisa, Italy
