Anticipatory Learning Classifier Systems and Factored Reinforcement Learning

  • Conference paper
Anticipatory Behavior in Adaptive Learning Systems (ABiALS 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5499)

Abstract

Factored Reinforcement Learning (FRL) is a new technique to solve Factored Markov Decision Problems (FMDPs) when the structure of the problem is not known in advance. Like Anticipatory Learning Classifier Systems (ALCSs), it is a model-based Reinforcement Learning approach that includes generalization mechanisms in the presence of a structured domain. In general, FRL and ALCSs are explicit, state-anticipatory approaches that learn generalized state transition models to improve system behavior based on model-based reinforcement learning techniques. In this contribution, we highlight the conceptual similarities and differences between FRL and ALCSs, focusing on SPITI, an instance of an FRL method, on the one hand, and on the ALCSs MACS and XACS on the other. Though FRL systems seem to benefit from a clearer theoretical grounding, an empirical comparison between SPITI and XACS on two benchmark problems reveals that the latter scales much better than the former when some combinations of state variables do not occur. Based on this finding, we discuss the mechanisms in XACS that result in the better scalability and propose importing these mechanisms into FRL systems.
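As a rough illustration of what a "generalized state transition model" can look like in an ALCS, the sketch below uses condition-action-effect rules in which `#` means "don't care" in the condition and "unchanged" in the effect, in the style of ACS-type systems. The three-variable toy state, the rule set, and the names `Rule` and `predict` are hypothetical and are not taken from the paper. The sketch covers only prediction with already-learned rules, not the learning mechanisms of MACS, XACS, or SPITI.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

WILDCARD = "#"  # condition: "don't care"; effect: "variable stays unchanged"


@dataclass(frozen=True)
class Rule:
    """Hypothetical condition-action-effect rule over a string-encoded state."""
    condition: str
    action: str
    effect: str

    def matches(self, state: str, action: str) -> bool:
        # A rule applies if the action agrees and every non-wildcard
        # condition symbol equals the corresponding state variable.
        return (
            action == self.action
            and len(state) == len(self.condition)
            and all(c in (WILDCARD, s) for c, s in zip(self.condition, state))
        )

    def anticipate(self, state: str) -> str:
        # Wildcards in the effect copy the current value; other symbols
        # overwrite it with the anticipated value.
        return "".join(s if e == WILDCARD else e for e, s in zip(self.effect, state))


def predict(rules: Iterable[Rule], state: str, action: str) -> Optional[str]:
    """Return the next state anticipated by the first matching rule, if any."""
    for rule in rules:
        if rule.matches(state, action):
            return rule.anticipate(state)
    return None


# Toy model (assumed): action "set0" sets the first variable to 1,
# whatever the values of the other two variables are.
rules = [
    Rule(condition="0##", action="set0", effect="1##"),
    Rule(condition="1##", action="set0", effect="###"),
]

next_state = predict(rules, "010", "set0")
unchanged_state = predict(rules, "101", "set0")
unknown = predict(rules, "010", "noop")
```

Two rules cover all eight states for this action, including states that were never visited. That is the kind of generalization the abstract attributes to both ALCSs and FRL.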

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sigaud, O., Butz, M.V., Kozlova, O., Meyer, C. (2009). Anticipatory Learning Classifier Systems and Factored Reinforcement Learning. In: Pezzulo, G., Butz, M.V., Sigaud, O., Baldassarre, G. (eds) Anticipatory Behavior in Adaptive Learning Systems. ABiALS 2008. Lecture Notes in Computer Science, vol 5499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02565-5_18

  • DOI: https://doi.org/10.1007/978-3-642-02565-5_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02564-8

  • Online ISBN: 978-3-642-02565-5

  • eBook Packages: Computer Science (R0)
