A Learning Classifier System Approach to Relational Reinforcement Learning

  • Drew Mellor
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4998)

Abstract

This article describes a learning classifier system (LCS) approach to relational reinforcement learning (RRL). The system, Foxcs-2, is a derivative of Xcs that learns rules expressed as definite clauses over first-order logic. By adopting the LCS approach, Foxcs-2, unlike many RRL systems, is a general, model-free and “tabula rasa” system. The change in representation from bit-strings in Xcs to first-order logic in Foxcs-2 necessitates modifications, described within, to support matching, covering, mutation and several other functions. Evaluation on inductive logic programming (ILP) and RRL tasks shows that the performance of Foxcs-2 is comparable to other systems. Further evaluation on RRL tasks highlights a significant advantage of Foxcs-2’s rule language: in some environments it is able to represent policies that are genuinely scalable; that is, policies that are independent of the size of the environment.
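The claim about size-independent policies can be illustrated with a small sketch of first-order rule matching. This is not Foxcs-2's implementation: the blocks-world predicates (`clear`, `on`, `on_floor`), the example rule, and the helper names below are all assumptions made for illustration. It only shows how a single rule containing variables could match states with different numbers of objects.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Atom = Tuple[str, ...]        # (predicate, arg1, arg2, ...); hypothetical encoding
Bindings = Dict[str, str]     # variable name -> constant


def is_var(term: str) -> bool:
    # Prolog convention: variables start with an uppercase letter.
    return term[:1].isupper()


def unify(pattern: Atom, fact: Atom, env: Bindings) -> Optional[Bindings]:
    """Extend env so that pattern equals the ground fact, or return None."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    env = dict(env)  # copy, so failed attempts do not leak bindings
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env


def solutions(body: List[Atom], facts: Set[Atom], env: Bindings) -> Iterator[Bindings]:
    """Yield every binding that satisfies all body atoms (naive backtracking)."""
    if not body:
        yield env
        return
    for fact in facts:
        extended = unify(body[0], fact, env)
        if extended is not None:
            yield from solutions(body[1:], facts, extended)


def matching_actions(head: Atom, body: List[Atom], facts: Set[Atom]) -> Set[Atom]:
    """Ground instances of the rule's action that the state supports."""
    return {tuple(env.get(t, t) for t in head) for env in solutions(body, facts, {})}


# Hypothetical rule: move(X, floor) :- clear(X), on(X, Y).
HEAD: Atom = ("move", "X", "floor")
BODY: List[Atom] = [("clear", "X"), ("on", "X", "Y")]

# Two assumed states of different sizes, described as sets of ground facts.
small_state: Set[Atom] = {("on", "a", "b"), ("on_floor", "b"), ("clear", "a")}
large_state: Set[Atom] = {
    ("on", "a", "b"), ("on", "b", "c"), ("on_floor", "c"),
    ("on", "d", "e"), ("on_floor", "e"),
    ("clear", "a"), ("clear", "d"),
}

small_actions = matching_actions(HEAD, BODY, small_state)
large_actions = matching_actions(HEAD, BODY, large_state)
```

Because the rule refers to blocks through the variables `X` and `Y` rather than by name, the same rule applies unchanged to the two-block and the five-block state. A bit-string condition, by contrast, would typically be tied to a fixed-length encoding of the state.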

Keywords

Markov Decision Process; Predicate Symbol; Inductive Logic Programming; Rule Discovery; Rule Language
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Drew Mellor (1)
  1. School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, Australia