
Program Search for Machine Learning Pipelines Leveraging Symbolic Planning and Reinforcement Learning

  • Chapter

Part of the book: Genetic Programming Theory and Practice XVI

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

In this paper we investigate an alternative knowledge representation and learning strategy for the automated machine learning (AutoML) task. Our approach combines a symbolic planner with reinforcement learning to evolve programs that process data and train machine learning classifiers. The planner, which generates all feasible plans from the initial state to the goal state, gives preference first to the shortest programs and later to those that maximize reward. The results demonstrate the efficacy of the approach for finding good machine learning pipelines, while also showing that the representation can be used to infer new knowledge relevant to the problem instances being solved. These insights can inform other automatic programming approaches, such as genetic programming (GP) and Bayesian-optimization-based pipeline learning, with respect to representation and learning strategies.
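To make the search strategy described in the abstract concrete, below is a minimal Python sketch of shortest-first plan enumeration followed by reward-based selection. It is an illustration only, not the authors' implementation: the paper uses an answer-set-programming planner (see Note 1) combined with reinforcement learning, whereas this sketch substitutes a toy breadth-first planner and a simulated reward, and the action names, state labels, and reward values are hypothetical stand-ins.

```python
# A minimal sketch (not the authors' code) of the strategy in the abstract:
# enumerate feasible pipeline plans shortest-first, then let a reward signal
# choose among them.
from collections import deque
import random

# Hypothetical pipeline domain: each action maps a symbolic precondition
# state to a postcondition state.
ACTIONS = {
    "load_data": ("start",    "raw"),
    "clean":     ("raw",      "clean"),
    "tfidf":     ("clean",    "features"),
    "count_vec": ("clean",    "features"),
    "raw_tfidf": ("raw",      "features"),  # skips the cleaning step
    "train_svm": ("features", "model"),
    "train_nb":  ("features", "model"),
}

def plans(start="start", goal="model"):
    """Breadth-first enumeration: shortest plans are yielded first."""
    queue = deque([(start, [])])
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            yield plan
            continue
        for action, (pre, post) in ACTIONS.items():
            if pre == state and action not in plan:
                queue.append((post, plan + [action]))

def reward(plan, rng):
    """Stand-in for the validation accuracy of the trained pipeline."""
    base = 0.70 + 0.05 * ("clean" in plan) + 0.08 * ("train_svm" in plan)
    return base + rng.gauss(0, 0.02)  # noisy, like a real evaluation

rng = random.Random(0)
# Evaluate each feasible plan a few times and keep the best average reward.
best = max(plans(), key=lambda p: sum(reward(p, rng) for _ in range(5)) / 5)
print("best pipeline:", " -> ".join(best))
```

Breadth-first enumeration guarantees that shorter pipelines are considered first; the averaged, noisy reward stands in for the cross-validated accuracy that would drive the reinforcement-learning component in the full system.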


Notes

  1. http://potassco.sourceforge.net/.

  2. Tildes are used to distinguish actions in MDP space from actions in symbolic planning space.

  3. http://www.cs.cornell.edu/people/pabo/movie-review-data/.


Author information

Correspondence to Steven Gustafson.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Yang, F., Gustafson, S., Elkholy, A., Lyu, D., Liu, B. (2019). Program Search for Machine Learning Pipelines Leveraging Symbolic Planning and Reinforcement Learning. In: Banzhaf, W., Spector, L., Sheneman, L. (eds) Genetic Programming Theory and Practice XVI. Genetic and Evolutionary Computation. Springer, Cham. https://doi.org/10.1007/978-3-030-04735-1_11


  • DOI: https://doi.org/10.1007/978-3-030-04735-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04734-4

  • Online ISBN: 978-3-030-04735-1

  • eBook Packages: Computer Science, Computer Science (R0)
