Algorithm Selection as a Bandit Problem with Unbounded Losses

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6073)

Abstract

Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In recent work, we adopted an online approach, in which a performance model is iteratively updated and used to guide selection on a sequence of problem instances. The resulting exploration-exploitation trade-off was represented as a bandit problem with expert advice, using an existing solver for this game, but this required the setting of an arbitrary bound on algorithm runtimes, thus invalidating the optimal regret of the solver. In this paper, we propose a simpler framework for representing algorithm selection as a bandit problem, with partial information, and an unknown bound on losses. We adapt an existing solver to this game, proving a bound on its expected regret, which holds also for the resulting algorithm selection technique. We present experiments with a set of SAT solvers on a mixed SAT-UNSAT benchmark.
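To make the setting concrete, the following is a minimal sketch of algorithm selection as a partial-information bandit whose losses (here, runtimes) have no bound known in advance. It is a generic Exp3-style learner, not the solver analysed in the paper, and it carries no regret guarantee. The class name `Exp3UnknownBound`, the parameters `eta` and `gamma`, the doubling-and-restart rule for the loss bound, and the two simulated solvers with exponentially distributed runtimes are all assumptions made for illustration.

```python
import math
import random


class Exp3UnknownBound:
    """Exp3-style bandit over algorithms, with runtimes as losses.

    Assumption (not from the paper): the unknown loss bound is replaced by
    a running guess that doubles, with a restart, whenever it is exceeded.
    """

    def __init__(self, n_arms, eta=0.1, gamma=0.1, seed=0):
        self.n_arms = n_arms
        self.eta = eta          # learning rate
        self.gamma = gamma      # uniform exploration rate
        self.rng = random.Random(seed)
        self.bound = 1.0        # current guess of the largest loss
        self.cum_loss = [0.0] * n_arms

    def probabilities(self):
        # Exponential weights on estimated cumulative losses, mixed with
        # uniform exploration; subtracting the minimum avoids underflow.
        low = min(self.cum_loss)
        w = [math.exp(-self.eta * (c - low)) for c in self.cum_loss]
        total = sum(w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.n_arms
                for wi in w]

    def select(self):
        p = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=p)[0]
        return arm, p[arm]

    def update(self, arm, prob, loss):
        if loss > self.bound:
            # The guess was too small: double it until it covers the loss,
            # then restart the loss estimates for the new epoch.
            while self.bound < loss:
                self.bound *= 2
            self.cum_loss = [0.0] * self.n_arms
        # Only the chosen arm's loss is observed (partial information), so
        # it is importance-weighted by the probability of choosing it.
        self.cum_loss[arm] += (loss / self.bound) / prob


def run_demo(rounds=2000, seed=0):
    """Select between two simulated solvers with mean runtimes 1 and 10."""
    sim = random.Random(seed + 1)
    rates = [1.0, 0.1]  # expovariate takes a rate, i.e. 1 / mean runtime
    bandit = Exp3UnknownBound(n_arms=2, seed=seed)
    pulls = [0, 0]
    for _ in range(rounds):
        arm, prob = bandit.select()
        runtime = sim.expovariate(rates[arm])
        bandit.update(arm, prob, runtime)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(run_demo())
```

Each round corresponds to one problem instance: a solver is drawn, run to completion, and only its own runtime is fed back. The restart on doubling is one simple way to cope with an unknown bound; it trades some accumulated knowledge for keeping the rescaled losses within [0, 1].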


Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gagliolo, M., Schmidhuber, J. (2010). Algorithm Selection as a Bandit Problem with Unbounded Losses. In: Blum, C., Battiti, R. (eds) Learning and Intelligent Optimization. LION 2010. Lecture Notes in Computer Science, vol 6073. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13800-3_7

  • DOI: https://doi.org/10.1007/978-3-642-13800-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13799-0

  • Online ISBN: 978-3-642-13800-3

  • eBook Packages: Computer Science (R0)
