Part of the book series: Springer Theses

Abstract

The goal of this work is to investigate the parameter conditions under which reinforcement learning works and how these parameters affect performance. We therefore break the problem into two parts. The first part attempts to find subregions, within a large parameter space, in which reinforcement learning is generally successful; we call these convergent subregions of the parameter space, meaning regions in which reinforcement learning runs frequently converge. The second part takes a closer look at these convergent subregions and attempts to understand how the parameters affect learning performance and which parameters are the most influential. The problem domains analyzed later in this work use very similar experimental methodologies and analysis procedures; rather than repeating the methodology for each problem domain, we present the methods in this chapter.

Author information

Corresponding author

Correspondence to Christopher Gatti.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gatti, C. (2015). Methodology. In: Design of Experiments for Reinforcement Learning. Springer Theses. Springer, Cham. https://doi.org/10.1007/978-3-319-12197-0_4

  • DOI: https://doi.org/10.1007/978-3-319-12197-0_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12196-3

  • Online ISBN: 978-3-319-12197-0

  • eBook Packages: Engineering, Engineering (R0)
