Methodology

Gatti, Christopher

doi:10.1007/978-3-319-12197-0_4

Part of the book series: Springer Theses

Abstract

The goal of this work is to investigate the parameter conditions under which reinforcement learning works and, furthermore, how these parameters affect performance. We therefore break the problem into two parts. The first part attempts to find subregions, within a large parameter space, in which reinforcement learning is generally successful; we call these convergent subregions of the parameter space, that is, regions in which reinforcement learning runs frequently converge. The second part takes a closer look at these convergent subregions and attempts to understand how the parameters affect learning performance and which parameters are most influential. The problem domains analyzed later in this work use very similar experimental methodologies and analysis procedures, so instead of repeating the methodology for each problem domain, we present the common methods in this chapter.

Author information

Authors and Affiliations

Industrial and Systems Engineering, Rensselaer Polytechnic Institute, Troy, New York, USA
Christopher Gatti

Corresponding author

Correspondence to Christopher Gatti.

About this chapter

Cite this chapter

Gatti, C. (2015). Methodology. In: Design of Experiments for Reinforcement Learning. Springer Theses. Springer, Cham. https://doi.org/10.1007/978-3-319-12197-0_4

DOI: https://doi.org/10.1007/978-3-319-12197-0_4
Published: 23 November 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12196-3
Online ISBN: 978-3-319-12197-0
eBook Packages: Engineering, Engineering (R0)
