The New Palgrave Dictionary of Economics

2018 Edition
Editors: Macmillan Publishers Ltd

Dynamic Programming

  • John Rust
Reference work entry
DOI: https://doi.org/10.1057/978-1-349-95189-5_1932

Abstract

This article reviews the history and theory of dynamic programming (DP), a recursive method of solving sequential decision problems under uncertainty. It discusses computational algorithms for the numerical solution of DP problems, and an important limitation in our ability to solve realistic large-scale dynamic programming problems, the ‘curse of dimensionality’. It also summarizes recent research in complexity theory that delineates situations where the curse can be broken (allowing us to solve DPs using fast polynomial time algorithms), and situations where it is insuperable. The literature on econometric estimation and testing of DP models is reviewed, as is another ‘scientific limit to knowledge’, namely, the identification problem.
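As a minimal illustration of the recursive method described above, the sketch below solves a small finite-horizon sequential decision problem by backward induction. The two-state, two-action problem, the function name `backward_induction`, and all numerical values are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def backward_induction(rewards, transitions, horizon, discount):
    """Solve a finite-horizon Markov decision problem by backward induction.

    rewards[s][a]         -- one-period reward in state s under action a
    transitions[s][a][s2] -- probability of moving from s to s2 under action a
    Returns (values, policy): values[t][s] is the optimal expected discounted
    reward from period t onward; policy[t][s] is a maximizing action.
    """
    n_states = len(rewards)
    # Terminal condition: nothing is earned after the final period.
    values = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]
    # Work backward from the last decision period to the first.
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_action, best_value = 0, float("-inf")
            for a in range(len(rewards[s])):
                # Bellman recursion: current reward plus the discounted
                # expected value of continuing optimally next period.
                expected_next = sum(
                    p * values[t + 1][s2]
                    for s2, p in enumerate(transitions[s][a])
                )
                q = rewards[s][a] + discount * expected_next
                if q > best_value:
                    best_action, best_value = a, q
            values[t][s] = best_value
            policy[t][s] = best_action
    return values, policy


# Hypothetical example: state 0 = "good", state 1 = "bad";
# action 0 = "wait", action 1 = "repair".
REWARDS = [[1.0, 0.0], [0.0, -0.5]]
TRANSITIONS = [
    [[0.5, 0.5], [1.0, 0.0]],  # transitions out of state 0
    [[0.0, 1.0], [1.0, 0.0]],  # transitions out of state 1
]
values, policy = backward_induction(REWARDS, TRANSITIONS, horizon=3, discount=0.9)
```

The work in this sketch grows with the number of states times the number of actions per period. When the state is a vector discretized on a grid, the number of grid points grows exponentially in the number of state variables, which is the sense of the ‘curse of dimensionality’ mentioned above.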

Keywords

Backward induction; Bellman equation; Computational complexity; Computational experiments; Concavity; Continuous and discrete time models; Curse of dimensionality; Decision variables; Discount factor; Dynamic discrete choice models; Dynamic games; Dynamic programming; Econometric estimation; Euler equations; Game tree; Identification; Independence; Indirect inference; Infinite horizons; Kalman filtering; Kuhn–Tucker th; Markov chain Monte Carlo methods; Markovian decision problems; Maximum likelihood; Method of simulated moments; Method of simulated scores; Minimum residual method; Monotonicity; Monte Carlo integration; Neural networks; Nonlinear regression; Nonparametric regression; Optimal decision rules; Policy iteration; Principle of optimality; Rational expectations; Sequential decision problems; Simulated maximum likelihood; Simulated method of moments; Simulated minimum distance; Simulation-based estimation; State variables; Stationarity; Statistical decision theory; Structural estimation; Subgame perfection; Uncertainty; Wald, A

JEL Classification

C51 C61 

Acknowledgements

This article has benefited from helpful feedback from Kenneth Arrow, Daniel Benjamin, Larry Blume, Moshe Buchinsky, Larry Epstein, Chris Phelan and Arthur F. Veinott, Jr.

Copyright information

© Macmillan Publishers Ltd. 2018
