Information bounds, certainty equivalence and learning in asymptotically efficient adaptive control of time-invariant stochastic systems

Lai, Tze Leung

doi:10.1007/BFb0009310

Part of the book series: Lecture Notes in Control and Information Sciences (LNCIS, volume 161)

References

P. R. Kumar, A survey of some results in stochastic adaptive control, SIAM J. Contr. Optimiz., 23 (1985), pp. 329–380.
D. Feldman, Contributions to the “two-armed bandit” problem, Ann. Math. Statist., 33 (1962), pp. 847–856.
J. Fabius and W. R. van Zwet, Some remarks on the two-armed bandit, Ann. Math. Statist., 41 (1970), pp. 1906–1916.
D. A. Berry, A Bernoulli two-armed bandit, Ann. Math. Statist., 43 (1972), pp. 871–897.
J. C. Gittins and D. M. Jones, A dynamic allocation index for the sequential design of experiments, in Progress in Statistics (J. Gani et al., Eds.), North Holland, Amsterdam, 1974, pp. 241–266.
J. C. Gittins, Bandit processes and dynamic allocation indices, J. Roy. Statist. Soc. Ser. B, 41 (1979), pp. 148–177.
P. Whittle, Multi-armed bandits and the Gittins index, J. Roy. Statist. Soc. Ser. B, 42 (1980), pp. 143–149.
F. Chang and T. L. Lai, Optimal stopping and dynamic allocation, Adv. Appl. Prob., 19 (1987), pp. 829–853.
T. L. Lai, Adaptive treatment allocation and the multi-armed bandit problem, Ann. Statist., 15 (1987), pp. 1091–1114.
H. Chernoff and S. N. Ray, A Bayes sequential sampling inspection plan, Ann. Math. Statist., 36 (1965), pp. 1387–1407.
H. Chernoff, Sequential models for clinical trials, Proc. Fifth Berkeley Symp. Math. Statist. Prob., 4 (1967), Univ. California Press, pp. 805–812.
H. Robbins, Some aspects of the sequential design of experiments, Bull. Amer. Math. Soc., 55 (1952), pp. 527–535.
T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Adv. Appl. Math., 6 (1985), pp. 4–22.
T. L. Lai, Asymptotic solutions of bandit problems, in Stochastic Differential Systems, Stochastic Control Theory and Applications (W. Fleming and P. L. Lions, Eds.), Springer-Verlag, New York-Berlin-Heidelberg, 1988, pp. 275–292.
V. Anantharam, P. Varaiya, and J. Walrand, Asymptotically efficient allocation rules for multi-armed bandit problems with multiple plays. Part I: I.I.D. Rewards. Part II: Markovian Rewards, IEEE Trans. Automat. Contr., AC-32 (1987), pp. 968–982.
R. Agrawal, M. Hegde, and D. Teneketzis, Asymptotically efficient rules for the multi-armed bandit problem with switching costs, IEEE Trans. Automat. Contr., AC-33 (1988), pp. 899–906.
A. Zellner, An Introduction to Bayesian Inference in Econometrics, Wiley, New York, 1971.
G. C. Chow, Analysis and Control of Dynamic Economic Systems, Wiley, New York, 1975.
E. C. Prescott, The multiperiod control problem under uncertainty, Econometrica, 40 (1972), pp. 1043–1058.
M. Aoki, On some price adjustment schemes, Ann. Econ. Soc. Measurement, 3 (1974), pp. 95–116.
T. W. Anderson and J. Taylor, Some experimental results on the statistical properties of least squares estimates in control problems, Econometrica, 44 (1976), pp. 1289–1302.
T. L. Lai and H. Robbins, Iterated least squares in multiperiod control, Adv. Appl. Math., 3 (1982), pp. 50–73.
T. L. Lai and H. Robbins, Adaptive design and the multiperiod control problem, in Statistical Decision Theory and Related Topics III, Vol. 2 (S. S. Gupta and J. O. Berger, Eds.), Academic Press, 1982, pp. 103–120.
T. L. Lai and H. Robbins, Adaptive design and stochastic approximation, Ann. Statist., 7 (1979), pp. 1196–1221.
H. Robbins and S. Monro, A stochastic approximation method, Ann. Math. Statist., 22 (1951), pp. 400–407.
C. Z. Wei, Multivariate adaptive stochastic approximation, Ann. Statist., 15 (1987), pp. 1115–1130.
T. L. Lai, Asymptotically efficient adaptive control in stochastic regression models, Adv. Appl. Math., 7 (1986), pp. 23–45.
A. A. Feldbaum, The theory of dual control I–IV, Automation and Remote Control, 21 (1961), pp. 874–883 (Part I) and pp. 1033–1039 (Part II), 22 (1962), pp. 1–12 (Part III) and pp. 109–121 (Part IV).
K. J. Åström, Theory and applications of adaptive control — A survey, Automatica-J. IFAC, 19 (1983), pp. 471–486.
K. J. Åström and B. Wittenmark, On self-tuning regulators, Automatica-J. IFAC, 9 (1973), pp. 195–199.
T. L. Lai and C. Z. Wei, Asymptotically efficient self-tuning regulators, SIAM J. Contr. Optimiz., 25 (1987), pp. 466–481.
T. L. Lai and C. Z. Wei, On the concept of excitation in least squares identification and adaptive control, Stochastics, 16 (1986), pp. 227–254.
V. Solo, The convergence of AML, IEEE Trans. Automat. Contr., AC-24 (1979), pp. 958–962.
G. C. Goodwin, P. J. Ramadge, and P. E. Caines, Discrete time stochastic adaptive control, SIAM J. Contr. Optimiz., 19 (1981), pp. 829–853.
G. C. Goodwin and K. S. Sin, Adaptive Filtering, Prediction and Control, Prentice-Hall, Englewood Cliffs, 1984.
T. L. Lai and C. Z. Wei, Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems, Ann. Statist., 10 (1982), pp. 154–166.
T. L. Lai and C. Z. Wei, Extended least squares and their applications to adaptive control and prediction in linear systems, IEEE Trans. Automat. Contr., AC-31 (1986), pp. 898–906.
T. L. Lai and Z. Ying, Parallel recursive algorithms in asymptotically efficient adaptive control of linear stochastic systems, SIAM J. Contr. Optimiz., 29 (1991), in press.
L. Ljung and T. Söderström, Theory and Practice of Recursive Identification, MIT Press, Cambridge, 1983.
P. E. Caines and S. Lafortune, Adaptive control with recursive identification for stochastic linear systems, IEEE Trans. Automat. Contr., AC-29 (1984), pp. 312–321.

Author information

Authors and Affiliations

Department of Statistics, Stanford University, 94305, Stanford, CA
Tze Leung Lai

Editor information

L. Gerencsér, P. E. Caines

About this chapter

Cite this chapter

Lai, T.L. (1991). Information bounds, certainty equivalence and learning in asymptotically efficient adaptive control of time-invariant stochastic systems. In: Gerencsér, L., Caines, P.E. (eds) Topics in Stochastic Systems: Modelling, Estimation and Adaptive Control. Lecture Notes in Control and Information Sciences, vol 161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0009310

DOI: https://doi.org/10.1007/BFb0009310
Published: 06 October 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-54133-2
Online ISBN: 978-3-540-47435-7
eBook Packages: Springer Book Archive
