Adaptive control of Markov chains
Consider a controlled Markov chain whose transition probabilities are parameterized by an unknown parameter α, known to lie in a finite set A. To each α is associated a prespecified control law φ(α). At each time t, the adaptive controller selects the control action indicated by φ(αt), where αt is the maximum likelihood estimate of α. The asymptotic behavior of αt is studied.
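The scheme described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the construction used in the paper: the two-state chain, the function `transition_matrix`, the parameter set `A = [0.3, 0.5, 0.7]`, the binary control, and the control law `phi` are all hypothetical choices made for illustration. At each step the controller applies φ(αt), observes the next state, updates the log-likelihood of every candidate in A, and takes the maximizer as the new estimate.

```python
import math
import random


def transition_matrix(alpha, u):
    """Hypothetical two-state model (an assumption, not from the paper).

    The probability of staying in the current state is alpha + 0.2 * u,
    where u is a binary control in {0, 1}.
    """
    p = alpha + 0.2 * u
    return [[p, 1.0 - p], [1.0 - p, p]]


def phi(alpha):
    """Hypothetical prespecified control law: one action per parameter value."""
    return 1 if alpha < 0.5 else 0


def run_adaptive_controller(true_alpha, A, control_law, steps, seed=0):
    """Simulate the closed loop and return the sequence of ML estimates."""
    rng = random.Random(seed)
    log_lik = {a: 0.0 for a in A}   # running log-likelihood of each candidate
    x = 0                           # initial state
    estimate = A[0]                 # arbitrary initial estimate
    estimates = []
    for _ in range(steps):
        # Apply the control indicated by the current estimate.
        u = control_law(estimate)
        # The chain evolves according to the true parameter.
        row = transition_matrix(true_alpha, u)[x]
        x_next = 0 if rng.random() < row[0] else 1
        # Update each candidate's likelihood with the observed transition.
        for a in A:
            log_lik[a] += math.log(transition_matrix(a, u)[x][x_next])
        # Maximum likelihood estimate over the finite set A
        # (ties are broken in favor of the earlier element of A).
        estimate = max(A, key=lambda a: log_lik[a])
        estimates.append(estimate)
        x = x_next
    return estimates
```

Calling `run_adaptive_controller(0.7, [0.3, 0.5, 0.7], phi, 200)` returns the trajectory of estimates αt, whose long-run behavior can then be inspected. Note that in this toy model the candidates 0.3 and 0.5 induce the same closed-loop transition probabilities under their own control laws, so the data collected in closed loop need not distinguish every pair of parameters.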