Abstract
We consider an adaptive finite-state controlled Markov chain with partial state information, motivated by a class of replacement problems. We present parameter estimation techniques based on the information available after actions that reset the state to a known value are taken. We prove that the parameter estimates converge w.p.1 to the true (unknown) parameter, under the feedback structure induced by a certainty-equivalent adaptive policy. We also show that the adaptive policy is self-optimizing, in a long-run average sense, for any (measurable) sequence of parameter estimates converging w.p.1 to the true parameter.
This work was supported in part by the Texas Advanced Technology Program under Grant No. 003658-093, in part by the Air Force Office of Scientific Research under Grants AFOSR-91-0033, F49620-92-J-0045, and F49620-92-J-0083, and in part by the National Science Foundation under Grant CDR-8803012.
Copyright information
© 1992 Springer-Verlag
Cite this paper
Fernández-Gaucherand, E., Arapostathis, A., Marcus, S.I. (1992). Adaptive control of a partially observed controlled Markov chain. In: Duncan, T.E., Pasik-Duncan, B. (eds) Stochastic Theory and Adaptive Control. Lecture Notes in Control and Information Sciences, vol 184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0113238
DOI: https://doi.org/10.1007/BFb0113238
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55962-7
Online ISBN: 978-3-540-47327-5