
Adaptive control of a partially observed controlled Markov chain

  • Conference paper
  • In: Stochastic Theory and Adaptive Control

Abstract

We consider an adaptive finite-state controlled Markov chain with partial state information, motivated by a class of replacement problems. We present parameter estimation techniques based on the information available after actions that reset the state to a known value are taken. We prove that the parameter estimates converge w.p.1 to the true (unknown) parameter, under the feedback structure induced by a certainty equivalent adaptive policy. We also show that the adaptive policy is self-optimizing, in a long-run average sense, for any (measurable) sequence of parameter estimates converging w.p.1 to the true parameter.

This work was supported in part by the Texas Advanced Technology Program under Grant No. 003658-093, in part by the Air Force Office of Scientific Research under Grants AFOSR-91-0033, F49620-92-J-0045, and F49620-92-J-0083, and in part by the National Science Foundation under Grant CDR-8803012.
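The scheme outlined in the abstract can be illustrated with a toy replacement model. The sketch below is not the paper's estimator; it is a minimal certainty-equivalence loop under assumed names and dynamics: a machine wears out with unknown probability `theta` per step, every replacement resets it to a known good state, so observed lifetimes between resets are i.i.d. geometric and yield a maximum-likelihood estimate; the controller then acts as if the estimate were the true parameter. All function names (`belief_update`, `mle_theta`, `simulate_cycle`) and the belief-threshold rule are hypothetical.

```python
import random


def belief_update(b, theta):
    """One-step propagation of the belief that the machine is worn:
    a good machine wears out with probability theta; a worn one stays worn."""
    return b + (1.0 - b) * theta


def mle_theta(cycle_lengths):
    """Maximum-likelihood estimate of the wear probability from completed
    reset cycles.  Because each replacement resets the state to a known
    (good) value, the observed lifetimes are i.i.d. Geometric(theta),
    and the MLE is (number of cycles) / (total steps observed)."""
    return len(cycle_lengths) / sum(cycle_lengths)


def simulate_cycle(theta, rng):
    """Run one machine from the known post-replacement state until it
    wears out; return its geometric lifetime in steps."""
    t = 1
    while rng.random() >= theta:
        t += 1
    return t


if __name__ == "__main__":
    rng = random.Random(0)
    true_theta = 0.2  # unknown to the controller
    lengths = [simulate_cycle(true_theta, rng) for _ in range(2000)]
    theta_hat = mle_theta(lengths)

    # Certainty equivalence: compute the replacement rule as if theta_hat
    # were the true parameter, e.g. replace once the belief of being worn
    # crosses a (hypothetical) threshold of 0.5.
    b, steps = 0.0, 0
    while b < 0.5:
        b = belief_update(b, theta_hat)
        steps += 1
    print(f"theta_hat = {theta_hat:.3f}, replace after {steps} steps")
```

The key point mirrored from the abstract is that the resets make the estimation problem tractable despite partial observations: data gathered at reset times identifies the parameter, and the estimate-driven policy is then applied between resets.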




Editor information

T. E. Duncan, B. Pasik-Duncan (eds.)


Copyright information

© 1992 Springer-Verlag

About this paper

Cite this paper

Fernández-Gaucherand, E., Arapostathis, A., Marcus, S.I. (1992). Adaptive control of a partially observed controlled Markov chain. In: Duncan, T.E., Pasik-Duncan, B. (eds) Stochastic Theory and Adaptive Control. Lecture Notes in Control and Information Sciences, vol 184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0113238



  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-55962-7

  • Online ISBN: 978-3-540-47327-5

