Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty
This paper formulates partially observable Markov decision processes in which the state-transition probabilities and measurement-outcome probabilities are characterized by unknown parameters. An information-theoretic solution method that adaptively manages the resulting exploitation-exploration trade-off is proposed. Numerical experiments on response-guided dosing in healthcare are presented.
This research was funded in part by the National Science Foundation via grant CMMI #1536717.