Abstract
We modify the Q-MDP value method and observe the behavior of a robot using the modified method in an environment where the robot's state information is essentially indefinite. In the Q-MDP value method, the action at each time step is chosen by computing expectation values over a probability distribution, which is the output of a probabilistic state estimator. The modified method applies a weighting function to this probability distribution in the calculation so as to give precedence to states near the goal of the task. We applied our method to a simple robot navigation problem in an incomplete sensor environment. As a result, the method makes the robot exhibit a kind of searching behavior without any explicit implementation of search.
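The selection rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a discrete state space with a Q-value table `Q[s][a]`, a belief vector `belief[s]` from the state estimator, and a hypothetical `weights[s]` array standing in for the paper's goal-precedence weighting function.

```python
import numpy as np

def qmdp_action(belief, Q, weights=None):
    """Choose an action by the Q-MDP rule.

    belief : (n_states,) probability distribution from the state estimator
    Q      : (n_states, n_actions) state-action values
    weights: optional (n_states,) weighting function; in the modified
             method it would emphasize states near the goal.
    """
    b = np.asarray(belief, dtype=float)
    if weights is not None:
        b = b * np.asarray(weights, dtype=float)
        b = b / b.sum()  # renormalize the weighted belief
    # expected Q-value of each action under the (weighted) belief
    return int(np.argmax(b @ np.asarray(Q, dtype=float)))
```

With a uniform belief over two states, an action favored only by a low-weight state loses precedence once the weighting is applied, which is how the weighted rule can steer the robot toward goal-relevant states.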
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ueda, R. (2016). Generation of Search Behavior by a Modification of Q-MDP Value Method. In: Menegatti, E., Michael, N., Berns, K., Yamaguchi, H. (eds) Intelligent Autonomous Systems 13. Advances in Intelligent Systems and Computing, vol 302. Springer, Cham. https://doi.org/10.1007/978-3-319-08338-4_1
Print ISBN: 978-3-319-08337-7
Online ISBN: 978-3-319-08338-4