Abstract
Object handover is a basic but essential capability for robots interacting with humans in many applications, e.g., caring for the elderly and assisting workers in manufacturing workshops. It appears deceptively simple, as humans perform object handover almost flawlessly. The success of humans, however, belies the complexity of object handover as a collaborative physical interaction between two agents with limited communication. This paper presents a learning algorithm for dynamic object handover, for example, a robot handing water bottles to marathon runners as they pass a water station. We formulate the problem as contextual policy search, in which the robot learns object handover by interacting with the human. A key challenge here is to learn the latent reward of the handover task under noisy human feedback. Preliminary experiments show that the robot learns to hand over a water bottle naturally and that it adapts to the dynamics of human motion. One challenge for the future is to combine the model-free learning algorithm with a model-based planning approach and enable the robot to adapt to human preferences and object characteristics, such as shape, weight, and surface texture.
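To make the contextual policy search setting concrete, the following is a minimal one-dimensional sketch, not the algorithm presented in the chapter. It assumes a scalar context (for instance, a normalized approach speed), a scalar handover parameter, a linear-Gaussian policy, and a simple reward-weighted regression update in place of whatever update the authors use. The latent reward function, the noise level, and all names (`latent_reward`, `noisy_feedback`, `learn_policy`) are hypothetical and chosen only for illustration.

```python
import math
import random


def latent_reward(context, param):
    # Hypothetical latent reward, unknown to the learner:
    # the best parameter for a given context is 2 * context + 1.
    return -(param - (2.0 * context + 1.0)) ** 2


def noisy_feedback(context, param, rng, noise_std=0.1):
    # The learner only observes the latent reward corrupted by noise,
    # standing in for imprecise human ratings.
    return latent_reward(context, param) + rng.gauss(0.0, noise_std)


def weighted_linear_fit(contexts, params, weights):
    # Weighted least squares for param ~ a + b * context.
    total = sum(weights)
    mean_s = sum(w * s for w, s in zip(weights, contexts)) / total
    mean_p = sum(w * p for w, p in zip(weights, params)) / total
    cov = sum(w * (s - mean_s) * (p - mean_p)
              for w, s, p in zip(weights, contexts, params))
    var = sum(w * (s - mean_s) ** 2 for w, s in zip(weights, contexts))
    b = cov / var if var > 1e-12 else 0.0
    a = mean_p - b * mean_s
    return a, b


def learn_policy(iterations=40, samples=200, seed=0):
    rng = random.Random(seed)
    a, b, sigma = 0.0, 0.0, 1.0  # policy: param ~ N(a + b * context, sigma^2)
    for _ in range(iterations):
        # Observe contexts, sample parameters from the current policy,
        # and collect noisy feedback for each trial.
        contexts = [rng.uniform(-1.0, 1.0) for _ in range(samples)]
        params = [rng.gauss(a + b * s, sigma) for s in contexts]
        rewards = [noisy_feedback(s, p, rng) for s, p in zip(contexts, params)]

        # Exponentially weight trials by reward (shifted by the best reward
        # for numerical stability), then refit the policy to the weighted data.
        best = max(rewards)
        weights = [math.exp(5.0 * (r - best)) for r in rewards]
        a, b = weighted_linear_fit(contexts, params, weights)

        # Shrink exploration to the weighted residual spread, with a floor.
        total = sum(weights)
        var = sum(w * (p - (a + b * s)) ** 2
                  for w, s, p in zip(weights, contexts, params)) / total
        sigma = max(math.sqrt(var), 0.05)
    return a, b, sigma
```

Calling `learn_policy()` returns the learned intercept, slope, and exploration width. In this toy setting the policy mean should approach the hypothetical optimum `2 * context + 1` despite the noisy feedback, which illustrates how a single policy can generalize across contexts rather than being retrained for each one.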
Acknowledgements
This research was supported in part by an A*STAR Industrial Robotics Program grant (R-252-506-001-305) and a SMART Phase-2 Pilot grant (R-252-000-571-592).
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Kupcsik, A., Hsu, D., Lee, W.S. (2018). Learning Dynamic Robot-to-Human Object Handover from Human Feedback. In: Bicchi, A., Burgard, W. (eds) Robotics Research. Springer Proceedings in Advanced Robotics, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-51532-8_10
DOI: https://doi.org/10.1007/978-3-319-51532-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51531-1
Online ISBN: 978-3-319-51532-8
eBook Packages: Engineering, Engineering (R0)