Abstract
Intelligent robots now under development, such as social robots and home service robots, are required to communicate with users in natural language. Going beyond simple data retrieval and simple dialogue, we propose a new Human-Robot Interaction (HRI) system that enables a robot to understand and reason about the environment around a user and to present information about it in natural language whenever the user asks the robot a question. For intelligent HRI, we propose a new full-sentence Visual Question Answering (VQA) network model called the Full-Sentence Highway Memory Network (FSHMN), based on Dynamic Memory Networks (DMN), a neural network for VQA. For the robot platform, we used a three-degree-of-freedom (DOF) robotic head consisting of a neck with three motors and a tablet PC as the head. To verify the feasibility of the proposed system, we performed an experiment in which a user and a robot interact through question answering in a customized kitchen environment. The experiment demonstrates the effectiveness of applying deep learning to HRI applications in real environments and offers new insight into HRI.
Notes
1. The video clip is available at https://goo.gl/35nFqz.
Acknowledgements
This work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1A2A1A17069837), and in part by the ICT R&D program of MSIT/IITP [R7124-16-0005, Research on adaptive machine learning technology development for intelligent autonomous digital companion].
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cho, S., Park, JM., Song, TJ., Kim, JH. (2020). Human-Robot Full-Sentence VQA Interaction System with Highway Memory Network. In: P. P. Abdul Majeed, A., Mat-Jizat, J., Hassan, M., Taha, Z., Choi, H., Kim, J. (eds) RITA 2018. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-13-8323-6_12
DOI: https://doi.org/10.1007/978-981-13-8323-6_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8322-9
Online ISBN: 978-981-13-8323-6
eBook Packages: Intelligent Technologies and Robotics (R0)