Human-Robot Full-Sentence VQA Interaction System with Highway Memory Network

Cho, Sanghyun; Park, Jin-Man; Song, Taek-Jin; Kim, Jong-Hwan

doi:10.1007/978-981-13-8323-6_12

Conference paper
First Online: 16 June 2019

Part of the book series: Lecture Notes in Mechanical Engineering (LNME)

Abstract

Intelligent robots now under development, such as social robots and home service robots, are required to communicate with users naturally in natural language. Going beyond simple data retrieval and simple dialogue, we propose a new Human-Robot Interaction (HRI) system that enables a robot to understand and reason about the environment around a user and to present information about it in natural language whenever the user asks the robot a question. For intelligent HRI, we propose a new full-sentence Visual Question Answering (VQA) network model, the Full-Sentence Highway Memory Network (FSHMN), based on Dynamic Memory Networks (DMN), a neural network for VQA. The robot platform is a three-DOF robotic head consisting of a neck with three motors and a tablet PC as the head. To verify the feasibility of the proposed system, we performed an experiment in which a user and a robot interact through question answering in a customized kitchen environment. The experiment demonstrates the effectiveness of applying deep learning to HRI applications in real environments and presents a new insight into HRI.

Notes

1. The video clip is available at https://goo.gl/35nFqz.

Acknowledgements

This work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1A2A1A17069837), and in part by the ICT R&D program of MSIT/IITP [R7124-16-0005, Research on adaptive machine learning technology development for intelligent autonomous digital companion].

Author information

Authors and Affiliations

School of Electrical Engineering, KAIST, 291 Daehangno, Yuseong-gu, Daejeon, 305-701, Republic of Korea
Sanghyun Cho, Jin-Man Park, Taek-Jin Song & Jong-Hwan Kim

Corresponding author

Correspondence to Jong-Hwan Kim.

Editor information

Editors and Affiliations

Faculty of Manufacturing Engineering, Universiti Malaysia Pahang, Pekan, Pahang Darul Makmur, Malaysia
Anwar P. P. Abdul Majeed
Faculty of Manufacturing Engineering, Universiti Malaysia Pahang, Pekan, Pahang Darul Makmur, Malaysia
Jessnor Arif Mat-Jizat
Faculty of Manufacturing Engineering, Universiti Malaysia Pahang, Pekan, Pahang Darul Makmur, Malaysia
Mohd Hasnun Arif Hassan
Faculty of Manufacturing Engineering, Universiti Malaysia Pahang, Pekan, Pahang Darul Makmur, Malaysia
Zahari Taha
Department of Aerospace Engineering, KAIST, Daejeon, Taejon-jikhalsi, Korea (Republic of)
Han Lim Choi
Department of Aerospace Engineering, KAIST, Daejeon, Taejon-jikhalsi, Korea (Republic of)
Junmo Kim

About this paper

Cite this paper

Cho, S., Park, JM., Song, TJ., Kim, JH. (2020). Human-Robot Full-Sentence VQA Interaction System with Highway Memory Network. In: P. P. Abdul Majeed, A., Mat-Jizat, J., Hassan, M., Taha, Z., Choi, H., Kim, J. (eds) RITA 2018. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-13-8323-6_12

DOI: https://doi.org/10.1007/978-981-13-8323-6_12
Published: 16 June 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8322-9
Online ISBN: 978-981-13-8323-6
eBook Packages: Intelligent Technologies and Robotics (R0)