RITA 2018 pp 131–148

Human-Robot Full-Sentence VQA Interaction System with Highway Memory Network

  • Conference paper

Part of the book series: Lecture Notes in Mechanical Engineering (LNME)

Abstract

Intelligent robots now under development, such as social robots and home service robots, must be able to communicate with users in natural language. Going beyond simple data retrieval and simple dialogue, we propose a new Human-Robot Interaction (HRI) system that enables a robot to understand and reason about the environment around a user and to present information about it in natural language whenever the user asks the robot a question. For the interaction itself, we propose a new full-sentence Visual Question Answering (VQA) network model called the Full-Sentence Highway Memory Network (FSHMN), which is based on Dynamic Memory Networks (DMN), a neural network for VQA. The robot platform is a three-DOF robotic head consisting of a neck with three motors and a tablet PC as the head. To verify the feasibility of the proposed system, we perform an experiment in which a user and the robot interact through question answering in a customized kitchen environment. The experiment demonstrates the effectiveness of applying deep learning to HRI applications in real environments and offers new insight into HRI.
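The abstract does not describe the internals of FSHMN, so the following is background only and not the authors' model. It is a minimal sketch of the generic highway gating from Srivastava et al. (reference 9), y = T(x) * H(x) + (1 - T(x)) * x, where the transform gate T decides how much of the transformed signal H(x) replaces the carried-over input x. The function names, the choice of tanh for H, and the toy weights are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function used for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Compute W @ x + b for a list-of-rows matrix W and vectors b, x."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One generic highway layer: y = T(x) * H(x) + (1 - T(x)) * x.

    H uses tanh here (an assumption); T is a sigmoid gate in (0, 1).
    Input and output must have the same dimension so x can be carried.
    """
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]
    return [t_i * h_i + (1.0 - t_i) * x_i for t_i, h_i, x_i in zip(t, h, x)]


if __name__ == "__main__":
    x = [0.5, -1.0]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # A strongly negative gate bias closes the gate, so the input is carried through.
    print(highway_layer(x, identity, [0.0, 0.0], zeros, [-50.0, -50.0]))
    # A strongly positive gate bias opens the gate, so the output is tanh(W_h x + b_h).
    print(highway_layer(x, identity, [0.0, 0.0], zeros, [50.0, 50.0]))
```

Initializing the gate bias to a negative value makes a layer start close to the identity mapping, which is the property that lets deep stacks of such layers train.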

Notes

  1. The video clip is available at https://goo.gl/35nFqz.

References

  1. Breazeal CL (2004) Designing sociable robots. MIT press

  2. Park G-M, Kim D-H, Jeong I-B, Ko W-R, Yoo Y-H, Kim J-H (2017) Task intelligence of robots: neural model-based mechanism of thought and online motion planning. IEEE Trans Emerg Top Comput Intell 1(1):41–50

  3. Lin P, Abney K, Bekey GA (2011) Robot ethics: the ethical and social implications of robotics. MIT press

  4. Scheutz M (2013) What is robot ethics? [TC spotlight]. IEEE Robot Autom Mag 20(4):20–165

  5. Socher R, Perelygin A, Wu JY, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP), pp 1631–1642

  6. Ma L, Lu Z, Li H (2015) Learning to answer questions from image using convolutional neural network. arXiv preprint arXiv:1506.00333

  7. Yang Z, He X, Gao J, Deng L, Smola A (2016) Stacked attention networks for image question answering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 21–29

  8. Xiong C, Merity S, Socher R (2016) Dynamic memory networks for visual and textual question answering. arXiv, 1603

  9. Srivastava RK, Greff K, Schmidhuber J (2015) Training very deep networks. In: Advances in neural information processing systems, pp 2377–2385

  10. Zilly JG, Srivastava RK, Koutník J, Schmidhuber J (2016) Recurrent highway networks. arXiv preprint arXiv:1607.03474

  11. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Advances in neural information processing systems, pp 3104–3112

  12. Cho S-H, Lee W-H, Kim J-H (2017) Implementation of human-robot VQA interaction system with dynamic memory networks. In: 2017 IEEE international conference on systems, man, and cybernetics (SMC), pp 495–500

  13. Kingma D, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980

  14. Shin A, Ushiku Y, Harada T (2016) The color of the cat is gray: 1 million full-sentences visual question answering (FSVQA). arXiv preprint arXiv:1609.06657

Acknowledgements

This work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2017R1A2A1A17069837), and in part by the ICT R&D program of MSIT/IITP [R7124-16-0005, Research on adaptive machine learning technology development for intelligent autonomous digital companion].

Author information

Corresponding author

Correspondence to Jong-Hwan Kim.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Cho, S., Park, JM., Song, TJ., Kim, JH. (2020). Human-Robot Full-Sentence VQA Interaction System with Highway Memory Network. In: P. P. Abdul Majeed, A., Mat-Jizat, J., Hassan, M., Taha, Z., Choi, H., Kim, J. (eds) RITA 2018. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-13-8323-6_12
