Synthesizing Robot Programs with Interactive Tutor Mode


With the rapid development of the robotics industry, domestic robots have become increasingly popular. Because domestic robots are expected to serve as personal assistants, it is important to develop a natural-language-based human-robot interaction system for end users who do not necessarily have much programming knowledge. To build such a system, we developed an interactive tutoring framework named “Holert”, which automatically translates task descriptions in natural language into machine-interpretable logical forms. Unlike previous work, Holert allows users to teach the robot by further explaining their intentions in an interactive tutor mode. Furthermore, Holert introduces a semantic dependency model that enables the robot to “understand” similar task descriptions. We have deployed Holert on an open-source robot platform, Turtlebot 2. Experimental results show that the tutor mode significantly improves system accuracy, by 163.9%. The system is also efficient: even the longest task session, with 10 sentences, can be handled within 0.7 s.



This work was supported by the Tsinghua University Initiative Scientific Research Program (No. 20141081140).

Author information

Correspondence to Yu-Ping Wang.

Additional information

Recommended by Associate Editor James Whidborne

Hao Li received the B.Sc. degree in computer science and technology from Xidian University, China in 2012. He is now a Ph.D. candidate at Tsinghua University, China, under the supervision of Professor Shi-Min Hu. His work has been published in journals including Communications in Information and Systems and International Journal of Software Engineering and Knowledge Engineering.

His research interests include program synthesis and system reliability.

Yu-Ping Wang received the Ph.D. degree in computer science and technology from Tsinghua University, China in 2009. He is currently an associate professor at Tsinghua University, China. He has published papers in journals and conferences including IEEE Transactions on Visualization and Computer Graphics, IEEE Transactions on Computers, Journal of Systems and Software, USENIX Annual Technical Conference, International Symposium on Code Generation and Optimization, International Symposium on Software Reliability Engineering, IEEE International Conference on Computers, Software and Applications (COMPSAC), and Asia-Pacific Software Engineering Conference. He received the COMPSAC 2014 Best Paper Award.

His research interests include robotic systems and system reliability.

Tai-Jiang Mu received the B.Sc. and Ph.D. degrees in computer science and technology from Tsinghua University, China in 2011 and 2016, respectively. He is currently a postdoctoral researcher in the Department of Computer Science and Technology, Tsinghua University, China.

His research interests include computer graphics, image/video processing and human-robot interaction.

Cite this article

Li, H., Wang, Y. & Mu, T. Synthesizing Robot Programs with Interactive Tutor Mode. Int. J. Autom. Comput. 16, 462–474 (2019) doi:10.1007/s11633-018-1154-7


  • Human-robot interaction
  • semantic parsing
  • program synthesis
  • intelligent robotic systems
  • natural language understanding