RoboNLU: Advancing Command Understanding with a Novel Lightweight BERT-Based Approach for Service Robotics

  • Conference paper
RoboCup 2023: Robot World Cup XXVI (RoboCup 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14140)

Abstract

This paper proposes RoboNLU, a novel approach to natural language understanding (NLU) in service robots that leverages pre-trained language models along with specialized classifiers to extract meaning from user commands. Specifically, the proposed system combines the Bidirectional Encoder Representations from Transformers (BERT) model with slot, intent, and pronoun resolution classifiers. The model was trained on a newly created, large-scale, high-quality GPSR (General Purpose Service Robot) command dataset, yielding strong results in intent classification, slot filling, and pronoun resolution tasks while also demonstrating robustness in out-of-vocabulary scenarios. Furthermore, the system was optimized for real-time processing on a service robot by using smaller, quantized versions of the BERT-base model and deploying the system with the ONNX Runtime framework. Code and data are available at https://github.com/RoboBreizh-RoboCup-Home/RoboNLU.
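To make the three classification tasks in the abstract concrete, the following is a minimal sketch of how per-token intent, slot, and pronoun predictions could be merged into structured robot commands. It is an illustration under stated assumptions, not the RoboNLU implementation: the function `decode`, the `Command` class, the BIO-style tag names, and the `referents` index mapping are all hypothetical, and the predictions are hard-coded rather than produced by a BERT model.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    """One robot command: an intent plus its named slots (hypothetical schema)."""
    intent: str
    slots: dict = field(default_factory=dict)


def decode(tokens, intent_tags, slot_tags, referents):
    """Group per-token predictions into one Command per detected intent.

    intent_tags: "O", or an intent name on the token that starts a command.
    slot_tags:   BIO tags such as "B-object", "I-object", "O".
    referents:   maps a pronoun's token index to the start index of its antecedent span.
    """
    # Collect slot spans as [start_index, slot_name, text].
    spans = []
    open_name = None  # slot name of the span currently being extended
    for i, (tok, tag) in enumerate(zip(tokens, slot_tags)):
        if tag.startswith("B-"):
            open_name = tag[2:]
            spans.append([i, open_name, tok])
        elif tag.startswith("I-") and tag[2:] == open_name:
            spans[-1][2] += " " + tok
        else:
            open_name = None  # "O" or an inconsistent I- tag closes the span
    text_at = {start: text for start, _, text in spans}

    # Every non-"O" intent tag opens a new command at that token position.
    commands = [(i, Command(intent=tag))
                for i, tag in enumerate(intent_tags) if tag != "O"]

    for start, name, text in spans:
        # Pronoun resolution: substitute the text of the antecedent span.
        if start in referents:
            text = text_at.get(referents[start], text)
        # Attach the span to the closest command that starts at or before it.
        owners = [cmd for pos, cmd in commands if pos <= start]
        if owners:
            owners[-1].slots[name] = text
    return [cmd for _, cmd in commands]


# Hard-coded predictions standing in for the outputs of the three classifiers.
tokens = "take the red apple to the kitchen and give it to Mary".split()
intent_tags = ["take", "O", "O", "O", "O", "O", "O", "O", "give", "O", "O", "O"]
slot_tags = ["O", "O", "B-object", "I-object", "O", "O", "B-destination",
             "O", "O", "B-object", "O", "B-person"]
referents = {9: 2}  # the pronoun "it" (index 9) refers to the span starting at "red"

commands = decode(tokens, intent_tags, slot_tags, referents)
for command in commands:
    print(command)
```

In this example the sentence yields two commands, a `take` command with `object` and `destination` slots and a `give` command whose `object` slot holds "red apple" instead of "it", which is the kind of structured output a downstream task planner could consume.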

Notes

  1. https://github.com/kyordhel/GPSRCmdGen

  2. https://github.com/onnx/optimizer

Acknowledgment

This work benefits from the support of the Brittany region.

Author information

Corresponding author

Correspondence to Sinuo Wang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, S., Neau, M., Buche, C. (2024). RoboNLU: Advancing Command Understanding with a Novel Lightweight BERT-Based Approach for Service Robotics. In: Buche, C., Rossi, A., Simões, M., Visser, U. (eds) RoboCup 2023: Robot World Cup XXVI. RoboCup 2023. Lecture Notes in Computer Science, vol 14140. Springer, Cham. https://doi.org/10.1007/978-3-031-55015-7_3

  • DOI: https://doi.org/10.1007/978-3-031-55015-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-55014-0

  • Online ISBN: 978-3-031-55015-7

  • eBook Packages: Computer Science, Computer Science (R0)