RoboNLU: Advancing Command Understanding with a Novel Lightweight BERT-Based Approach for Service Robotics

Wang, Sinuo; Neau, Maëlic; Buche, Cédric

doi:10.1007/978-3-031-55015-7_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14140)

Included in the following conference series:

Robot World Cup

Abstract

This paper proposes RoboNLU, a novel approach to natural language understanding (NLU) for service robots that combines a pre-trained language model with specialized classifiers to extract meaning from user commands. Specifically, the system pairs the Bidirectional Encoder Representations from Transformers (BERT) model with slot, intent, and pronoun resolution classifiers. The model was trained on a newly created, large-scale, high-quality GPSR (General Purpose Service Robot) command dataset; it yields strong results on intent classification, slot filling, and pronoun resolution while remaining robust in out-of-vocabulary scenarios. Furthermore, the system was optimized for real-time processing on a service robot by using smaller, quantized versions of the BERT-base model and deploying them with the ONNX Runtime framework. Code and data are available at https://github.com/RoboBreizh-RoboCup-Home/RoboNLU.
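To make the three outputs named in the abstract concrete (an intent, slot labels, and pronoun resolution), the following is a minimal post-processing sketch and not the authors' implementation. It assumes per-token slot labels in a BIO scheme, a single intent label per command, and a pronoun-to-antecedent mapping already produced by a pronoun resolution step; the abstract does not specify any of these formats. The function names, label names (`object`, `destination`), and example command are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (slot_name, text) spans.

    Assumption: tags look like "B-object", "I-object", or "O".
    Stray "I-" tags that do not continue an open span are ignored.
    """
    spans: List[Tuple[str, str]] = []
    current_name: Optional[str] = None
    current_words: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; close the previous one if it is open.
            if current_name is not None:
                spans.append((current_name, " ".join(current_words)))
            current_name, current_words = tag[2:], [token]
        elif tag.startswith("I-") and current_name == tag[2:]:
            # Continuation of the currently open span.
            current_words.append(token)
        else:
            # "O" or an inconsistent tag: close any open span.
            if current_name is not None:
                spans.append((current_name, " ".join(current_words)))
            current_name, current_words = None, []
    if current_name is not None:
        spans.append((current_name, " ".join(current_words)))
    return spans


def build_command(
    intent: str,
    tokens: List[str],
    tags: List[str],
    pronoun_links: Dict[str, str],
) -> Dict[str, object]:
    """Combine the three (hypothetical) classifier outputs into one structure.

    pronoun_links maps a pronoun to the phrase it refers to; slot values
    that are exactly a linked pronoun are replaced by that phrase.
    """
    slots = {
        name: pronoun_links.get(text, text)
        for name, text in decode_bio(tokens, tags)
    }
    return {"intent": intent, "slots": slots}


if __name__ == "__main__":
    # Hypothetical classifier outputs for the command "bring it to the kitchen".
    tokens = ["bring", "it", "to", "the", "kitchen"]
    tags = ["O", "B-object", "O", "B-destination", "I-destination"]
    print(build_command("bring", tokens, tags, {"it": "the apple"}))
```

Keeping the decoding separate from the model means the same structure could be produced regardless of which BERT variant or runtime generates the labels.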

Acknowledgment

This work benefits from the support of the Brittany region.

Author information

Authors and Affiliations

CROSSING, CNRS IRL, Adelaide, 2010, Australia
Sinuo Wang, Maëlic Neau & Cédric Buche
Flinders University, Adelaide, Australia
Maëlic Neau
University of Adelaide, Adelaide, Australia
Sinuo Wang
Lab-STICC, CNRS UMR 6285, ENIB, Brest, France
Maëlic Neau & Cédric Buche

Corresponding author

Correspondence to Sinuo Wang.

Editor information

Editors and Affiliations

ENIB/Naval Group Pacific, Plouzané, France
Cédric Buche
University of Naples Federico II, Naples, Italy
Alessandra Rossi
Bahia State University, Salvador, Brazil
Marco Simões
University of Miami, Coral Gables, FL, USA
Ubbo Visser

About this paper

Cite this paper

Wang, S., Neau, M., Buche, C. (2024). RoboNLU: Advancing Command Understanding with a Novel Lightweight BERT-Based Approach for Service Robotics. In: Buche, C., Rossi, A., Simões, M., Visser, U. (eds) RoboCup 2023: Robot World Cup XXVI. RoboCup 2023. Lecture Notes in Computer Science, vol 14140. Springer, Cham. https://doi.org/10.1007/978-3-031-55015-7_3

DOI: https://doi.org/10.1007/978-3-031-55015-7_3
Published: 14 March 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-55014-0
Online ISBN: 978-3-031-55015-7
eBook Packages: Computer Science, Computer Science (R0)
