Using Bidirectional Transformer-CRF for Spoken Language Understanding

Zhang, Linhao; Wang, Houfeng

doi:10.1007/978-3-030-32233-5_11

Linhao Zhang¹³ &
Houfeng Wang¹³

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 11838))

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

2387 Accesses
5 Citations

Abstract

Spoken Language Understanding (SLU) is a critical component in spoken dialogue systems. It is typically composed of two tasks: intent detection (ID) and slot filling (SF). Currently, most effective models carry out these two tasks jointly and often result in better performance than separate models. However, these models usually fail to model the interaction between intent and slots and ties these two tasks only by a joint loss function. In this paper, we propose a new model based on bidirectional Transformer and introduce a padding method, enabling intent and slots to interact with each other in an effective way. A CRF layer is further added to achieve global optimization. We conduct our experiments on benchmark ATIS and Snips datasets, and results show that our model achieves state-of-the-art on both tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Li, C., Li, L., Qi, J.: A self-attentive model with gate mechanism for spoken language understanding. In: EMNLP, pp. 3824–3833 (2018)
Google Scholar
Ba, J., Kiros, R., Hinton, G.E.: Layer normalization. CoRR (2016)
Google Scholar
Deng, L., Tur, G., He, X., Hakkani-Tur, D.: Use of kernel deep convex networks and end-to-end learning for spoken language understanding. In: 2012 IEEE Spoken Language Technology Workshop (SLT), pp. 210–215. IEEE (2012)
Google Scholar
Deoras, A., Sarikaya, R.: Deep belief network based semantic taggers for spoken language understanding. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp. 2713–2717, January 2013
Google Scholar
Goo, C.W., et al.: Slot-gated modeling for joint slot filling and intent prediction. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), vol. 2, pp. 753–757 (2018)
Google Scholar
Haffner, P., Tur, G., Wright, J.H.: Optimizing SVMs for complex call classification. In: Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2003), vol. 1, p. I. IEEE (2003)
Google Scholar
Hakkani-Tür, D., et al.: Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM. In: Interspeech, pp. 715–719 (2016)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Google Scholar
Hemphill, C.T., Godfrey, J.J., Doddington, G.R.: The ATIS spoken language systems pilot corpus. In: Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, 24–27 June 1990 (1990)
Google Scholar
Huang, Z., Xu, W., Yu, K.: Bidirectional LSTM-CRF models for sequence tagging. CoRR (2015)
Google Scholar
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR (2015)
Google Scholar
Kudo, T., Matsumoto, Y.: Chunking with support vector machines. In: Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies, pp. 1–8. Association for Computational Linguistics (2001)
Google Scholar
Lafferty, J., McCallum, A., Pereira, F.C.N.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: ICML 2001 Proceedings of the Eighteenth International Conference on Machine Learning, 8 June 2001, pp. 282–289 (2001)
Google Scholar
Liu, B., Lane, I.: Attention-based recurrent neural network models for joint intent detection and slot filling. arXiv preprint arXiv:1609.01454 (2016)
Ma, S., Sun, X.: A new recurrent neural CRF for learning non-linear edge features. arXiv preprint arXiv:1611.04233 (2016)
McCallum, A., Freitag, D., Pereira, F.: Maximum entropy Markov models for information extraction and segmentation. In: ICML (2000)
Google Scholar
Mesnil, G., et al.: Using recurrent neural networks for slot filling in spoken language understanding. IEEE/ACM Trans. Audio Speech Lang. Process. 23(3), 530–539 (2015)
Article Google Scholar
Pennington, J., Socher, R., Manning, C.: Glove: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)
Google Scholar
Tür, G., Hakkani-Tür, D.Z., Heck, L.P., Parthasarathy, S.: Sentence simplification for spoken language understanding. In: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5628–5631 (2011)
Google Scholar
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Google Scholar
Xu, P., Sarikaya, R.: Convolutional neural network based triangular CRF for joint intent detection and slot filling, pp. 78–83 (2013)
Google Scholar
Yao, K., Peng, B., Zhang, Y., Yu, D., Zweig, G., Shi, Y.: Spoken language understanding using long short-term memory neural networks. In: 2014 IEEE Spoken Language Technology Workshop (SLT), pp. 189–194. IEEE (2014)
Google Scholar
Yao, K., Zweig, G., Hwang, M.Y., Shi, Y., Yu, D.: Recurrent neural networks for language understanding. In: Interspeech, pp. 2524–2528 (2013)
Google Scholar
Zhang, X., Ma, D., Wang, H.: Learning dialogue history for spoken language understanding. In: NLPCC (2018)
Google Scholar
Zhang, X., Wang, H.: A joint model of intent determination and slot filling for spoken language understanding. In: IJCAI, pp. 2993–2999 (2016)
Google Scholar
Zhao, S., Meng, R., He, D., Andi, S., Bambang, P.: Integrating Transformer and Paraphrase Rules for Sentence Simplification. arXiv preprint arXiv:1810.11193 (2018)
Zhou, L., Zhou, Y., Corso, J.J., Socher, R., Xiong, C.: End-to-end dense video captioning with masked transformer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8739–8748 (2018)
Google Scholar

Download references

Acknowledgments

Our work is supported by the National Key Research and Development Program of China under Grant No. 2017YFB1002101 and National Natural Science Foundation of China under Grant No. 61433015.

Author information

Authors and Affiliations

MOE Key Lab of Computational Linguistics, Peking University, Beijing, 100871, China
Linhao Zhang & Houfeng Wang

Authors

Linhao Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Houfeng Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Houfeng Wang .

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Jie Tang
National University of Singapore, Singapore, Singapore
Min-Yen Kan
Peking University, Beijing, China
Dongyan Zhao
Peking University, Beijing, China
Sujian Li
Zhengzhou University, Zhengzhou, China
Hongying Zan

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhang, L., Wang, H. (2019). Using Bidirectional Transformer-CRF for Spoken Language Understanding. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science(), vol 11838. Springer, Cham. https://doi.org/10.1007/978-3-030-32233-5_11

Download citation

DOI: https://doi.org/10.1007/978-3-030-32233-5_11
Published: 30 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32232-8
Online ISBN: 978-3-030-32233-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

the China Computer Federation (CCF) (opens in a new tab)