Exploring Deep Learning Architectures Coupled with CRF Based Prediction for Slot-Filling

  • Tulika Saha
  • Sriparna Saha
  • Pushpak Bhattacharyya
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11301)


Slot-filling is one of the most crucial modules of any dialogue system; it focuses on extracting relevant and necessary information from user utterances. In this paper, we propose variants of Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models for the task of slot-filling, which include LSTM/GRU networks, Bi-directional LSTM/GRU (Bi-LSTM/GRU) networks, LSTM/GRU-CRF networks, and Bi-LSTM/GRU-CRF networks. The LSTM/GRU variants are used for discourse modeling, i.e., to capture long-term dependencies in the input sentences. A Conditional Random Field (CRF) layer is integrated with the above networks to capture sentence-level tag information. We report experimental results of our proposed models on the benchmark Air Travel Information System (ATIS) dataset, which indicate that they perform exceptionally well compared to the state of the art.
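
To make the role of the CRF layer concrete, the sketch below contrasts independent per-token tagging with sentence-level decoding. It is only an illustration under stated assumptions: the tag set, the emission scores (standing in for the output of a Bi-LSTM/GRU), the transition scores, and the function name viterbi_decode are all hypothetical and are not taken from the paper or from ATIS.

```python
# Minimal sketch of CRF-style decoding over per-token scores.
# Tags, scores, and names are illustrative assumptions, not the paper's values.

TAGS = ["O", "B-city", "I-city"]
TOKENS = ["flights", "to", "new", "york"]

# Hypothetical emission scores, one row per token, one column per tag.
# In a Bi-LSTM/GRU-CRF model these would come from the recurrent network.
EMISSIONS = [
    [2.0, 0.1, 0.1],
    [2.0, 0.2, 0.1],
    [0.3, 1.0, 1.2],
    [0.2, 0.5, 1.5],
]

# Hypothetical transition scores: TRANSITIONS[i][j] scores moving from tag i
# to tag j. The large penalty discourages "I-city" directly after "O".
TRANSITIONS = [
    [0.5, 0.0, -5.0],
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.5],
]


def viterbi_decode(emissions, transitions):
    """Return the tag-index sequence with the highest total score."""
    n_tags = len(transitions)
    # best[j] is the best score of any path ending in tag j at the current token.
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            # Choose the previous tag that gives the best path into tag j.
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Per-token argmax ignores neighbouring tags.
greedy = [TAGS[max(range(len(TAGS)), key=lambda j: row[j])] for row in EMISSIONS]
# Sentence-level decoding also uses the transition scores.
decoded = [TAGS[i] for i in viterbi_decode(EMISSIONS, TRANSITIONS)]

print(greedy)   # ['O', 'O', 'I-city', 'I-city']
print(decoded)  # ['O', 'O', 'B-city', 'I-city']
```

With these toy scores, per-token argmax starts a slot with an inside tag, which is not a well-formed labelling, while decoding with transition scores recovers a consistent begin/inside sequence. In a trained model the transition scores would be learned jointly with the network rather than set by hand.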


Dialogue system, Natural language understanding, Slot-filling, LSTM, GRU, CRF



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Tulika Saha (1)
  • Sriparna Saha (1)
  • Pushpak Bhattacharyya (1)

  1. Department of Computer Science and Engineering, Indian Institute of Technology Patna, Dealpur Daulat, India
