GLEU-Guided Multi-resolution Network for Short Text Conversation

  • Xuan Liu
  • Kai Yu
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 807)

Abstract

With the recent development of the sequence-to-sequence framework, generation approaches to short text conversation have become attractive. The traditional sequence-to-sequence method for short text conversation often suffers from the dull-response problem. The multi-resolution generation approach addresses this problem by dividing the generation process into two steps: keyword-sequence generation and response generation. However, this method still tends to generate short and dull keyword sequences. In this work, a new multi-resolution generation framework is proposed. Instead of using the word-level maximum likelihood criterion, we optimize the sequence-level GLEU score of the entire generated keyword sequence using a policy gradient approach from reinforcement learning. Experiments show that the proposed approach generates longer and more diverse keyword sequences, and it achieves better scores in human evaluation.
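The sequence-level reward described above can be illustrated with a minimal sketch of sentence-level GLEU, computed as the minimum of pooled n-gram precision and recall between a generated sequence and a reference (as in Wu et al.'s GNMT formulation). The tokenization and the choice of max_n = 4 here are assumptions for illustration, not necessarily the paper's exact setup; in training, this score would serve as the scalar reward weighting the log-probability of each sampled keyword sequence.

```python
from collections import Counter

def ngrams(tokens, n):
    # Multiset of all contiguous n-grams of the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_gleu(hypothesis, reference, max_n=4):
    """Sentence-level GLEU: pool all n-grams of order 1..max_n from both
    sequences, then return min(n-gram precision, n-gram recall)."""
    hyp_counts, ref_counts = Counter(), Counter()
    for n in range(1, max_n + 1):
        hyp_counts += ngrams(hypothesis, n)
        ref_counts += ngrams(reference, n)
    overlap = sum((hyp_counts & ref_counts).values())  # clipped matches
    hyp_total = sum(hyp_counts.values())
    ref_total = sum(ref_counts.values())
    if hyp_total == 0 or ref_total == 0:
        return 0.0
    return min(overlap / hyp_total, overlap / ref_total)

# A perfect match scores 1.0; partial overlap scores proportionally lower.
print(sentence_gleu(["the", "cat", "sat"], ["the", "cat", "sat"]))
print(sentence_gleu(["the", "cat"], ["the", "dog"]))
```

Because GLEU is bounded in [0, 1] and symmetric between precision and recall, it penalizes both overly short and overly long sampled sequences, which is what makes it a suitable sequence-level reward here.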

Keywords

Short text conversation · Sequence-to-sequence · Multi-resolution · Policy gradient


Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, SpeechLab, Department of Computer Science and Engineering, Brain Science and Technology Research Center, Shanghai Jiao Tong University, Shanghai, China
