A Deep Architecture for Chinese Semantic Matching with Pairwise Comparisons and Attention-Pooling

  • Huiyuan Lai
  • Yizheng Tao (corresponding author)
  • Chunliu Wang
  • Lunfan Xu
  • Dingyong Tang
  • Gongliang Li
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 810)

Abstract

Semantic sentence matching is a fundamental technology in natural language processing. In previous work, neural networks with attention mechanisms have been successfully extended to semantic matching. However, existing deep models often rely on simple operations such as summation and max-pooling to compress a whole sentence into a single distributed representation. We present a deep architecture for matching two Chinese sentences. After an attention mechanism captures the interaction information between the sentence pair, the model relies only on alignment rather than on a recurrent neural network, which makes it more lightweight and simple. To retain enough of the original features, we employ a pooling operation named attention-pooling to aggregate information from the whole sentence. We also evaluate several high-performing English models on Chinese data. The experimental results show that our method achieves better results than the other models on the Chinese dataset.
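
The abstract contrasts summation and max-pooling with attention-pooling but does not give the exact formulation. As a minimal sketch only, the following assumes one common form of attention-based pooling: each token vector is scored against a learned query vector, the scores are normalized with a softmax, and the sentence vector is the weighted sum of the token vectors. The names `attention_pooling`, `query`, and the toy `tokens` values are hypothetical and are not taken from the chapter.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sum_pooling(tokens):
    """Element-wise sum over all token vectors."""
    return [sum(column) for column in zip(*tokens)]


def max_pooling(tokens):
    """Element-wise maximum over all token vectors."""
    return [max(column) for column in zip(*tokens)]


def attention_pooling(tokens, query):
    """Weighted sum of token vectors.

    Assumption: weights are softmax(token . query), where `query` stands in
    for a learned parameter vector. The chapter may define this differently.
    """
    scores = [sum(t * q for t, q in zip(token, query)) for token in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    pooled = [sum(w * token[d] for w, token in zip(weights, tokens))
              for d in range(dim)]
    return pooled, weights


# Toy sentence of three tokens with 2-dimensional (made-up) representations.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A zero query scores every token equally, so the result is the plain mean.
uniform_pooled, uniform_weights = attention_pooling(tokens, [0.0, 0.0])

# A non-zero query shifts weight toward tokens that align with it.
focused_pooled, focused_weights = attention_pooling(tokens, [2.0, 0.0])
```

Unlike summation or max-pooling, the weights here depend on the content of each token, so the pooled vector can emphasize informative positions while still drawing on the whole sentence.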

Keywords

Chinese, Semantic matching, Attention mechanism, Attention-pooling

Notes

Acknowledgements

We are especially grateful to Ant Financial for allowing us to use the dataset from the Ant Financial Artificial Competition for our experiments.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Huiyuan Lai (1)
  • Yizheng Tao (1), corresponding author
  • Chunliu Wang (1)
  • Lunfan Xu (1)
  • Dingyong Tang (1)
  • Gongliang Li (1)

  1. Institute of Computer Application, China Academy of Engineering Physics, Mianyang, China
