Bi-Directional LSTM with Quantum Attention Mechanism for Sentence Modeling

  • Conference paper

Neural Information Processing (ICONIP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10635)

Abstract

Bi-directional LSTM (BLSTM) often utilizes an Attention Mechanism (AM) to improve its ability to model sentences, but the additional parameters within the AM may complicate model selection and BLSTM training. To solve this problem, this paper redefines the AM from the novel perspective of quantum cognition and proposes a parameter-free Quantum AM (QAM). Furthermore, we give a quantum interpretation of BLSTM in terms of the Two-State Vector Formalism (TSVF) and find a similarity between sentence understanding and quantum Weak Measurement (WM) under TSVF. The weak value derived from WM is employed to represent the attention paid to each word in a sentence. Experiments show that QAM-based BLSTM outperforms common AM (CAM) [1] based BLSTM on most of the classification tasks discussed in this paper.
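Although this page carries only the abstract, the central quantity admits a compact illustration. Below is a minimal sketch of parameter-free, weak-value attention, assuming (as one reading of TSVF, not the authors' exact formulation) that the forward LSTM state plays the role of the pre-selected state \(|\psi\rangle\), the backward state the post-selected state \(\langle\phi|\), and the projector onto each word's vector the measured observable; all function and variable names here are illustrative.

```python
import numpy as np

def weak_value_attention(h_fwd, h_bwd, word_vecs):
    """Illustrative sketch: attention weights from weak values.

    h_fwd:     (d,) forward-LSTM state, treated as pre-selected state |psi>.
    h_bwd:     (d,) backward-LSTM state, treated as post-selected state <phi|.
    word_vecs: (T, d) per-word vectors; the projector |w><w| is the observable.
    Assumes <phi|psi> != 0 (real, finite-dimensional space, per Note 2).
    """
    psi = h_fwd / np.linalg.norm(h_fwd)            # unit-normalize to quantum states
    phi = h_bwd / np.linalg.norm(h_bwd)
    overlap = phi @ psi                            # <phi|psi>, weak-value denominator
    scores = np.empty(len(word_vecs))
    for t, w in enumerate(word_vecs):
        w = w / np.linalg.norm(w)
        proj = np.outer(w, w)                      # observable A_t = |w_t><w_t|
        scores[t] = (phi @ proj @ psi) / overlap   # weak value <phi|A_t|psi> / <phi|psi>
    scores = np.abs(scores)                        # magnitude as a nonnegative score
    return scores / scores.sum()                   # normalize into attention weights

# Example with random stand-ins: a 5-word sentence, hidden size d = 8.
rng = np.random.default_rng(0)
attn = weak_value_attention(rng.normal(size=8), rng.normal(size=8),
                            rng.normal(size=(5, 8)))
print(attn, attn.sum())                            # five weights summing to 1.0
```

Because the weights are computed from the forward and backward states themselves, the mechanism adds no trainable parameters beyond BLSTM's own, which is the property the abstract contrasts with CAM [1].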

Notes

  1. http://deeplearning.net/tutorial/lstm.html.

  2. In QT, the quantum probability space is encapsulated in a Hilbert space \(\mathbb {H}^n\), an abstract vector space possessing the structure of an inner product. A finite-dimensional space is sufficient for the work reported in this paper, so we restrict our discussion to the finite real space \(\mathbb {R}^n\). In Dirac's notation, a quantum state is written as a column vector \(| \varPsi \rangle \), whose conjugate transpose is the row vector \(\langle \varPsi |\). A worked instance of this notation follows these notes.

  3. https://www.cs.cornell.edu/people/pabo/movie-review-data/.

  4. http://nlp.stanford.edu/sentiment/. Data is actually provided at the phrase level; hence both phrases and sentences are used to train the model, but only sentences are scored at test time [2, 4, 16]. Thus the training set is an order of magnitude larger than listed in Table 1.

  5. http://cogcomp.cs.illinois.edu/Data/QA/QC/.
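As a worked instance of the notation in Note 2 (the components \(a, b, c, d\) are illustrative), a ket is a column vector, its bra is the transposed row vector, and the weak value \(A_w\) of an observable \(A\) between a pre-selected state \(|\varPsi\rangle\) and a post-selected state \(\langle\varPhi|\) is the ratio standard in the weak-measurement literature [13]:

```latex
|\varPsi\rangle = \begin{pmatrix} a \\ b \end{pmatrix}, \quad
\langle\varPsi| = \begin{pmatrix} a & b \end{pmatrix}, \quad
\langle\varPhi|\varPsi\rangle = ca + db \ \ \text{for}\ \,
|\varPhi\rangle = \begin{pmatrix} c \\ d \end{pmatrix}; \qquad
A_w = \frac{\langle\varPhi|\, A \,|\varPsi\rangle}{\langle\varPhi|\varPsi\rangle}.
```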

References

  1. Yang, Z., Yang, D., Dyer, C., He, X., Smola, A.J., Hovy, E.H.: Hierarchical attention networks for document classification. In: HLT-NAACL, pp. 1480–1489 (2016)

  2. Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631–1642 (2013)

  3. Liu, P., Qiu, X., Huang, X.: Recurrent neural network for text classification with multi-task learning. arXiv preprint arXiv:1605.05101 (2016)

  4. Kalchbrenner, N., Grefenstette, E., Blunsom, P.: A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188 (2014)

  5. Wang, Z., Busemeyer, J.R., Atmanspacher, H., Pothos, E.M.: The potential of using quantum theory to build models of cognition. Top. Cogn. Sci. 5(4), 672–688 (2013)

  6. Bruza, P.D., Wang, Z., Busemeyer, J.R.: Quantum cognition: a new theoretical approach to psychology. Trends Cogn. Sci. 19(7), 383–393 (2015)

  7. Aharonov, Y., Vaidman, L.: Complete description of a quantum system at a given time. J. Phys. A: Math. Gen. 24(10), 2315 (1991)

  8. Ravon, T., Vaidman, L.: The three-box paradox revisited. J. Phys. A: Math. Theor. 40(11), 2873 (2007)

  9. Gibran, B.: Causal realism in the philosophy of mind. Essays Philos. 15(2), 5 (2014)

  10. Aharonov, Y., Vaidman, L.: The two-state vector formalism: an updated review. In: Muga, J., Mayato, R.S., Egusquiza, Í. (eds.) Time in Quantum Mechanics. Lecture Notes in Physics, vol. 734, pp. 399–447. Springer, Heidelberg (2008). doi:10.1007/978-3-540-73473-4_13

  11. Aharonov, Y., Bergmann, P.G., Lebowitz, J.L.: Time symmetry in the quantum process of measurement. Phys. Rev. 134(6B), B1410 (1964)

  12. Latta, R.L.: The Basic Humor Process: A Cognitive-Shift Theory and the Case Against Incongruity, vol. 5. Walter de Gruyter (1999)

  13. Tamir, B., Cohen, E.: Introduction to weak measurements and weak values. Quanta 2(1), 7–17 (2013)

  14. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  15. Pang, B., Lee, L.: Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 115–124. Association for Computational Linguistics (2005)

  16. Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: ICML 2014, pp. 1188–1196 (2014)

  17. Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, p. 271. Association for Computational Linguistics (2004)

  18. Li, X., Roth, D.: Learning question classifiers. In: Proceedings of the 19th International Conference on Computational Linguistics, vol. 1, pp. 1–7. Association for Computational Linguistics (2002)

  19. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  20. Socher, R., Pennington, J., Huang, E.H., Ng, A.Y., Manning, C.D.: Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 151–161. Association for Computational Linguistics (2011)

  21. Socher, R., Huval, B., Manning, C.D., Ng, A.Y.: Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1201–1211. Association for Computational Linguistics (2012)

  22. Dong, L., Wei, F., Liu, S., Zhou, M., Xu, K.: A statistical parsing framework for sentiment classification. Comput. Linguist. (2015)

  23. Nakagawa, T., Inui, K., Kurohashi, S.: Dependency tree-based sentiment classification using CRFs with hidden variables. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 786–794. Association for Computational Linguistics (2010)

Acknowledgments

This work is funded in part by the Chinese 863 Program (grant No. 2015AA015403), the Key Project of the Tianjin Natural Science Foundation (grant No. 15JCZDJC31100), the Tianjin Younger Natural Science Foundation (grant No. 14JCQNJC00400), the Major Project of the Chinese National Social Science Fund (grant No. 14ZDB153), and the MSCA-ITN-ETN European Training Networks project QUARTZ (grant No. 721321).

Author information

Correspondence to Yuexian Hou.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Niu, X., Hou, Y., Wang, P. (2017). Bi-Directional LSTM with Quantum Attention Mechanism for Sentence Modeling. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, E.S. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10635. Springer, Cham. https://doi.org/10.1007/978-3-319-70096-0_19

  • DOI: https://doi.org/10.1007/978-3-319-70096-0_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70095-3

  • Online ISBN: 978-3-319-70096-0
