Bi-Directional LSTM with Quantum Attention Mechanism for Sentence Modeling

  • Conference paper

Neural Information Processing (ICONIP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10635)

Abstract

Bi-directional LSTM (BLSTM) often utilizes an Attention Mechanism (AM) to improve its ability to model sentences, but the additional parameters within the AM may complicate model selection and BLSTM training. To solve this problem, this paper redefines the AM from the novel perspective of quantum cognition and proposes a parameter-free Quantum AM (QAM). Furthermore, we give a quantum interpretation of BLSTM in terms of the Two-State Vector Formalism (TSVF) and find a similarity between sentence understanding and quantum Weak Measurement (WM) under TSVF. The weak value derived from WM is employed to represent the attention paid to each word in a sentence. Experiments show that QAM-based BLSTM outperforms common AM (CAM) [1] based BLSTM on most of the classification tasks discussed in this paper.
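Although this page carries only the abstract, the central quantity admits a compact illustration. Below is a minimal sketch of parameter-free, weak-value attention, assuming (as one reading of TSVF, not the authors' exact formulation) that the forward LSTM state plays the role of the pre-selected state \(|\psi\rangle\), the backward state the post-selected state \(\langle\phi|\), and the projector onto each word's vector the measured observable; all function and variable names here are illustrative.

```python
import numpy as np

def weak_value_attention(h_fwd, h_bwd, word_vecs):
    """Illustrative sketch: attention weights from weak values.

    h_fwd:     (d,) forward-LSTM state, treated as pre-selected state |psi>.
    h_bwd:     (d,) backward-LSTM state, treated as post-selected state <phi|.
    word_vecs: (T, d) per-word vectors; the projector |w><w| is the observable.
    Assumes <phi|psi> != 0 (real, finite-dimensional space, per Note 2).
    """
    psi = h_fwd / np.linalg.norm(h_fwd)            # unit-normalize to quantum states
    phi = h_bwd / np.linalg.norm(h_bwd)
    overlap = phi @ psi                            # <phi|psi>, weak-value denominator
    scores = np.empty(len(word_vecs))
    for t, w in enumerate(word_vecs):
        w = w / np.linalg.norm(w)
        proj = np.outer(w, w)                      # observable A_t = |w_t><w_t|
        scores[t] = (phi @ proj @ psi) / overlap   # weak value <phi|A_t|psi> / <phi|psi>
    scores = np.abs(scores)                        # magnitude as a nonnegative score
    return scores / scores.sum()                   # normalize into attention weights

# Example with random stand-ins: a 5-word sentence, hidden size d = 8.
rng = np.random.default_rng(0)
attn = weak_value_attention(rng.normal(size=8), rng.normal(size=8),
                            rng.normal(size=(5, 8)))
print(attn, attn.sum())                            # five weights summing to 1.0
```

Because the weights are computed from the forward and backward states themselves, the mechanism adds no trainable parameters beyond BLSTM's own, which is the property the abstract contrasts with CAM [1].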

Notes

  1. http://deeplearning.net/tutorial/lstm.html.

  2. In QT, the quantum probability space is encapsulated in a Hilbert space \(\mathbb {H}^n\), an abstract vector space possessing the structure of an inner product. A finite-dimensional space is sufficient for the work reported in this paper, so we restrict our discussion to the finite real space \(\mathbb {R}^n\). In Dirac's notation, a quantum state is written as a column vector \(| \varPsi \rangle \), whose conjugate transpose is the row vector \(\langle \varPsi |\). A worked instance of this notation follows these notes.

  3. https://www.cs.cornell.edu/people/pabo/movie-review-data/.

  4. http://nlp.stanford.edu/sentiment/. Data is actually provided at the phrase level; hence both phrases and sentences are used to train the model, but only sentences are scored at test time [2, 4, 16]. Thus the training set is an order of magnitude larger than listed in Table 1.

  5. http://cogcomp.cs.illinois.edu/Data/QA/QC/.
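As a worked instance of the notation in Note 2 (the components \(a, b, c, d\) are illustrative), a ket is a column vector, its bra is the transposed row vector, and the weak value \(A_w\) of an observable \(A\) between a pre-selected state \(|\varPsi\rangle\) and a post-selected state \(\langle\varPhi|\) is the ratio standard in the weak-measurement literature [13]:

```latex
|\varPsi\rangle = \begin{pmatrix} a \\ b \end{pmatrix}, \quad
\langle\varPsi| = \begin{pmatrix} a & b \end{pmatrix}, \quad
\langle\varPhi|\varPsi\rangle = ca + db \ \ \text{for}\ \,
|\varPhi\rangle = \begin{pmatrix} c \\ d \end{pmatrix}; \qquad
A_w = \frac{\langle\varPhi|\, A \,|\varPsi\rangle}{\langle\varPhi|\varPsi\rangle}.
```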

References

  1. Yang, Z., Yang, D., Dyer, C., He, X., Smola, A.J., Hovy, E.H.: Hierarchical attention networks for document classification. In: HLT-NAACL, pp. 1480–1489 (2016)

  2. Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631–1642 (2013)

  3. Liu, P., Qiu, X., Huang, X.: Recurrent neural network for text classification with multi-task learning. arXiv preprint arXiv:1605.05101 (2016)

  4. Kalchbrenner, N., Grefenstette, E., Blunsom, P.: A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188 (2014)

  5. Wang, Z., Busemeyer, J.R., Atmanspacher, H., Pothos, E.M.: The potential of using quantum theory to build models of cognition. Top. Cogn. Sci. 5(4), 672–688 (2013)

  6. Bruza, P.D., Wang, Z., Busemeyer, J.R.: Quantum cognition: a new theoretical approach to psychology. Trends Cogn. Sci. 19(7), 383–393 (2015)

  7. Aharonov, Y., Vaidman, L.: Complete description of a quantum system at a given time. J. Phys. A: Math. Gen. 24(10), 2315 (1991)

  8. Ravon, T., Vaidman, L.: The three-box paradox revisited. J. Phys. A: Math. Theor. 40(11), 2873 (2007)

  9. Gibran, B.: Causal realism in the philosophy of mind. Essays Philos. 15(2), 5 (2014)

  10. Aharonov, Y., Vaidman, L.: The two-state vector formalism: an updated review. In: Muga, J., Mayato, R.S., Egusquiza, Í. (eds.) Time in Quantum Mechanics. Lecture Notes in Physics, vol. 734, pp. 399–447. Springer, Heidelberg (2008). doi:10.1007/978-3-540-73473-4_13

  11. Aharonov, Y., Bergmann, P.G., Lebowitz, J.L.: Time symmetry in the quantum process of measurement. Phys. Rev. 134(6B), B1410 (1964)

  12. Latta, R.L.: The Basic Humor Process: A Cognitive-Shift Theory and the Case Against Incongruity, vol. 5. Walter de Gruyter (1999)

  13. Tamir, B., Cohen, E.: Introduction to weak measurements and weak values. Quanta 2(1), 7–17 (2013)

  14. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  15. Pang, B., Lee, L.: Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 115–124. Association for Computational Linguistics (2005)

  16. Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: ICML 2014, pp. 1188–1196 (2014)

  17. Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, p. 271. Association for Computational Linguistics (2004)

  18. Li, X., Roth, D.: Learning question classifiers. In: Proceedings of the 19th International Conference on Computational Linguistics, vol. 1, pp. 1–7. Association for Computational Linguistics (2002)

  19. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  20. Socher, R., Pennington, J., Huang, E.H., Ng, A.Y., Manning, C.D.: Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 151–161. Association for Computational Linguistics (2011)

  21. Socher, R., Huval, B., Manning, C.D., Ng, A.Y.: Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1201–1211. Association for Computational Linguistics (2012)

  22. Dong, L., Wei, F., Liu, S., Zhou, M., Xu, K.: A statistical parsing framework for sentiment classification. Comput. Linguist. (2015)

  23. Nakagawa, T., Inui, K., Kurohashi, S.: Dependency tree-based sentiment classification using CRFs with hidden variables. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 786–794. Association for Computational Linguistics (2010)

Acknowledgments

This work is funded in part by the Chinese 863 Program (grant No. 2015AA015403), the Key Project of the Tianjin Natural Science Foundation (grant No. 15JCZDJC31100), the Tianjin Younger Natural Science Foundation (grant No. 14JCQNJC00400), the Major Project of the Chinese National Social Science Fund (grant No. 14ZDB153), and the MSCA-ITN-ETN European Training Networks project QUARTZ (grant No. 721321).

Author information

Correspondence to Yuexian Hou.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Niu, X., Hou, Y., Wang, P. (2017). Bi-Directional LSTM with Quantum Attention Mechanism for Sentence Modeling. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, E.S. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10635. Springer, Cham. https://doi.org/10.1007/978-3-319-70096-0_19

  • DOI: https://doi.org/10.1007/978-3-319-70096-0_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70095-3

  • Online ISBN: 978-3-319-70096-0
