Abstract
We propose a model that combines an enhanced Bidirectional Long Short-Term Memory (Bi-LSTM) network with well-known classifiers such as Conditional Random Fields (CRF) and Support Vector Machines (SVM) for sentence compression, in which the Bi-LSTM network serves as a feature extractor. The task is to classify each word into one of two categories: to be retained or to be removed. Because reliable feature-generation techniques are unavailable in many languages, we use readily obtainable word embeddings as the sole feature. Our models are trained and evaluated on public English and Vietnamese data sets, where they achieve state-of-the-art performance.
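The deletion-based formulation described above reduces compression to per-token binary labeling. As a minimal sketch (not code from the paper), the hypothetical helper below derives retain/remove labels from a source sentence and its compression, assuming the compression is an ordered subsequence of the source:

```python
def deletion_labels(source_tokens, compressed_tokens):
    """Greedily align a compression to its source sentence and emit
    one binary label per source token: 1 = retain, 0 = remove."""
    labels = []
    j = 0  # position in the compressed sentence
    for tok in source_tokens:
        if j < len(compressed_tokens) and tok == compressed_tokens[j]:
            labels.append(1)  # token survives the compression
            j += 1
        else:
            labels.append(0)  # token was deleted
    if j != len(compressed_tokens):
        raise ValueError("compression is not a subsequence of the source")
    return labels


# Example: "the quick brown fox jumps" compressed to "fox jumps"
print(deletion_labels("the quick brown fox jumps".split(),
                      "fox jumps".split()))  # [0, 0, 0, 1, 1]
```

These binary labels are the supervision targets a sequence tagger (such as the Bi-LSTM with a CRF or SVM output layer) would be trained to predict.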
Acknowledgment
This work was supported by JSPS KAKENHI Grant number JP15K16048.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
Cite this paper
Lai, DV., Son, N.T., Le Minh, N. (2018). Deletion-Based Sentence Compression Using Bi-enc-dec LSTM. In: Hasida, K., Pa, W. (eds) Computational Linguistics. PACLING 2017. Communications in Computer and Information Science, vol 781. Springer, Singapore. https://doi.org/10.1007/978-981-10-8438-6_20