Phrase-Based Compressive Summarization for English-Vietnamese

Le, Tung; Nguyen, Le-Minh; Shimazu, Akira; Dien, Dinh

doi:10.1007/978-3-319-49046-5_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9978)

Included in the following conference series:

International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making

Abstract

Cross-language summarization is a new topic that is highly practical for capturing, tracing, and retrieving information from very large volumes of data. For many low-resource languages such as Vietnamese and Chinese, however, there are neither previous works addressing this problem nor datasets for it. We therefore propose applying Phrase-based Compressive Summarization to English-Vietnamese. The model exploits the relation between the translation and summarization phases to overcome a common drawback of most earlier research. In addition, we manually built a bilingual corpus for English-Vietnamese summarization, which should be useful for later work. On this dataset, our system achieves a ROUGE-1 score of approximately 37%, which is comparable to systems for other language pairs. This encouraging result demonstrates the effectiveness of our approach and the quality of our manually built English-Vietnamese dataset.
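
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a ROUGE-1 score can be computed from clipped unigram overlap. It is not the evaluation setup used in the paper, which the abstract does not describe. The function name `rouge_1`, the lowercasing, and the whitespace tokenization are all assumptions made for illustration; Vietnamese text in particular would normally need a proper word segmenter first.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Illustrative ROUGE-1 (hypothetical helper, not the paper's code).

    Assumes lowercased, whitespace-separated tokens.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if overlap:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1("the cat sat", "the cat sat on the mat")
print(scores)
```

In this toy pair, three of the six reference unigrams are covered by the candidate, so recall is 0.5 while precision is 1.0. Published ROUGE figures also depend on choices such as stemming, stop-word handling, and summary length limits, so numbers from a sketch like this are not directly comparable to reported results.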

Author information

Authors and Affiliations

Faculty of Information Technology, University of Science, Vietnam National University, Ho Chi Minh City, Vietnam
Tung Le & Dinh Dien
School of Information Science, Japan Advanced Institute of Science and Technology, Nomi, Japan
Le-Minh Nguyen & Akira Shimazu

Corresponding author

Correspondence to Tung Le.

Editor information

Editors and Affiliations

Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
Van-Nam Huynh
Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, Japan
Masahiro Inuiguchi
University of Science, Ho Chi Minh City, Vietnam
Bac Le
Duy Tan University, Da Nang, Vietnam
Bao Nguyen Le
Université de Technologie de Compiègne, Compiègne, France
Thierry Denoeux

About this paper

Cite this paper

Le, T., Nguyen, LM., Shimazu, A., Dien, D. (2016). Phrase-Based Compressive Summarization for English-Vietnamese. In: Huynh, VN., Inuiguchi, M., Le, B., Le, B., Denoeux, T. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2016. Lecture Notes in Computer Science, vol 9978. Springer, Cham. https://doi.org/10.1007/978-3-319-49046-5_28

DOI: https://doi.org/10.1007/978-3-319-49046-5_28
Published: 29 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49045-8
Online ISBN: 978-3-319-49046-5
eBook Packages: Computer Science, Computer Science (R0)
