Abstract
Cross-language summarization is a relatively new task that is highly practical for capturing, tracking, and retrieving information from large volumes of data. For many low-resource languages such as Vietnamese and Chinese, however, there is neither prior work on this problem nor a suitable dataset. We therefore propose applying phrase-based compressive summarization to the English-Vietnamese language pair. The model exploits the relation between the translation and summarization phases to overcome a common drawback of most earlier work. In addition, the bilingual English-Vietnamese summarization corpus that we built manually should be useful for later work. On this dataset, our system achieves a ROUGE-1 score of approximately 37%, which is comparable to systems for other language pairs. This encouraging result supports the effectiveness of our approach and the quality of our manually built English-Vietnamese dataset.
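For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a system summary and a reference summary. The following is a minimal sketch, not the evaluation setup used in the paper: the function name `rouge_1` is hypothetical, tokenization by lowercasing and splitting on whitespace is an assumption (for Vietnamese this yields syllables rather than segmented words), and the abstract does not say whether the reported score is recall, precision, or F-measure, so all three are returned.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 for one candidate/reference pair.

    Assumption: tokens are produced by lowercasing and whitespace splitting.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token,
    # i.e. the clipped number of matching unigrams.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat on the mat", "the cat is on the mat")
    print(scores)
```

In the toy example, five of the six reference unigrams are matched, so recall is 5/6. A score near 37% would mean roughly a third of the unigrams are shared under whichever variant is reported.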
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Le, T., Nguyen, LM., Shimazu, A., Dien, D. (2016). Phrase-Based Compressive Summarization for English-Vietnamese. In: Huynh, VN., Inuiguchi, M., Le, B., Le, B., Denoeux, T. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2016. Lecture Notes in Computer Science, vol 9978. Springer, Cham. https://doi.org/10.1007/978-3-319-49046-5_28
DOI: https://doi.org/10.1007/978-3-319-49046-5_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49045-8
Online ISBN: 978-3-319-49046-5
eBook Packages: Computer Science, Computer Science (R0)