Abstract
This paper presents state-of-the-art research progress on multilingual multi-document summarization. Our method first models the documents with the hLDA (hierarchical Latent Dirichlet Allocation) algorithm. From the hLDA modeling results we derive a new feature that reflects semantic information to some extent. The method then combines this feature with several other features to score sentences and, based on these scores, extracts candidate summary sentences from the documents to generate a summary. We also conduct experiments to test the effectiveness and robustness of the new feature. Compared with other summarization methods, our method performs better in some respects.
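The scoring and extraction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the features are combined or how sentences are selected, so everything below is an assumption made for illustration: the weighted-sum combination, the greedy word-budget selection, the feature names (`hlda_level`, `position`, `length`), the weights, and the helper names `score_sentences` and `extract_summary`. The feature values are made-up placeholders, not output of an actual hLDA model.

```python
from typing import Dict, List, Sequence


def score_sentences(
    features: Sequence[Dict[str, float]],
    weights: Dict[str, float],
) -> List[float]:
    """Score each sentence as a weighted sum of its feature values.

    The linear combination is an assumption; the paper may use another rule.
    """
    return [
        sum(weights[name] * feats.get(name, 0.0) for name in weights)
        for feats in features
    ]


def extract_summary(
    sentences: Sequence[str],
    scores: Sequence[float],
    max_words: int,
) -> List[str]:
    """Greedily keep the highest-scoring sentences that fit a word budget,
    then return them in their original document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    used = 0
    for i in ranked:
        n_words = len(sentences[i].split())
        if used + n_words <= max_words:
            chosen.append(i)
            used += n_words
    return [sentences[i] for i in sorted(chosen)]


# Hypothetical input: three sentences with placeholder feature values.
sentences = [
    "The storm hit the coast on Monday.",
    "Officials ordered an evacuation of nearby towns.",
    "Local shops reported brisk sales of umbrellas.",
]
features = [
    {"hlda_level": 0.9, "position": 1.0, "length": 0.5},
    {"hlda_level": 0.8, "position": 0.5, "length": 0.6},
    {"hlda_level": 0.2, "position": 0.3, "length": 0.6},
]
# Hypothetical weights; in practice these would be tuned on held-out data.
weights = {"hlda_level": 0.5, "position": 0.3, "length": 0.2}

scores = score_sentences(features, weights)
summary = extract_summary(sentences, scores, max_words=14)
print(summary)
```

With a 14-word budget, the two highest-scoring sentences fit and the third is dropped. Restoring document order after selection is a design choice here that keeps the extract readable; a system could equally order sentences by score.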
Acknowledgement
This work was supported by the National Natural Science Foundation of China under Grants 91546121, 61202247, 71231002 and 61472046; the EU FP7 IRSES MobileCloud Project (Grant No. 612212); the 111 Project of China under Grant B08004; the Engineering Research Center of Information Networks, Ministry of Education; the Beijing Institute of Science and Technology Information; and CapInfo Company Limited.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Huang, T., Li, L., Zhang, Y. (2016). Multilingual Multi-document Summarization with Enhanced hLDA Features. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2016, CCL 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_25
DOI: https://doi.org/10.1007/978-3-319-47674-2_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47673-5
Online ISBN: 978-3-319-47674-2
eBook Packages: Computer Science, Computer Science (R0)