Multilingual Multi-document Summarization with Enhanced hLDA Features

  • Conference paper
Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (NLP-NABD 2016, CCL 2016)

Abstract

This paper presents state-of-the-art research progress on multilingual multi-document summarization. Our method first models the documents with the hLDA (hierarchical Latent Dirichlet Allocation) algorithm. From the hLDA modeling results we derive a new feature that reflects semantic information to some extent. The method then combines this new feature with other features to score sentences. Based on the sentence scores, it extracts candidate summary sentences from the documents to generate a summary. We also test the effectiveness and robustness of the new feature through experiments. Compared with other summarization methods, our method performs better in some respects.
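The scoring-and-extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the features are combined or how the summary length is limited, so the weighted sum, the feature names (`hlda`, `position`), the weights, and the word budget are all assumptions made for illustration. The hLDA-derived feature is taken as a precomputed number per sentence, and whitespace word counting is a simplification that would need a proper tokenizer for languages such as Chinese.

```python
from typing import Dict, List, Sequence


def combine_features(
    features: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return one score per sentence as a weighted sum of feature values.

    The weighted sum is an assumed combination rule, not the paper's.
    """
    lengths = {len(values) for values in features.values()}
    if len(lengths) != 1:
        raise ValueError("every feature needs one value per sentence")
    scores = [0.0] * lengths.pop()
    for name, values in features.items():
        weight = weights.get(name, 0.0)  # features without a weight are ignored
        for i, value in enumerate(values):
            scores[i] += weight * value
    return scores


def extract_summary(
    sentences: Sequence[str],
    scores: Sequence[float],
    max_words: int,
) -> List[str]:
    """Greedily pick the highest-scoring sentences that fit the word budget."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    used = 0
    for i in ranked:
        n_words = len(sentences[i].split())  # whitespace tokens: a simplification
        if used + n_words <= max_words:
            chosen.append(i)
            used += n_words
    # Restore document order so the summary reads naturally.
    return [sentences[i] for i in sorted(chosen)]
```

With per-sentence feature lists in hand, `combine_features` produces the sentence scores and `extract_summary(sentences, scores, max_words=250)` returns the selected sentences in their original order; the budget of 250 words is only a placeholder.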

Acknowledgement

This work was supported by the National Natural Science Foundation of China under Grant 91546121, 61202247, 71231002 and 61472046; EU FP7 IRSES MobileCloud Project (Grant No. 612212); the 111 Project of China under Grant B08004; Engineering Research Center of Information Networks, Ministry of Education; Beijing Institute of Science and Technology Information; CapInfo Company Limited.

Author information

Corresponding author

Correspondence to Taiwen Huang.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Huang, T., Li, L., Zhang, Y. (2016). Multilingual Multi-document Summarization with Enhanced hLDA Features. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2016, CCL 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_25

  • DOI: https://doi.org/10.1007/978-3-319-47674-2_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47673-5

  • Online ISBN: 978-3-319-47674-2

  • eBook Packages: Computer Science (R0)
