
Automatic Generation of Multi-document Summaries Based on the Global-Best Harmony Search Metaheuristic and the LexRank Graph-Based Algorithm

  • Conference paper
Advances in Computational Intelligence (MICAI 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10633)

Abstract

Metaheuristic-based algorithms have recently shown good results in the automatic generation of multi-document summaries. This paper proposes two algorithms, LexGbhs and GbhsLex, that hybridize the Global-Best Harmony Search metaheuristic with the graph-based LexRank algorithm. The objective function to be optimized combines two features, coverage and diversity. Coverage measures the similarity between each sentence of the candidate summary and the centroid of the sentences in the document collection, while diversity measures how different the sentences that make up a candidate summary are from one another. The two proposed hybrid algorithms were compared with state-of-the-art algorithms using the ROUGE-1, ROUGE-2 and ROUGE-SU4 measures on the DUC2005 and DUC2006 data sets. In the resulting unified ranking, the proposed LexGbhs algorithm placed third, indicating that hybridizing metaheuristics with graph-based methods for extractive multi-document summarization is a promising line of research.
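The coverage and diversity features described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: sentences are represented as plain term-weight vectors, similarity is cosine similarity, diversity is taken as one minus the mean pairwise similarity, and the two features are combined with a hypothetical weight `alpha`. The function names are likewise hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of all sentence vectors in the collection."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def coverage(summary, all_sentences):
    """Mean similarity of each summary sentence to the collection centroid."""
    if not summary:
        return 0.0
    c = centroid(all_sentences)
    return sum(cosine(s, c) for s in summary) / len(summary)


def diversity(summary):
    """One minus the mean pairwise similarity of the summary sentences.

    Assumption: a summary with fewer than two sentences counts as fully diverse.
    """
    pairs = list(combinations(summary, 2))
    if not pairs:
        return 1.0
    return 1.0 - sum(cosine(a, b) for a, b in pairs) / len(pairs)


def objective(summary, all_sentences, alpha=0.5):
    """Weighted sum of coverage and diversity (alpha is a hypothetical weight)."""
    return alpha * coverage(summary, all_sentences) + (1 - alpha) * diversity(summary)


# Toy collection: two identical sentences and one on a different topic.
sentences = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]
redundant = [sentences[0], sentences[1]]
varied = [sentences[0], sentences[2]]
print(objective(redundant, sentences), objective(varied, sentences))
```

In this toy collection the redundant candidate scores higher on coverage but has zero diversity, so the varied candidate obtains the higher objective value under equal weighting. A search procedure such as Global-Best Harmony Search would explore candidate sentence subsets to maximize an objective of this general kind.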

Notes

  1. WordNet is a lexico-conceptual database of the English language, structured as a semantic network of lexical units and the relationships between them.

  2. In this proposal, prestige and centrality denote the same concept; the former is usually defined for directed graphs and the latter for undirected graphs.

Author information

Corresponding author

Correspondence to César Cuéllar.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Cuéllar, C., Mendoza, M., Cobos, C. (2018). Automatic Generation of Multi-document Summaries Based on the Global-Best Harmony Search Metaheuristic and the LexRank Graph-Based Algorithm. In: Castro, F., Miranda-Jiménez, S., González-Mendoza, M. (eds) Advances in Computational Intelligence. MICAI 2017. Lecture Notes in Computer Science, vol 10633. Springer, Cham. https://doi.org/10.1007/978-3-030-02840-4_7

  • DOI: https://doi.org/10.1007/978-3-030-02840-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02839-8

  • Online ISBN: 978-3-030-02840-4

  • eBook Packages: Computer Science, Computer Science (R0)
