Automatic Generation of Multi-document Summaries Based on the Global-Best Harmony Search Metaheuristic and the LexRank Graph-Based Algorithm

  • César Cuéllar
  • Martha Mendoza
  • Carlos Cobos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10633)

Abstract

Metaheuristic-based algorithms have recently shown good results in automatic multi-document summarization. This paper proposes two algorithms, LexGbhs and GbhsLex, that hybridize the Global-Best Harmony Search metaheuristic with the graph-based LexRank algorithm. The objective function to be optimized combines two features: coverage and diversity. Coverage measures the similarity between each sentence of the candidate summary and the centroid of the sentences in the document collection, while diversity measures how different the sentences of a candidate summary are from one another. The two hybrid algorithms were compared with state-of-the-art algorithms using the ROUGE-1, ROUGE-2 and ROUGE-SU4 measures on the DUC2005 and DUC2006 data sets. In a unified ranking, the proposed LexGbhs algorithm placed third, showing that hybridizing metaheuristics with graph-based methods for extractive multi-document summarization is a promising line of research.
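
The coverage and diversity features described above can be illustrated with a minimal sketch. It is not the authors' formulation: the bag-of-words term-frequency vectors, the cosine similarity, the averaging, the equal weighting, and all function names (`tf_vector`, `coverage`, `diversity`, `objective`) are assumptions chosen only to make the two ideas concrete.

```python
import math
from collections import Counter
from itertools import combinations


def tf_vector(sentence):
    """Bag-of-words term-frequency vector (assumed sentence representation)."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Average term weights over all sentences of the collection."""
    total = Counter()
    for v in vectors:
        total.update(v)
    n = len(vectors)
    return {term: count / n for term, count in total.items()}


def coverage(summary_vectors, collection_centroid):
    """Mean similarity of each summary sentence to the collection centroid."""
    sims = [cosine(v, collection_centroid) for v in summary_vectors]
    return sum(sims) / len(sims)


def diversity(summary_vectors):
    """One minus the mean pairwise similarity among summary sentences."""
    pairs = list(combinations(summary_vectors, 2))
    if not pairs:
        return 1.0  # assumption: a single sentence has no redundancy
    return 1.0 - sum(cosine(u, v) for u, v in pairs) / len(pairs)


def objective(sentences, selected, weight=0.5):
    """Weighted sum of coverage and diversity (the weight is an assumption)."""
    vectors = [tf_vector(s) for s in sentences]
    chosen = [vectors[i] for i in selected]
    return (weight * coverage(chosen, centroid(vectors))
            + (1.0 - weight) * diversity(chosen))


sentences = [
    "the cat sat on the mat",
    "the cat sat on the mat",
    "dogs chase balls in parks",
]
redundant_score = objective(sentences, [0, 1])
varied_score = objective(sentences, [0, 2])
```

In this toy collection, a candidate made of two identical sentences has near-zero diversity, while a candidate made of two sentences with no shared terms has maximal diversity. A search procedure such as harmony search would compare candidate sentence selections by a score of this kind.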

Keywords

Multi-document extractive summarization; Metaheuristics; Global-Best Harmony Search algorithm; LexRank algorithm; Hybrid algorithms

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Information Technology Research Group (GTI), Universidad del Cauca, Popayán, Colombia
