Calculating the Upper Bounds for Portuguese Automatic Text Summarization Using Genetic Algorithm

  • Conference paper
Advances in Artificial Intelligence - IBERAMIA 2018 (IBERAMIA 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11238)

Abstract

In recent years, Automatic Text Summarization (ATS) has become one of the main tasks in Natural Language Processing (NLP), with systems generating summaries in several languages (e.g., English, Portuguese, and Spanish). Significant advances have been made for Portuguese, as reflected in the various state-of-the-art methods proposed for it. It is essential to know how these methods perform with respect to the upper bounds (Topline), the lower bounds (Baseline-random), and other heuristics (Baseline-first). Recent works calculated the significance and upper bounds for Single-Document Summarization (SDS) and Multi-Document Summarization (MDS) using corpora from the Document Understanding Conferences (DUC). In this paper, we calculate the upper bounds for SDS in Portuguese using Genetic Algorithms (GA). Moreover, we compare several state-of-the-art methods against the upper bounds, lower bounds, and heuristics to determine their level of significance.
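The idea of bounding extractive summarization can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes pre-tokenized sentences, a simplified ROUGE-1-style unigram recall as the fitness function, and a basic GA with tournament selection, one-point crossover, bit-flip mutation, and elitism. All function names, parameters, and the toy data are hypothetical.

```python
import random
from collections import Counter

def rouge1_recall(candidate, reference):
    """Clipped unigram recall of candidate tokens against reference tokens."""
    ref, cand = Counter(reference), Counter(candidate)
    overlap = sum(min(cand[w], n) for w, n in ref.items())
    return overlap / max(1, sum(ref.values()))

def summary_tokens(sentences, mask, budget):
    """Join the selected sentences in document order, cut to the word budget."""
    tokens = [w for s, keep in zip(sentences, mask) if keep for w in s]
    return tokens[:budget]

def baseline_first(sentences, budget):
    """Heuristic baseline: the first `budget` words of the document."""
    return summary_tokens(sentences, [1] * len(sentences), budget)

def baseline_random(sentences, budget, rng):
    """Lower-bound baseline: sentences in random order, cut to the budget."""
    order = list(range(len(sentences)))
    rng.shuffle(order)
    return [w for i in order for w in sentences[i]][:budget]

def ga_topline(sentences, reference, budget, pop_size=30,
               generations=60, mutation_rate=0.1, seed=0):
    """Search for the sentence subset that maximizes recall (assumed setup)."""
    rng = random.Random(seed)
    n = len(sentences)

    def score(mask):
        return rouge1_recall(summary_tokens(sentences, mask, budget), reference)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    # Seed one individual with all sentences selected; after truncation it
    # equals baseline_first, so the result can never score below it.
    population[0] = [1] * n
    best = max(population, key=score)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: keep the best individual unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of size 2 for each parent.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            cut = rng.randint(1, n - 1) if n > 1 else 0  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=score)
    return best, score(best)

# Toy document (hypothetical data, accents omitted for simplicity).
sentences = [
    ["o", "gato", "dorme"],
    ["a", "bolsa", "caiu", "hoje"],
    ["o", "tempo", "esta", "bom"],
    ["investidores", "venderam", "acoes"],
]
reference = ["bolsa", "caiu", "investidores", "venderam", "acoes"]
budget = 7

mask, top = ga_topline(sentences, reference, budget)
first = rouge1_recall(baseline_first(sentences, budget), reference)
rand = rouge1_recall(baseline_random(sentences, budget, random.Random(1)), reference)
print("topline:", top, "baseline-first:", first, "baseline-random:", rand)
```

A system's score can then be placed between the random baseline and the topline to judge how much of the achievable range it covers. Seeding the population with the all-sentences mask is a design choice of this sketch that keeps the topline at or above Baseline-first.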

Notes

  1. DUC website: https://www-nlpir.nist.gov/projects/duc/, TAC website: https://tac.nist.gov/.

  2. http://www.nilc.icmc.usp.br/nilc/index.php.

  3. https://www.linguateca.pt/Repositorio/TeMario/.

  4. Each segmentation can be downloaded from https://gitlab.com/JohnRojas/Corpus-TeMario.

  5. http://conteudo.icmc.usp.br/pessoas/taspardo/SENTER_Por.zip.

  6. http://www.shvoong.com/summarizer/ (URL viewed May 7th, 2017).

  7. https://github.com/neopunisher/Open-Text-Summarizer/ (URL viewed February 10th, 2018).

Acknowledgements

This work was partially supported by the Mexican Government's CONACyT Thematic Network program (Language Technologies Thematic Network project 295022). We also thank UAEMex for their support.

Author information

Corresponding authors

Correspondence to Jonathan Rojas-Simón, Yulia Ledeneva or René Arnulfo García-Hernández.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Rojas-Simón, J., Ledeneva, Y., García-Hernández, R.A. (2018). Calculating the Upper Bounds for Portuguese Automatic Text Summarization Using Genetic Algorithm. In: Simari, G., Fermé, E., Gutiérrez Segura, F., Rodríguez Melquiades, J. (eds) Advances in Artificial Intelligence - IBERAMIA 2018. IBERAMIA 2018. Lecture Notes in Computer Science, vol 11238. Springer, Cham. https://doi.org/10.1007/978-3-030-03928-8_36

  • DOI: https://doi.org/10.1007/978-3-030-03928-8_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03927-1

  • Online ISBN: 978-3-030-03928-8

  • eBook Packages: Computer Science, Computer Science (R0)
