A Comparison of Automatic Summarizers of Texts in Brazilian Portuguese

  • Conference paper
Advances in Artificial Intelligence – SBIA 2004

Abstract

Automatic Summarization (AS) has only recently become a significant research topic in Brazil. Compared with initiatives for other languages, this delay can be explained by the lack of specific resources, such as expressive lexicons and corpora, that could provide adequate foundations for deep or shallow approaches to AS. Taking advantage of shared resources and a common corpus of texts and summaries written in Brazilian Portuguese, two NLP research groups decided to start a common task to assess and compare their AS systems. In the experiment, five distinct extractive AS systems were assessed. Some of them incorporate techniques that have already been used to summarize texts in English; others propose novel approaches to AS. Two baseline systems were also considered. An overall performance comparison was carried out, and its outcomes are discussed in this paper.

The Brazilian Agencies FAPESP and PIBIC-CNPQ supported this research.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rino, L.H.M., Pardo, T.A.S., Nascimento Silla, C., Kaestner, C.A.A., Pombo, M. (2004). A Comparison of Automatic Summarizers of Texts in Brazilian Portuguese. In: Bazzan, A.L.C., Labidi, S. (eds) Advances in Artificial Intelligence – SBIA 2004. SBIA 2004. Lecture Notes in Computer Science, vol 3171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28645-5_24

  • DOI: https://doi.org/10.1007/978-3-540-28645-5_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23237-7

  • Online ISBN: 978-3-540-28645-5

  • eBook Packages: Springer Book Archive
