TALAA-ATSF: A Global Operation-Based Arabic Text Summarization Framework

Intelligent Natural Language Processing: Trends and Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 740)

Abstract

Text summarization is one of the most challenging tasks in natural language processing and, more generally, in artificial intelligence. Approaches proposed in the literature fall into two categories: extractive and abstractive text summarization. The vast majority of existing work follows the extractive approach, probably because of the complexity of the abstractive one. To the best of our knowledge, the work presented here is the first on Arabic to handle both the extractive and abstractive aspects. The literature lacks summarization frameworks that integrate various operations within a single system; this work proposes a novel approach in which we design a general framework that does so. The framework also provides a mechanism that assigns the suitable operation to each portion of the source text to be summarized, through an iterative process.

Notes

  1. In the remainder of this paper, whenever we use the term “operation(s)”, it stands for “summarization operation(s)”.

Author information

Corresponding authors

Correspondence to Riadh Belkebir or Ahmed Guessoum.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Belkebir, R., Guessoum, A. (2018). TALAA-ATSF: A Global Operation-Based Arabic Text Summarization Framework. In: Shaalan, K., Hassanien, A., Tolba, F. (eds) Intelligent Natural Language Processing: Trends and Applications. Studies in Computational Intelligence, vol 740. Springer, Cham. https://doi.org/10.1007/978-3-319-67056-0_21

  • DOI: https://doi.org/10.1007/978-3-319-67056-0_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67055-3

  • Online ISBN: 978-3-319-67056-0

  • eBook Packages: Engineering, Engineering (R0)
