TALAA-ATSF: A Global Operation-Based Arabic Text Summarization Framework

Belkebir, Riadh; Guessoum, Ahmed

doi:10.1007/978-3-319-67056-0_21

Part of the book series: Studies in Computational Intelligence (SCI, volume 740)

Abstract

Text summarization is one of the most challenging tasks in natural language processing, and in artificial intelligence more generally. Approaches in the literature fall into two categories: extractive and abstractive text summarization. The vast majority of existing work follows the extractive approach, probably because of the complexity of the abstractive one. To the best of our knowledge, the work presented here is the first on Arabic that handles both the extractive and abstractive aspects. The literature lacks summarization frameworks that integrate various operations within the same system; this work proposes a novel approach in the form of a general framework that does so. The framework also provides a mechanism that assigns a suitable operation to each portion of the source text to be summarized, through an iterative process.
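
As an illustration only, the idea of iteratively assigning an operation to each portion of the source text can be sketched as follows. This is not the authors' implementation: the operation set (`keep`, `delete`, `compress`), the selection rule in `choose_operation`, the word budget, and the stopping condition are all hypothetical placeholders chosen to make the loop concrete. The abstract does not say which operations the framework uses or how it selects among them.

```python
import re
from typing import Callable, Dict, List


# Hypothetical operations: each maps a text portion to a (possibly shorter) text.
def keep(text: str) -> str:
    return text


def delete(text: str) -> str:
    return ""


def compress(text: str) -> str:
    # Toy compression: drop parenthesised asides.
    return re.sub(r"\s*\([^)]*\)", "", text)


OPERATIONS: Dict[str, Callable[[str], str]] = {
    "keep": keep,
    "delete": delete,
    "compress": compress,
}


def choose_operation(portion: str, over_budget: bool) -> str:
    # Placeholder hand-written rule (an assumption for this sketch only).
    if "(" in portion:
        return "compress"
    if over_budget and len(portion.split()) < 4:
        return "delete"
    return "keep"


def summarize(portions: List[str], budget_words: int, max_iters: int = 10) -> List[str]:
    current = list(portions)
    for _ in range(max_iters):
        total = sum(len(p.split()) for p in current)
        over_budget = total > budget_words
        # Assign one operation to every portion, then apply it.
        names = [choose_operation(p, over_budget) for p in current]
        updated = [OPERATIONS[name](p) for name, p in zip(names, current)]
        updated = [p for p in updated if p]  # drop deleted portions
        if updated == current:  # nothing changed: stop iterating
            break
        current = updated
    return current
```

Calling `summarize(portions, budget_words=15)` on a list of sentences repeatedly reassigns operations until a pass changes nothing. The selection rule and the operations can be swapped without touching the loop, which is the only property of an operation-based framework that this sketch is meant to show.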

Notes

1. In the remainder of this paper, the term “operation(s)” stands for “summarization operation(s)”.

Author information

Authors and Affiliations

Natural Language Processing and Machine Learning Research Group, Laboratory for Research in Artificial Intelligence, Computer Science Department, University of Science and Technology Houari Boumediene (USTHB), BP 32 El-Alia, 16111, Bab Ezzouar, Algiers, Algeria
Riadh Belkebir & Ahmed Guessoum

Corresponding authors

Correspondence to Riadh Belkebir or Ahmed Guessoum.

Editor information

Editors and Affiliations

The British University in Dubai, Dubai, United Arab Emirates
Khaled Shaalan
Faculty of Computers and Information Technology, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Faculty of Computers and Information, Ain Shams University, Cairo, Egypt
Fahmy Tolba

About this chapter

Cite this chapter

Belkebir, R., Guessoum, A. (2018). TALAA-ATSF: A Global Operation-Based Arabic Text Summarization Framework. In: Shaalan, K., Hassanien, A., Tolba, F. (eds) Intelligent Natural Language Processing: Trends and Applications. Studies in Computational Intelligence, vol 740. Springer, Cham. https://doi.org/10.1007/978-3-319-67056-0_21

DOI: https://doi.org/10.1007/978-3-319-67056-0_21
Published: 18 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67055-3
Online ISBN: 978-3-319-67056-0
eBook Packages: Engineering, Engineering (R0)
