Abstract
Text summarization is one of the most challenging tasks in natural language processing and in artificial intelligence more generally. Various approaches have been proposed in the literature, and they fall into two categories: extractive and abstractive summarization. The vast majority of existing work follows the extractive approach, probably because of the complexity of the abstractive one. To the best of our knowledge, the work presented here is the first on Arabic that handles both the extractive and abstractive aspects. The literature lacks summarization frameworks that integrate various operations within the same system; this work proposes a novel approach in which we design a general framework that does so. The framework also provides a mechanism that assigns the suitable operation to each portion of the source text to be summarized, through an iterative process.
Notes
1. In the remainder of this paper, the term “operation(s)” stands for “summarization operation(s)”.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Belkebir, R., Guessoum, A. (2018). TALAA-ATSF: A Global Operation-Based Arabic Text Summarization Framework. In: Shaalan, K., Hassanien, A., Tolba, F. (eds) Intelligent Natural Language Processing: Trends and Applications. Studies in Computational Intelligence, vol 740. Springer, Cham. https://doi.org/10.1007/978-3-319-67056-0_21
DOI: https://doi.org/10.1007/978-3-319-67056-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67055-3
Online ISBN: 978-3-319-67056-0
eBook Packages: Engineering, Engineering (R0)