Abstract
In this paper we present a fresh look at the problem of summarizing evolving events from multiple sources. After discussing the nature of evolving events, we introduce a distinction between linearly and non-linearly evolving events. We then present a general methodology for the automatic creation of summaries of evolving events. At its heart lie the notions of Synchronic and Diachronic cross-document Relations (SDRs), whose aim is to identify similarities and differences between sources from a synchronic and a diachronic perspective. SDRs do not connect documents or textual elements found therein, but structures one might call messages. Applying this methodology yields a set of messages and the SDRs connecting them, that is, a graph which we call a grid. We show how such a grid can be considered the starting point of a Natural Language Generation system. The methodology is evaluated in two case studies, one for linearly evolving events (descriptions of football matches) and one for non-linearly evolving events (terrorist incidents involving hostages). In both cases we evaluate the results produced by our computational systems.
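To make the notion of a grid more concrete, the following is a minimal sketch of messages connected by SDRs, not the system evaluated in the paper. It assumes, for illustration only, that a synchronic relation links messages from different sources at the same time point and that a diachronic relation links messages from the same source at different time points. The class names `Message` and `Grid`, the message type `performance`, and the relation names `disagreement` and `stability` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """A structured unit extracted from a report (all field names are hypothetical)."""
    msg_type: str      # illustrative message type, e.g. "performance"
    source: str        # identifier of the reporting source
    time: int          # discrete time index of the report
    args: tuple = ()   # illustrative arguments of the message


@dataclass
class Grid:
    """A graph whose nodes are messages and whose edges are SDRs."""
    messages: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (kind, name, index_a, index_b)

    def add_message(self, message: Message) -> int:
        """Store a message and return its node index."""
        self.messages.append(message)
        return len(self.messages) - 1

    def relate(self, name: str, a: int, b: int) -> str:
        """Add a named relation between two messages and return its kind."""
        first, second = self.messages[a], self.messages[b]
        # Assumption of this sketch: same time, different sources -> synchronic.
        if first.time == second.time and first.source != second.source:
            kind = "synchronic"
        # Assumption of this sketch: same source, different times -> diachronic.
        elif first.time != second.time and first.source == second.source:
            kind = "diachronic"
        else:
            raise ValueError("pair is neither synchronic nor diachronic in this sketch")
        self.relations.append((kind, name, a, b))
        return kind


grid = Grid()
m0 = grid.add_message(Message("performance", "source_A", 1, ("team_X", "good")))
m1 = grid.add_message(Message("performance", "source_B", 1, ("team_X", "bad")))
m2 = grid.add_message(Message("performance", "source_A", 2, ("team_X", "good")))

print(grid.relate("disagreement", m0, m1))  # two sources, same time point
print(grid.relate("stability", m0, m2))     # one source, two time points
```

In this toy example the first relation is classified as synchronic and the second as diachronic. The resulting lists of messages and relations form the kind of graph that a generation component could traverse.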
Cite this article
Afantenos, S.D., Karkaletsis, V., Stamatopoulos, P. et al. Using synchronic and diachronic relations for summarizing multiple documents describing evolving events. J Intell Inf Syst 30, 183–226 (2008). https://doi.org/10.1007/s10844-006-0025-9
DOI: https://doi.org/10.1007/s10844-006-0025-9