Abstract
This chapter describes how to build a knowledge-based system that composes new sentences to summarize multiple documents. The system also identifies the main topics of the given documents and derives new concepts from the text. To process documents conceptually and create abstractive summaries, the system uses the Cyc development platform, which comprises the world’s largest knowledge base and one of the most powerful inference engines. The resulting knowledge-based system first applies natural language processing techniques to extract the syntactic structure of the documents and then maps the words of each sentence to related concepts in the knowledge base. It then uses the inference engine to generalize and fuse concepts into more abstract concepts. Because a word can map to multiple concepts, the system also includes new techniques that handle word-sense disambiguation by using concept weights. After generalization, the system identifies the main topics and the key concepts of the documents. It then composes new sentences from the key concepts by linking subject concepts with their related predicate concepts. The syntactic structure of the newly created sentences extends beyond simple subject-predicate-object triplets by incorporating adjective and adverb modifiers. In the final stage, the linked concepts are mapped back to words to form the abstractive sentences. The system has been implemented as a process of seven stages: syntactic analysis, word mapping, concept propagation, concept weight and relation accumulation, topic derivation, subject identification, and new sentence generation. The implementation has been tested on various documents and webpages. The test results show that the system can create new sentences that include abstracted concepts not explicitly mentioned in the original documents and that synthesize information from different parts of the documents into a summary.
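The ideas of concept propagation, concept weight accumulation, and weight-based word-sense disambiguation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the toy ontology, the word-to-concept table, the decay factor, and the function names are hypothetical and do not reproduce the Cyc knowledge base, its inference engine, or the weighting scheme used in the chapter.

```python
from collections import Counter

# Hypothetical toy ontology (child -> parent); a stand-in, NOT the Cyc knowledge base.
PARENT = {
    "Dog": "Mammal",
    "Cat": "Mammal",
    "BatAnimal": "Mammal",
    "Mammal": "Animal",
    "BaseballBat": "SportsEquipment",
}

# Hypothetical word-to-concept mapping; "bat" is ambiguous on purpose.
WORD_TO_CONCEPTS = {
    "dog": ["Dog"],
    "cat": ["Cat"],
    "bat": ["BatAnimal", "BaseballBat"],
}


def ancestors(concept):
    """Return the chain of more general concepts above `concept`."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def accumulate_weights(words, decay=0.8):
    """Map words to concepts and propagate decayed weight to their ancestors.

    The decay factor is an assumed heuristic that keeps the ontology root
    from always winning; an ambiguous word splits its weight evenly.
    """
    weights = Counter()
    for word in words:
        candidates = WORD_TO_CONCEPTS.get(word, [])
        for concept in candidates:
            share = 1.0 / len(candidates)
            weights[concept] += share
            for depth, ancestor in enumerate(ancestors(concept), start=1):
                weights[ancestor] += share * decay ** depth
    return weights


def main_topic(weights):
    """Pick the concept with the highest accumulated weight."""
    return max(weights, key=weights.get)


def disambiguate(word, weights):
    """Choose the sense whose own weight plus ancestor weights is largest."""
    def support(concept):
        return weights[concept] + sum(weights[a] for a in ancestors(concept))
    return max(WORD_TO_CONCEPTS[word], key=support)


if __name__ == "__main__":
    weights = accumulate_weights(["dog", "cat", "bat"])
    print(main_topic(weights))           # most heavily weighted concept
    print(disambiguate("bat", weights))  # sense best supported by context
```

In this toy run the derived topic is a concept that never appears as a word in the input, which mirrors the abstract's claim that summaries can include abstracted concepts not explicitly mentioned in the documents. The ambiguous word is resolved toward the sense that shares ancestors with the other words.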
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Timofeyev, A., Choi, B. (2019). Knowledge Based System for Composing Sentences to Summarize Documents. In: Fred, A., et al. Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2017. Communications in Computer and Information Science, vol 976. Springer, Cham. https://doi.org/10.1007/978-3-030-15640-4_9
DOI: https://doi.org/10.1007/978-3-030-15640-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15639-8
Online ISBN: 978-3-030-15640-4
eBook Packages: Computer Science, Computer Science (R0)