Knowledge Based System for Composing Sentences to Summarize Documents

Timofeyev, Andrey; Choi, Ben

doi:10.1007/978-3-030-15640-4_9

Conference paper
First Online: 15 March 2019

Part of the book series: Communications in Computer and Information Science (CCIS, volume 976)

Abstract

This chapter details how to build a knowledge-based system that composes new sentences to summarize multiple documents. The system also identifies the main topics of the given documents and derives new concepts from the given text data. To process the documents conceptually and create abstractive summaries, the system uses the Cyc development platform, which consists of the world's largest knowledge base and one of the most powerful inference engines. The resulting knowledge-based system first uses natural language processing techniques to extract the syntactic structure of the documents and then maps the words of the sentences to related concepts in the knowledge base. It then uses the inference engine to generalize and fuse concepts into more abstract concepts. Since a word can map to multiple concepts, the system also includes new techniques that handle word-sense disambiguation by using concept weights. After the generalization, the system identifies the main topics and the key concepts of the documents. It then composes new sentences from the key concepts by linking subject concepts with their related predicate concepts. The syntactic structure of the newly created sentences extends beyond simple subject-predicate-object triplets by incorporating adjective and adverb modifiers. In the final stage, the linked concepts are mapped back to words to form the abstractive sentences. The system has been implemented and tested. The implementation encodes a process that consists of seven stages: syntactic analysis, word mapping, concept propagation, concept weights and relations accumulation, topic derivation, subject identification, and new sentence generation. The implementation has been tested on various documents and webpages. The test results showed that the system creates new sentences that include abstracted concepts not explicitly mentioned in the original documents and that contain information synthesized from different parts of the documents to compose a summary.
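The flow from word mapping through sentence generation can be illustrated with a minimal sketch. It is not the authors' implementation and does not use Cyc: the toy hierarchy `PARENT`, the lookup tables `WORD_TO_CONCEPTS` and `CONCEPT_TO_WORD`, the `DECAY` factor, and the even split of weight across word senses are all assumptions made for illustration, and the input is assumed to be already parsed into subject-predicate-object triplets.

```python
from collections import defaultdict
from itertools import product

DECAY = 0.9  # assumed penalty applied at each generalization step

# Hypothetical toy hierarchy (child -> parent), standing in for a real knowledge base.
PARENT = {"Dog": "Canine", "Wolf": "Canine", "Canine": "Animal",
          "Bone": "Food", "Meat": "Food"}

# Hypothetical word-to-concept table; a word may list several candidate senses.
WORD_TO_CONCEPTS = {"dog": ["Dog"], "wolf": ["Wolf"], "bone": ["Bone"],
                    "meat": ["Meat"], "eat": ["EatingEvent"], "chew": ["EatingEvent"]}

# Hypothetical concept-to-word table used when generating the sentence.
CONCEPT_TO_WORD = {"Canine": "canines", "Animal": "animals",
                   "Food": "food", "EatingEvent": "eat"}


def candidates(word):
    """Yield (concept, weight) for each sense of a word and each of its ancestors."""
    senses = WORD_TO_CONCEPTS.get(word, [])
    for sense in senses:
        concept, factor = sense, 1.0
        while concept is not None:
            # Weight is split evenly across senses (an assumption of this sketch).
            yield concept, factor / len(senses)
            concept = PARENT.get(concept)
            factor *= DECAY


def summarize(triplets):
    """Return (topic concept, generated sentence) for parsed (subj, pred, obj) triplets."""
    subject_weight = defaultdict(float)
    relation_weight = defaultdict(lambda: defaultdict(float))
    for subj, pred, obj in triplets:
        subj_candidates = list(candidates(subj))
        # Accumulate concept weights for subjects, including generalized concepts.
        for concept, weight in subj_candidates:
            subject_weight[concept] += weight
        # Accumulate relations between subject, predicate, and object concepts.
        for (s, ws), (p, wp), (o, wo) in product(
                subj_candidates, candidates(pred), candidates(obj)):
            relation_weight[s][(p, o)] += ws * wp * wo
    if not subject_weight:
        return None, None
    # Topic derivation: the highest-weighted subject concept.
    topic = max(subject_weight, key=subject_weight.get)
    relations = relation_weight[topic]
    if not relations:
        return topic, None
    # Link the topic with its highest-weighted predicate and object concepts.
    pred, obj = max(relations, key=relations.get)
    words = [CONCEPT_TO_WORD.get(c, c.lower()) for c in (topic, pred, obj)]
    return topic, " ".join(words).capitalize() + "."


if __name__ == "__main__":
    parsed = [("dog", "eat", "bone"), ("wolf", "eat", "meat"), ("dog", "chew", "bone")]
    print(summarize(parsed))
```

With the three toy triplets above, the accumulated weights favor the generalized concepts `Canine` and `Food` over any single input word, so the generated sentence uses words that do not appear in the input. Adjective and adverb modifiers, which the described system also handles, are left out of this sketch.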

Author information

Authors and Affiliations

Computer Science, Louisiana Tech University, Ruston, USA
Andrey Timofeyev & Ben Choi

Corresponding author

Correspondence to Andrey Timofeyev.

Editor information

Editors and Affiliations

Instituto de Telecomunicações, Lisbon, Portugal
Ana Fred
University of Madeira, Funchal, Portugal
David Aveiro
Delft University of Technology, Delft, The Netherlands
Jan L. G. Dietz
Henley Business School, University of Reading, Reading, UK
Kecheng Liu
University of Coimbra, Coimbra, Portugal
Jorge Bernardino
Federal University of Pernambuco, Recife, Brazil
Ana Salgado
INSTICC and Instituto Politecnico de Setúbal, Setúbal, Portugal
Joaquim Filipe

About this paper

Cite this paper

Timofeyev, A., Choi, B. (2019). Knowledge Based System for Composing Sentences to Summarize Documents. In: Fred, A., et al. Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2017. Communications in Computer and Information Science, vol 976. Springer, Cham. https://doi.org/10.1007/978-3-030-15640-4_9

DOI: https://doi.org/10.1007/978-3-030-15640-4_9
Published: 15 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15639-8
Online ISBN: 978-3-030-15640-4
eBook Packages: Computer Science, Computer Science (R0)