Knowledge Based System for Composing Sentences to Summarize Documents

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 976)

Abstract

This chapter describes how to build a knowledge-based system that composes new sentences to summarize multiple documents. The system also identifies the main topics of the given documents and derives new concepts from the text. To process the documents conceptually and create abstractive summaries, the system uses the Cyc development platform, which consists of the world’s largest knowledge base and one of the most powerful inference engines. The resulting knowledge-based system first uses natural language processing techniques to extract the syntactic structure of the documents and then maps the words of the sentences to related concepts in the knowledge base. It then uses the inference engine to generalize and fuse concepts into more abstract concepts. Because a word can map to multiple concepts, the system also includes new techniques that handle word-sense disambiguation by using concept weights. After the generalization, the system identifies the main topics and the key concepts of the documents. It then composes new sentences from the key concepts by linking subject concepts with their related predicate concepts. The syntactic structure of the newly created sentences extends beyond simple subject-predicate-object triplets by incorporating adjective and adverb modifiers. In the final stage, the linked concepts are mapped back to words to form the abstractive sentences. The system has been implemented and tested. The implementation encodes a process that consists of seven stages: syntactic analysis, word mapping, concept propagation, concept weights and relations accumulation, topic derivation, subject identification, and new sentence generation. The implementation has been tested on various documents and webpages. The test results showed that the system can create new sentences that include abstracted concepts not explicitly mentioned in the original documents and that synthesize information from different parts of the documents into a summary.
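
The core ideas in the abstract (mapping words to concepts, propagating weights to more general concepts, using those weights for word-sense disambiguation and topic derivation, and composing a sentence with modifiers) can be illustrated with a minimal Python sketch. This is not the authors' implementation and it does not call Cyc. The toy ontology `PARENT`, the lexicon `LEXICON`, the equal split of weight across senses, and every function name below are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical toy data standing in for the Cyc knowledge base.
# PARENT links each concept to its more general concept.
PARENT = {
    "Dog": "Mammal",
    "Cat": "Mammal",
    "BatAnimal": "Mammal",
    "Mammal": "Animal",
    "BaseballBat": "SportsEquipment",
}
# LEXICON maps a word to every concept it may denote ("bat" is ambiguous).
LEXICON = {
    "dog": ["Dog"],
    "puppy": ["Dog"],
    "cat": ["Cat"],
    "bat": ["BatAnimal", "BaseballBat"],
}


def ancestors(concept):
    """Return the chain of more general concepts above `concept`."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def concept_weights(words):
    """Map words to concepts and propagate their weight to all ancestors."""
    weights = Counter()
    for word in words:
        candidates = LEXICON.get(word, [])
        for concept in candidates:
            share = 1.0 / len(candidates)  # assumed: split weight across senses
            for node in [concept] + ancestors(concept):
                weights[node] += share
    return weights


def disambiguate(word, weights):
    """Pick the sense whose ancestors gathered the most weight."""
    return max(LEXICON[word],
               key=lambda c: sum(weights[a] for a in ancestors(c)))


def main_topic(weights):
    """Pick the heaviest general concept; ties go to the more specific one."""
    general = set(PARENT.values())
    return max(general, key=lambda c: (weights[c], len(ancestors(c))))


def compose(subject, predicate, obj, adjective="", adverb=""):
    """Link subject, predicate, and object words with optional modifiers."""
    parts = [adjective, subject, adverb, predicate, obj]
    return " ".join(p for p in parts if p).capitalize() + "."


if __name__ == "__main__":
    weights = concept_weights(["dog", "cat", "bat", "puppy"])
    print(disambiguate("bat", weights))
    print(main_topic(weights))
    print(compose("mammals", "eat", "food", adjective="domestic", adverb="often"))
```

In this sketch the ambiguous word "bat" resolves to the animal sense because the surrounding words have already pushed weight onto the shared ancestor concepts, and the derived topic is a concept that never appears as a word in the input. A real system would replace the two dictionaries with queries to a knowledge base and an inference engine.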

Author information

Corresponding author

Correspondence to Andrey Timofeyev.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Timofeyev, A., Choi, B. (2019). Knowledge Based System for Composing Sentences to Summarize Documents. In: Fred, A., et al. Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2017. Communications in Computer and Information Science, vol 976. Springer, Cham. https://doi.org/10.1007/978-3-030-15640-4_9

  • DOI: https://doi.org/10.1007/978-3-030-15640-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-15639-8

  • Online ISBN: 978-3-030-15640-4

  • eBook Packages: Computer Science, Computer Science (R0)
