Summary Generation Centered on Important Words

  • Conference paper
Information Retrieval Technology (AIRS 2004)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3411)

Abstract

We developed ABISYS, a summarization system built on the output of the semantic analysis system SAGE. ABISYS extracts important words from an article and generates summary sentences according to the word meanings and the deep cases among the words in the SAGE output. In this paper, we define five kinds of scores that evaluate the importance of a word, based respectively on repetition information, context information, position information, opinion-word information, and topic-focus information. We first calculate these scores for each substantive and plot them as a point in a five-dimensional space. The probability that each substantive is important is then calculated using the Mahalanobis generalized distance. Finally, we complement the indispensable cases for the verbs and Sahen nouns that have been selected as important words, and use them as the summary element words to generate easy-to-read Japanese sentences. We carried out a subjective evaluation of our system's output with reference to human-written summaries. Compared with subjective evaluations reported for other summarization systems, readability was on a par with the other systems, while content coverage was much better. In addition, 95% of the summary sentences generated by ABISYS were judged to be correct Japanese sentences.
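
The scoring step can be illustrated with a minimal sketch, under stated assumptions. Each substantive is treated as a vector of five scores, and its squared Mahalanobis distance D^2 = (x - mu)^T S^-1 (x - mu) is measured from a reference group with mean mu and covariance S. The abstract does not say which reference group is used or how the distance is turned into a probability, so the sketch stops at the distance. All function names here are hypothetical and are not taken from ABISYS.

```python
from statistics import fmean


def mean_vector(rows):
    """Component-wise mean of a list of equal-length score vectors."""
    return [fmean(col) for col in zip(*rows)]


def covariance_matrix(rows, mu):
    """Sample covariance matrix (divides by n - 1) of the score vectors."""
    n, d = len(rows), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    return [[c / (n - 1) for c in cov_row] for cov_row in cov]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance of x from a group with mean mu and covariance cov."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    weighted = solve(cov, diff)  # equals cov^-1 @ diff without forming the inverse
    return sum(d * w for d, w in zip(diff, weighted))


# Two-dimensional check with a hand-picked covariance (illustrative values only).
print(mahalanobis_sq([2.0, 0.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]]))  # 1.0
```

For five scores per word, the same functions apply unchanged: estimate `mu` and `cov` from the five-dimensional vectors of a reference group, then call `mahalanobis_sq` for each candidate word. The reference group needs enough linearly independent vectors for the covariance matrix to be invertible.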

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Han, D., Noguchi, T., Yago, T., Harada, M. (2005). Summary Generation Centered on Important Words. In: Myaeng, S.H., Zhou, M., Wong, KF., Zhang, HJ. (eds) Information Retrieval Technology. AIRS 2004. Lecture Notes in Computer Science, vol 3411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31871-2_5

  • DOI: https://doi.org/10.1007/978-3-540-31871-2_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25065-4

  • Online ISBN: 978-3-540-31871-2

  • eBook Packages: Computer Science, Computer Science (R0)
