Automatic Summarization of Chinese and English Parallel Documents

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2911)

Abstract

As a result of the rapid growth in Internet access, significantly more information has become available online in real time. However, users do not have sufficient time to read large volumes of information and make decisions accordingly. The problem of information overload can be addressed through automatic summarization. Many summarization systems for documents in different languages have been implemented. However, the performance of a summarization system on documents in different languages has not yet been investigated. In this paper, we compare the results of the fractal summarization technique on parallel documents in Chinese and English. The grammatical and lexical differences between Chinese and English have a significant effect on the summarization process, and we compare their impact on summarization performance for the Chinese and English parallel documents.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, F.L., Yang, C.C. (2003). Automatic Summarization of Chinese and English Parallel Documents. In: Sembok, T.M.T., Zaman, H.B., Chen, H., Urs, S.R., Myaeng, SH. (eds) Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access. ICADL 2003. Lecture Notes in Computer Science, vol 2911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24594-0_5

  • DOI: https://doi.org/10.1007/978-3-540-24594-0_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20608-8

  • Online ISBN: 978-3-540-24594-0

  • eBook Packages: Springer Book Archive
