Text Summarization as Controlled Search

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2338)

Abstract

We present a framework for text summarization based on the generate-and-test model. A large set of summaries is generated for all plausible values of six parameters that control a three-stage process that includes segmentation and keyphrase extraction, and a number of features that characterize the document. Quality is assessed by measuring the summaries against the abstract of the summarized document. The large number of summaries produced for our corpus dictates automated validation and fine-tuning of the summary generator. We use supervised machine learning to detect good and bad parameters. In particular, we identify parameters and ranges of their values within which the summary generator might be used with high reliability on documents for which no author’s abstract exists.
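The generate-and-test loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' system. The toy `summarize` function, the two parameters `max_sentences` and `min_hits`, and the word-overlap score are all assumptions made for illustration. The paper itself uses six parameters and a three-stage process whose details are not given here.

```python
from itertools import product
import re


def tokens(text):
    """Lowercased set of alphabetic words in text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def summarize(sentences, keyphrases, max_sentences, min_hits):
    """Toy generator (hypothetical): keep sentences that contain at least
    min_hits keyphrases, in document order, up to max_sentences."""
    picked = [s for s in sentences if len(tokens(s) & keyphrases) >= min_hits]
    return picked[:max_sentences]


def overlap_score(summary, abstract):
    """Assumed quality measure: fraction of abstract words found in the summary."""
    ref = tokens(abstract)
    got = tokens(" ".join(summary))
    return len(ref & got) / len(ref) if ref else 0.0


def generate_and_test(sentences, keyphrases, abstract, grid):
    """Generate one summary per parameter combination and score each
    against the author's abstract. Returns (params, score) pairs, best first."""
    names = sorted(grid)
    results = []
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        summary = summarize(sentences, keyphrases, **params)
        results.append((params, overlap_score(summary, abstract)))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Illustrative data only; not taken from the paper's corpus.
doc = [
    "Summaries are generated by search.",
    "The weather was pleasant.",
    "Keyphrases guide sentence selection.",
]
keyphrases = {"summaries", "keyphrases", "search"}
abstract = "Search over summaries guided by keyphrases."
grid = {"max_sentences": [1, 2], "min_hits": [1, 2]}

results = generate_and_test(doc, keyphrases, abstract, grid)
best_params, best_score = results[0]
print(best_params, round(best_score, 3))
```

Each (parameters, score) pair could then be labelled good or bad by thresholding the score and passed to a supervised learner, which is the role the abstract assigns to machine learning: finding parameter ranges that tend to yield good summaries.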

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Copeck, T., Japkowicz, N., Szpakowicz, S. (2002). Text Summarization as Controlled Search. In: Cohen, R., Spencer, B. (eds) Advances in Artificial Intelligence. Canadian AI 2002. Lecture Notes in Computer Science, vol 2338. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47922-8_22

  • DOI: https://doi.org/10.1007/3-540-47922-8_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43724-6

  • Online ISBN: 978-3-540-47922-2

  • eBook Packages: Springer Book Archive
