Sentence Retrieval with LSI and Topic Identification

  • Conference paper
Advances in Information Retrieval (ECIR 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3936)

Abstract

This paper presents two sentence retrieval methods. We adopt the task definition of the TREC Novelty Track: sentence retrieval consists of extracting the sentences relevant to a query from a set of documents relevant to that query. We compare the Latent Semantic Indexing (LSI) retrieval model with a topic identification method that is also based on Singular Value Decomposition (SVD) but uses a different sentence selection method. We evaluate both on the TREC Novelty Track collections from 2002 and 2003. The experimental results show that these techniques, particularly sentence retrieval based on topic identification, are valid alternatives to the more ad hoc methods devised for this task.
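To make the LSI side of the comparison concrete, the following is a minimal sketch of LSI-style sentence retrieval, not the authors' implementation. It assumes raw term counts, whitespace tokenization, a hypothetical function name `lsi_rank_sentences`, and an arbitrary rank `k`; the abstract specifies none of these. The topic identification method is not sketched because the abstract does not describe its sentence selection step.

```python
import numpy as np


def build_term_sentence_matrix(sentences):
    """Return a term-by-sentence count matrix and the term -> row index map."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[index[w], j] += 1.0
    return A, index


def lsi_rank_sentences(sentences, query, k=2):
    """Rank sentences by cosine similarity to the query in a rank-k latent space.

    Returns a list of (sentence_index, score) pairs, best first.
    The weighting scheme and the value of k are illustrative assumptions.
    """
    A, index = build_term_sentence_matrix(sentences)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Keep at most k dimensions, and only those with non-zero singular values.
    k = min(k, int(np.sum(s > 1e-12)))
    Uk, sk = U[:, :k], s[:k]
    Vk = Vt[:k, :].T  # one row per sentence in the latent space

    # Fold the query into the latent space: q_hat = q^T U_k S_k^{-1}.
    q = np.zeros(A.shape[0])
    for w in query.lower().split():
        if w in index:
            q[index[w]] += 1.0
    q_hat = (q @ Uk) / sk

    # Cosine similarity between the folded-in query and each sentence vector.
    eps = 1e-12
    norms = np.linalg.norm(Vk, axis=1) * np.linalg.norm(q_hat) + eps
    scores = (Vk @ q_hat) / norms

    order = sorted(range(len(sentences)), key=lambda j: -scores[j])
    return [(j, float(scores[j])) for j in order]


if __name__ == "__main__":
    example = [
        "market shares fell",
        "market shares rose sharply",
        "team won match",
        "team lost",
    ]
    for idx, score in lsi_rank_sentences(example, "market shares", k=2):
        print(idx, round(score, 3), example[idx])
```

In this toy example the two groups of sentences share no vocabulary, so the two retained latent dimensions separate them and the sentences about the market rank above those about the team. On real Novelty Track data the term weighting and the choice of `k` would need tuning.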

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Parapar, D., Barreiro, Á. (2006). Sentence Retrieval with LSI and Topic Identification. In: Lalmas, M., MacFarlane, A., Rüger, S., Tombros, A., Tsikrika, T., Yavlinsky, A. (eds) Advances in Information Retrieval. ECIR 2006. Lecture Notes in Computer Science, vol 3936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11735106_12

  • DOI: https://doi.org/10.1007/11735106_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33347-0

  • Online ISBN: 978-3-540-33348-7

  • eBook Packages: Computer Science, Computer Science (R0)