Abstract
This paper presents two sentence retrieval methods. We adopt the task definition used in the TREC Novelty Track: sentence retrieval consists of extracting the sentences relevant to a query from a set of documents relevant to that query. We compare the performance of the Latent Semantic Indexing (LSI) retrieval model with that of a topic identification method that is also based on Singular Value Decomposition (SVD) but uses a different sentence selection method. We used the TREC Novelty Track collections from 2002 and 2003 for the evaluation. The results of our experiments show that these techniques, particularly sentence retrieval based on topic identification, are valid alternatives to the more ad hoc methods devised for this task.
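As a rough illustration of the first of the two methods, the sketch below ranks sentences against a query in a reduced LSI space using a truncated SVD. It is a generic textbook formulation, not the configuration evaluated in the paper: the function name `lsi_rank_sentences`, the whitespace tokenization, the raw term-count weighting, and the choice of `k` are all assumptions made for the example.

```python
import numpy as np


def lsi_rank_sentences(sentences, query, k=2):
    """Rank sentences against a query in a k-dimensional LSI space (illustrative)."""
    # Assumption: lowercased whitespace tokens stand in for real preprocessing.
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-by-sentence matrix of raw counts (no tf-idf or other weighting).
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t], j] += 1.0

    # Truncated SVD: A is approximated by U_k * diag(s_k) * Vt_k.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, int((s > 1e-10).sum()))  # skip numerically zero singular values
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

    # Fold the query into the latent space: q_k = diag(s_k)^-1 * U_k^T * q.
    q = np.zeros(len(vocab))
    for t in query.lower().split():
        if t in index:
            q[index[t]] += 1.0
    q_k = (U_k.T @ q) / s_k

    # Each sentence is a row of V_k; score by cosine similarity to the query.
    sent_k = Vt_k.T
    denom = np.linalg.norm(sent_k, axis=1) * np.linalg.norm(q_k)
    scores = np.divide(sent_k @ q_k, denom, out=np.zeros_like(denom), where=denom > 0)

    order = np.argsort(-scores)
    return [(sentences[i], float(scores[i])) for i in order]
```

Calling `lsi_rank_sentences(sentences, "cats mice", k=2)` returns the sentences sorted by decreasing cosine similarity to the query. The topic identification method described in the abstract also starts from an SVD but selects sentences differently, and that selection step is not sketched here.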
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Parapar, D., Barreiro, Á. (2006). Sentence Retrieval with LSI and Topic Identification. In: Lalmas, M., MacFarlane, A., Rüger, S., Tombros, A., Tsikrika, T., Yavlinsky, A. (eds) Advances in Information Retrieval. ECIR 2006. Lecture Notes in Computer Science, vol 3936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11735106_12
DOI: https://doi.org/10.1007/11735106_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33347-0
Online ISBN: 978-3-540-33348-7
eBook Packages: Computer Science (R0)