Abstract
Latent Semantic Indexing (LSI) is a novel approach to information retrieval that attempts to model the underlying structure of term associations by transforming the traditional representation of documents as vectors of weighted term frequencies into a new coordinate space where both documents and terms are represented as linear combinations of underlying semantic factors. In previous research, LSI has produced a small improvement in retrieval performance. In this paper, we apply LSI to the routing task, which operates under the assumption that a sample of relevant and non-relevant documents is available to use in constructing the query. Once again, LSI slightly improves performance. However, when LSI is used in conjunction with statistical classification, there is a dramatic improvement in performance.
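The coordinate transformation described in the abstract can be illustrated with a minimal sketch, assuming LSI is computed as a truncated singular value decomposition of a term-document matrix. The toy matrix, the choice of k = 2 factors, the fold-in scaling by the inverse singular values, and the helper names `lsi_fit`, `fold_in`, and `cosine` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Truncated SVD of a (terms x documents) matrix, keeping k factors."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in(U_k, s_k, vec):
    """Map a term-space vector (query or document) into the k-factor space."""
    return (U_k.T @ vec) / s_k

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy collection. Rows are terms, columns are documents d0..d4.
# Terms: car, automobile, engine, flower, petal
A = np.array([
    [1, 0, 1, 0, 0],   # car
    [0, 1, 1, 0, 0],   # automobile
    [1, 1, 0, 0, 0],   # engine
    [0, 0, 0, 1, 1],   # flower
    [0, 0, 0, 1, 1],   # petal
], dtype=float)

U_k, s_k, Vt_k = lsi_fit(A, k=2)

# A one-term query, "car". Document d1 contains "automobile" and "engine"
# but not "car", so it shares no term with the query.
query = np.array([1, 0, 0, 0, 0], dtype=float)

q_lsi = fold_in(U_k, s_k, query)
d1_lsi = fold_in(U_k, s_k, A[:, 1])   # vehicle document without "car"
d3_lsi = fold_in(U_k, s_k, A[:, 3])   # flower document

raw_sim = cosine(query, A[:, 1])            # term space: no shared terms
lsi_sim = cosine(q_lsi, d1_lsi)             # factor space: related document
lsi_sim_unrelated = cosine(q_lsi, d3_lsi)   # factor space: unrelated document

print("term-space cosine(query, d1):", raw_sim)
print("LSI cosine(query, d1):", round(lsi_sim, 3))
print("LSI cosine(query, d3):", round(lsi_sim_unrelated, 3))
```

In this toy collection the query and d1 share no terms, so their term-space similarity is zero, while in the reduced space they should fall along the same factor because "car", "automobile", and "engine" co-occur across documents. This is the term-association effect the abstract refers to; the routing and statistical classification steps are not sketched here.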
Copyright information
© 1994 Springer-Verlag London Limited
About this paper
Cite this paper
Hull, D. (1994). Improving Text Retrieval for the Routing Problem using Latent Semantic Indexing. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_29
DOI: https://doi.org/10.1007/978-1-4471-2099-5_29
Publisher Name: Springer, London
Print ISBN: 978-3-540-19889-5
Online ISBN: 978-1-4471-2099-5