
Part of the book series: The IMA Volumes in Mathematics and its Applications (IMA, volume 107)

Abstract

With the electronic storage of documents comes the possibility of building search engines that can automatically choose documents relevant to a given set of topics. In information retrieval, we wish to match queries with relevant documents. Documents can be represented by the terms that appear within them, but literal matching of terms does not necessarily retrieve all relevant documents. There are a number of information retrieval systems based on inexact matches. Latent Semantic Indexing represents documents by approximations and tends to cluster documents on similar topics even if their term profiles are somewhat different. This approximate representation is usually accomplished using a low-rank singular value decomposition (SVD) approximation. In this paper, we use an alternate decomposition, the semi-discrete decomposition (SDD). For equal query times, the SDD does as well as the SVD and uses less than one-tenth the storage for the MEDLINE test set.
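As a rough illustration of the idea in the abstract, the sketch below scores documents against a query using a rank-k SVD approximation of a term-document matrix. It is not the authors' implementation and it uses the SVD, not the SDD. The toy vocabulary, the matrix `A`, the query, and the function name `lsi_scores` are all assumptions made up for this example.

```python
import numpy as np

def lsi_scores(A, q, k):
    """Cosine scores of query q against the columns (documents) of A,
    computed in the k-dimensional space of a truncated SVD.
    Hypothetical helper written for this illustration only."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]
    q_proj = Uk.T @ q             # query projected into the latent space
    docs = sk[:, None] * Vtk      # documents in the latent space (k x n)
    norms = np.linalg.norm(docs, axis=0) * np.linalg.norm(q_proj)
    return (q_proj @ docs) / np.where(norms == 0, 1.0, norms)

# Toy term-document matrix (invented data).
# Rows: car, automobile, engine, flower, petal. Columns: documents d0, d1, d2.
A = np.array([
    [1, 0, 0],   # car
    [0, 1, 0],   # automobile
    [1, 1, 0],   # engine
    [0, 0, 1],   # flower
    [0, 0, 1],   # petal
], dtype=float)

q = np.array([1, 0, 0, 0, 0], dtype=float)   # query containing only "car"

literal = q @ A                  # literal term matching: raw term overlap
scores = lsi_scores(A, q, k=2)   # rank-2 latent-space cosine scores

print("literal match:", literal)
print("rank-2 scores:", np.round(scores, 3))
```

In this toy case literal matching gives d1 a score of zero because it says "automobile" rather than "car", while the rank-2 representation places d0 and d1 together (both share "engine") and keeps d2 apart. The SDD discussed in the chapter replaces the dense factors with matrices whose entries are restricted to -1, 0, and 1, which is where the storage savings come from.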




Copyright information

© 1999 Springer Science+Business Media New York

About this chapter

Cite this chapter

Kolda, T.G., O’Leary, D.P. (1999). Latent Semantic Indexing Via a Semi-Discrete Matrix Decomposition. In: Cybenko, G., O’Leary, D.P., Rissanen, J. (eds) The Mathematics of Information Coding, Extraction and Distribution. The IMA Volumes in Mathematics and its Applications, vol 107. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1524-0_5


  • DOI: https://doi.org/10.1007/978-1-4612-1524-0_5

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-7178-9

  • Online ISBN: 978-1-4612-1524-0

  • eBook Packages: Springer Book Archive
