Semantic Space Representation and Latent Semantic Analysis
In this chapter, we introduce latent semantic analysis (LSA), which uses singular value decomposition (SVD) to reduce the dimensionality of the document-term representation. This method replaces the large, sparse document-term matrix with a lower-rank approximation built from a small number of latent dimensions that the analyst can interpret. Two important concepts in LSA, cosine similarity and queries, are explained. Finally, we discuss decision-making with LSA.
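The pipeline described above can be sketched in a few lines of NumPy: take the SVD of a term-document matrix, truncate it to k latent dimensions, fold a query into the same latent space, and rank documents by cosine similarity. The matrix, the query, and the choice k = 2 below are illustrative assumptions, not data from the chapter.

```python
import numpy as np

# Hypothetical 5-term x 4-document count matrix (rows: terms, columns: documents).
A = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 1, 0],
    [0, 0, 1, 2],
    [0, 1, 0, 1],
], dtype=float)

# Thin SVD: A = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep k latent dimensions for the rank-k approximation.
k = 2
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# Document coordinates in the latent space: columns of diag(sk) @ Vtk.
doc_coords = (sk[:, None] * Vtk).T          # shape (4 documents, k dimensions)

# Fold a query containing terms 0 and 2 into the same space: q_hat = q @ Uk @ inv(diag(sk)).
q = np.array([1, 0, 1, 0, 0], dtype=float)
q_coords = q @ Uk / sk                      # shape (k,)

def cosine(u, v):
    """Cosine similarity between two latent-space vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Rank the documents against the query.
sims = [cosine(q_coords, d) for d in doc_coords]
best = int(np.argmax(sims))
```

In a real application the counts would typically be weighted (e.g. tf-idf) before the SVD, and k would be chosen by inspecting the singular values rather than fixed in advance.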
Keywords: Latent semantic analysis (LSA) · Singular value decomposition (SVD) · Latent semantic indexing (LSI) · Cosine similarity · Queries
- For more about latent semantic analysis (LSA), see Landauer et al. (2007).