Abstract
This paper describes research, experiments, and theoretical considerations directed towards the automatic construction of a computational thesaurus, based on the identification of synonyms in large sets of texts, for the needs of question-answering (QA) systems. The method is founded on the Latent Semantic Analysis (LSA) technique. LSA serves as a hypothesis generator: it produces hypotheses about which words might be synonyms. The generated hypotheses are subsequently confirmed or rejected by examining the morphological bindings between the two words and the overall syntactic structure of the context in which they appear, namely the subject-object relation. The retrieved synonyms are used to extend the search space in which a QA system mines for answers.
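To make the hypothesis-generation step concrete, here is a minimal sketch of LSA-style candidate retrieval. It is an illustration only, not the authors' implementation: the toy corpus, the raw count weighting, the number of latent dimensions `k`, the similarity threshold, and the helper names `cosine` and `candidates` are all assumptions made for this example.

```python
import numpy as np

# Hypothetical toy corpus: "car" and "automobile" never share a document,
# but both co-occur with "engine".
docs = [
    "car engine",
    "automobile engine",
    "banana fruit",
    "apple fruit",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Term-document count matrix (rows: terms, columns: documents).
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1.0

# Truncated SVD: keep only the k strongest latent dimensions (k is assumed).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
term_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per term


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return float(a @ b / (na * nb))


def candidates(word, threshold=0.9):
    """Return (term, similarity) pairs whose latent-space cosine similarity
    to `word` reaches the (assumed) threshold, most similar first."""
    v = term_vecs[index[word]]
    scored = [(w, cosine(v, term_vecs[index[w]])) for w in vocab if w != word]
    return sorted((p for p in scored if p[1] >= threshold),
                  key=lambda p: (-p[1], p[0]))


print(candidates("car"))
```

In this toy corpus, both "automobile" and "engine" come out as candidates for "car", although "car" and "automobile" never occur in the same document. The spurious "engine" hypothesis shows why LSA alone only proposes candidates and why a later verification step, such as the morphological and syntactic checks described in the abstract, is needed.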
Copyright information
© 2013 Springer-Verlag GmbH Berlin Heidelberg
About this paper
Cite this paper
Ekštein, K., Krčmář, L. (2013). Automatic LSA-Based Retrieval of Synonyms (for Search Space Extension). In: Gaol, F. (eds) Recent Progress in Data Engineering and Internet Technology. Lecture Notes in Electrical Engineering, vol 156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28807-4_12
DOI: https://doi.org/10.1007/978-3-642-28807-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28806-7
Online ISBN: 978-3-642-28807-4
eBook Packages: Engineering, Engineering (R0)