Use of Dependency Microcontexts in Information Retrieval
This paper focuses on two problems that are crucial for retrieval performance in information retrieval (IR) systems: the lack of information caused by document pre-processing and the difficulty caused by homonymous and synonymous words in natural language. The author argues that the limitations of traditional IR methods, i.e. methods that deal with individual terms without considering the relations between them, can be overcome using natural language processing (NLP). To detect the relations among terms in sentences, and making use of lemmatisation and morphological and syntactic tagging of Czech texts, the author proposes a method for constructing dependency word microcontexts that are extracted from texts fully automatically, together with several ways of exploiting the microcontexts to increase retrieval performance.
Keywords: Information Retrieval, Natural Language Processing, Retrieval Performance, Ambiguous Word, Word Sense
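As a rough illustration of the idea summarised in the abstract, the sketch below treats a microcontext as the set of (governor lemma, dependent lemma) pairs taken from the dependency edges of a parsed sentence, and indexes documents by those pairs rather than by isolated terms. This is only an assumption about how such pairs might be represented, not the author's implementation: the `Token` type, the functions `extract_microcontexts` and `index_pairs`, and the English example sentences are all hypothetical, and the lemmas and head indices are assumed to come from an external lemmatiser and parser.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    """Hypothetical token record, assumed to be produced by a lemmatiser and parser."""
    lemma: str  # lemmatised word form
    head: int   # index of the governing token in the sentence, -1 for the root


def extract_microcontexts(sentence):
    """Return one (governor_lemma, dependent_lemma) pair per dependency edge."""
    pairs = []
    for token in sentence:
        if token.head >= 0:  # the root has no governor, so it yields no pair
            pairs.append((sentence[token.head].lemma, token.lemma))
    return pairs


def index_pairs(documents):
    """Map each lemma pair to the set of document ids in which it occurs."""
    index = defaultdict(set)
    for doc_id, sentences in documents.items():
        for sentence in sentences:
            for pair in extract_microcontexts(sentence):
                index[pair].add(doc_id)
    return index


# Two toy sentences in which the homonymous lemma "bank" has different governors.
river_sentence = [Token("river", 1), Token("bank", 2), Token("erode", -1)]
loan_sentence = [Token("bank", 1), Token("approve", -1), Token("loan", 1)]

index = index_pairs({"d1": [river_sentence], "d2": [loan_sentence]})
```

In this toy index the lemma "bank" appears in both documents, but the pairs it takes part in differ, which is the kind of relational information that a purely term-based index discards.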