Text Retrieval

  • Peter Schäuble
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 397)

Abstract

Information retrieval is based on the assumption that occurrences of indexing features in a document will tell us something about the relevance of this document. This assumption implies the cluster hypothesis: closely associated documents tend to be relevant to the same requests (van Rijsbergen 1979, p. 45). In this section, we will describe how texts can be characterized by the distribution of occurrences of textual indexing features. For consistency, we will use the more general terminology of multimedia information retrieval. In particular, we will use the notion of indexing features rather than indexing terms and feature frequency rather than term frequency, even though throughout this chapter about text retrieval, every indexing feature ϕ_i denotes a term and the feature frequency ff(ϕ_i, d_j) denotes the corresponding term frequency.
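
As a rough illustration of the terminology above, the following minimal Python sketch counts the feature frequency ff(ϕ_i, d_j) for each term in each document, together with the number of documents in which each feature occurs. The whitespace tokenizer, the function names, and the two sample documents are assumptions made for this example only; they are not taken from the chapter, which may define its indexing vocabulary differently.

```python
from collections import Counter


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase, split on whitespace."""
    return text.lower().split()


def feature_frequencies(documents):
    """Return one Counter per document: feature -> ff(feature, document)."""
    return [Counter(tokenize(doc)) for doc in documents]


def document_frequencies(ff_per_doc):
    """Return a Counter: feature -> number of documents containing it."""
    df = Counter()
    for ff in ff_per_doc:
        # Each document contributes at most 1 per distinct feature.
        df.update(ff.keys())
    return df


# Hypothetical two-document collection, for illustration only.
docs = ["text retrieval ranks text", "image retrieval"]
ff = feature_frequencies(docs)
df = document_frequencies(ff)
```

Here `ff[0]["text"]` is the feature frequency of the term "text" in the first document, and `df["retrieval"]` counts the documents that contain "retrieval". A real indexing pipeline would typically add steps such as stop word removal, which this sketch omits.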

Keywords

Retrieval Method, Document Frequency, Stop Word, Indexing Feature, Document Length
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

  1. As usual, iff means if and only if.

Copyright information

© Springer Science+Business Media New York 1997

Authors and Affiliations

  • Peter Schäuble (1)
  1. Swiss Federal Institute of Technology (ETH), Zurich, Switzerland