# Inverse Document Frequency

## Synonyms

IDF

## Definition

The inverse document frequency (*IDF*) is a statistical weight used for measuring the importance of a term in a text document collection. The document frequency (*DF*) of a term is the number of documents in the collection in which the term appears.

## Key Points

Karen Spärck Jones first proposed that, during retrieval, terms with low document frequency are more valuable than terms with high document frequency [2]. In other words, the underlying idea of *IDF* is that the more documents in the collection a term appears in, the less informative the term is.

In its simplest form, the *IDF* weight of a term is assigned as follows [3]:
$$ \mathrm{IDF}={ \log}_2\frac{\mathrm{N}}{\mathrm{DF}} $$
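
As a rough illustration of the formula, the following Python sketch computes the weight for a tiny toy collection. It assumes that N denotes the total number of documents in the collection and that each document is represented as a set of terms. The function name `inverse_document_frequency`, the sample documents, and the choice to return `None` for a term that appears in no document are all hypothetical and are not taken from the cited sources.

```python
import math


def inverse_document_frequency(term, documents):
    """Return log2(N / DF) for `term`, or None if no document contains it.

    Assumption: `documents` is a list of sets of terms, N is len(documents),
    and DF is the number of those sets that contain `term`.
    """
    n = len(documents)
    df = sum(1 for doc in documents if term in doc)
    if df == 0:
        # The formula is undefined when DF = 0; returning None is an
        # illustrative choice, not part of the definition.
        return None
    return math.log2(n / df)


# A toy collection of four documents (made-up data).
docs = [
    {"the", "cat", "sat"},
    {"the", "dog", "barked"},
    {"the", "cat", "purred"},
    {"the", "bird", "sang"},
]

idf_the = inverse_document_frequency("the", docs)  # DF = 4, log2(4/4) = 0.0
idf_cat = inverse_document_frequency("cat", docs)  # DF = 2, log2(4/2) = 1.0
idf_dog = inverse_document_frequency("dog", docs)  # DF = 1, log2(4/1) = 2.0
```

In this toy collection, "the" occurs in every document and receives a weight of zero, while "dog" occurs in only one document and receives the highest weight, which matches the intuition that rarer terms are more informative.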

