Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

TF*IDF

  • Ibrahim Abu El-Khair
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_956

Synonyms

Term frequency by inverse document frequency

Definition

A weighting function that combines the frequency of a term in a given document (TF) with its relative collection frequency, expressed as the inverse document frequency (IDF). This weighting function is calculated as follows [1]. Let N be the number of documents in the collection and d_j the number of documents that contain term j. Assuming that term j occurs in at least one document (d_j ≠ 0), the inverse document frequency (IDF) is
$$ {\log}_2\left(N/{d}_j\right)+1={\log}_2N-{\log}_2{d}_j+1 $$
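The IDF formula above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation. It assumes N is the number of documents and d_j the number of documents containing term j. The final weight is taken to be the raw term frequency multiplied by this IDF, which is the conventional reading of the name TF*IDF and an assumption here. The function names `idf` and `tf_idf_weights` are hypothetical.

```python
import math
from collections import Counter


def idf(num_docs, doc_freq):
    """Return log2(N / d_j) + 1 for a term found in doc_freq of num_docs documents."""
    # The formula is only defined when the term occurs in at least one document.
    if doc_freq <= 0 or doc_freq > num_docs:
        raise ValueError("doc_freq must satisfy 0 < doc_freq <= num_docs")
    return math.log2(num_docs / doc_freq) + 1


def tf_idf_weights(docs):
    """Weight every term of every document as tf * idf.

    docs is a list of token lists. Using the raw count as TF is an
    assumption; other TF variants (normalized, logarithmic) are common.
    """
    num_docs = len(docs)
    # Document frequency: count each term once per document it appears in.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in docs:
        term_freq = Counter(tokens)
        weights.append(
            {term: count * idf(num_docs, doc_freq[term])
             for term, count in term_freq.items()}
        )
    return weights
```

For example, `idf(8, 2)` evaluates to 3.0, since log2(8/2) = 2. A term that occurs in every document gets an IDF of 1 rather than 0, so its weight reduces to its term frequency.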

Recommended Reading

  1. Korfhage RR. Information storage and retrieval. New York: Wiley; 1997.
  2. Manning CD, Raghavan P, Schütze H. Introduction to information retrieval. Cambridge, UK: Cambridge University Press; 2008.
  3. Roelleke T. Information retrieval models: foundations & relationships. Morgan & Claypool Publishers; 2013.
  4. Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Inf Process Manag. 1988;24(4):513–23.
  5. Singhal A, Salton G, Mitra M, Buckley C. Document length normalization. Inf Process Manag. 1996;32(5):619–33.
  6. Sparck Jones K. A statistical interpretation of term specificity and its application in retrieval. J Doc. 1972;28(1):11–20.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Information Science Department, School of Social Sciences, Umm Al-Qura University, Mecca, Saudi Arabia

Section editors and affiliations

  • Edie Rasmussen
  1. Library, Archival & Information Studies, The University of British Columbia, Vancouver, Canada