Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Term Weighting

  • Ibrahim Abu El-Khair
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_943

Definition

Term weighting is the assignment of numerical values to terms to represent their importance in a document, with the aim of improving retrieval effectiveness [9]. It takes place during text indexing, when the value of each term to the document is assessed. Because not all terms in a document collection are equally important, weighting is the means by which a retrieval system determines how important a given term is in a particular document or query. It is a crucial component of any information retrieval system and has shown great potential for improving retrieval effectiveness [8].
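As a rough illustration of what "assigning numerical values to terms" can look like in practice, the sketch below computes one simple family of weights, term frequency multiplied by inverse document frequency. The definition above does not prescribe any particular scheme, so the formula (raw tf times log(N/df)), the whitespace tokenizer, the function name tf_idf_weights, and the three sample documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_weights(documents):
    """Return one {term: weight} dict per document.

    Assumed scheme (illustrative only): weight = tf * log(N / df), where
    tf is the raw count of the term in the document, N is the number of
    documents, and df is the number of documents containing the term.
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical toy collection.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]
weights = tf_idf_weights(docs)
```

In this toy collection "the" occurs in every document and so receives a weight of zero, while "mat", which occurs in only one document, is weighted higher than "cat", which occurs in two. This reflects the point that not all terms are of equal importance to a document.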

Historical Background

The use of word frequency dates back to G. K. Zipf and his well-known law [16] for word distribution. The law...

Recommended Reading

  1. Hiemstra D, de Vries A. Relating the new language models of information retrieval to the traditional retrieval models (No. TR-CTIT-00-09). Amsterdam: Centre for Telematics and Information Technology (CTIT), University of Twente; 2000.
  2. Korfhage RR. Information storage and retrieval. New York: Wiley; 1997.
  3. Lancaster FW. Indexing and abstracting in theory and practice. 2nd ed. Champaign: University of Illinois, Graduate School of Library and Information Science; 1998.
  4. Luhn HP. The automatic creation of literature abstracts. IBM J Res Dev. 1958;2(2):159–65.
  5. Ponte JM, Croft WB. A language modeling approach to information retrieval. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1998. p. 275–81.
  6. Robertson SE, Sparck-Jones K. Relevance weighting of search terms. J Am Soc Inf Sci. 1976;27(3):129–46.
  7. Roelleke T. Information retrieval models: foundations & relationships. San Rafael: Morgan & Claypool Publishers; 2013.
  8. Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Inf Process Manag. 1988;24(4):513–23.
  9. Salton G, McGill M. Introduction to modern information retrieval. New York: McGraw-Hill Book Company; 1983.
  10. Salton G, Yang CS, Yu CT. A theory of term importance in automatic text analysis. J Am Soc Inf Sci Technol. 1975;26(1):33–44.
  11. Singhal A. Modern information retrieval: a brief overview. Bull IEEE Comput Soc Tech Comm Data Eng. 2001;24(4):35–43.
  12. Singhal A, Salton G, Mitra M, Buckley C. Document length normalization. Inf Process Manag. 1996;32(5):619–33.
  13. Sparck Jones K. A statistical interpretation of term specificity and its application in retrieval. J Doc. 1972;28(1):11–20.
  14. Sparck Jones K, Walker S, Robertson SE. A probabilistic model of information retrieval: development and comparative experiments: part I. Inf Process Manag. 2000;36(6):779–808.
  15. Zhai CX. Statistical language models for information retrieval. Synth Lect Hum Lang Technol. 2008;1(1):1–141.
  16. Zipf GK. Human behavior and principle of least effort. Cambridge, MA: Addison Wesley; 1949.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Information Science Department, School of Social Sciences, Umm Al-Qura University, Mecca, Saudi Arabia

Section editors and affiliations

  • Edie Rasmussen (1)
  1. Library, Archival & Information Studies, The University of British Columbia, Vancouver, Canada