Definition
In language modeling, n-gram models are probabilistic models of text that condition each word on a limited amount of history, i.e., on a fixed number of preceding words. Here n refers to the number of words that participate in the dependence relation: the word being predicted together with the n − 1 words before it.
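As a sketch of what this definition amounts to, the probability of a word sequence can be approximated by conditioning each word only on the n − 1 words before it. The notation below (w_i for the i-th word, m for the sequence length) is an assumption introduced for illustration and does not come from the entry itself:

```latex
P(w_1, \dots, w_m) \;\approx\; \prod_{i=1}^{m} P\bigl(w_i \mid w_{i-n+1}, \dots, w_{i-1}\bigr)
```

For n = 1 each factor reduces to P(w_i), the unigram case with no dependence on history; for n = 2 each factor is P(w_i | w_{i-1}), the bigram case.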
Key Points
In automatic speech recognition, n-grams are important for modeling some of the structural usage of natural language: the model uses word dependencies to assign a higher probability to “how are you today” than to “are how today you,” although both phrases contain exactly the same words. In information retrieval, simple unigram language models (n-gram models with n = 1), i.e., models that do not use term dependencies, have given good retrieval quality in many studies. Bigram models (n-gram models with n = 2) would allow the system to model direct term dependencies and to treat the occurrence of “New York” differently from separate occurrences of “New” and “York,” possibly improving retrieval performance. The use of trigram models would...
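The word-order point above can be illustrated with a minimal sketch, assuming a toy three-sentence corpus, whitespace tokenization, a hypothetical `<s>` start marker, and add-alpha smoothing. None of these choices come from the entry; the function names `train`, `unigram_prob`, and `bigram_prob` are likewise made up for illustration.

```python
from collections import Counter


def train(sentences):
    """Count words, bigram contexts, and bigrams in whitespace-tokenized text."""
    words, contexts, bigrams = Counter(), Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        padded = ["<s>"] + tokens  # "<s>" is an assumed start-of-sentence marker
        words.update(tokens)
        contexts.update(padded[:-1])  # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))
    return words, contexts, bigrams


def unigram_prob(sentence, words, alpha=1.0):
    """Add-alpha smoothed unigram (n = 1) probability: word order is ignored."""
    total, vocab = sum(words.values()), len(words)
    p = 1.0
    for word in sentence.lower().split():
        p *= (words[word] + alpha) / (total + alpha * vocab)
    return p


def bigram_prob(sentence, words, contexts, bigrams, alpha=1.0):
    """Add-alpha smoothed bigram (n = 2) probability: each word depends on its predecessor."""
    vocab = len(words)
    padded = ["<s>"] + sentence.lower().split()
    p = 1.0
    for prev, word in zip(padded, padded[1:]):
        p *= (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * vocab)
    return p


# Toy corpus, invented for this example only.
corpus = ["how are you today", "how are you", "are you there today"]
words, contexts, bigrams = train(corpus)

ordered, shuffled = "how are you today", "are how today you"
print("unigram:", unigram_prob(ordered, words), unigram_prob(shuffled, words))
print("bigram: ", bigram_prob(ordered, words, contexts, bigrams),
      bigram_prob(shuffled, words, contexts, bigrams))
```

Under these assumptions the unigram model cannot distinguish the two phrases, because it multiplies the same per-word factors in a different order, while the bigram model scores the natural word order higher. Add-alpha smoothing is used only to keep unseen bigrams from zeroing out the product; it is not suggested by the entry.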
Recommended Reading
Metzler D, Croft WB. A Markov random field model for term dependencies. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2005. p. 472–9.
Miller DRH, Leek T, Schwartz RM. A hidden Markov model information retrieval system. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1999. p. 214–21.
Song F, Croft WB. A general language model for information retrieval. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1999. p. 4–9.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Hiemstra, D. (2018). N-Gram Models. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_935
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering