Stoplists

Encyclopedia of Database Systems

Synonyms

Negative dictionary; Stopwords

Definition

Stoplists are lists of words, commonly called stopwords, that are not indexed in an information retrieval system and/or are not available for use as query terms. A stoplist can be created by sorting the terms in a document collection by frequency of occurrence and designating some number of high-frequency terms as stopwords, or alternatively by using one of the published lists of stopwords. Stoplists may be generic or domain specific, and are necessarily language specific. When a stoplist is used for indexing, each word in a document is checked against the stoplist as the document is added to the system (for example, through dictionary lookup or hashing), and words that match are eliminated from further processing. In some systems, stopwords are indexed, but the stoplist is used to remove them from processing when they appear as query terms.
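
As an illustration only, the frequency-based construction and the hash-based lookup described above can be sketched in Python. The function names (`tokenize`, `build_stoplist`, `index_terms`, `filter_query`), the letters-only tokenizer, and the cutoff of two terms are assumptions made for this example; they are not taken from any particular retrieval system.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of ASCII letters.

    This tokenizer is a simplifying assumption; real systems use
    language-specific lexical analysis.
    """
    return re.findall(r"[a-z]+", text.lower())


def build_stoplist(documents, size):
    """Designate the `size` most frequent terms in the collection as stopwords."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    # Sort by descending frequency; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # A frozenset gives hash-based membership tests.
    return frozenset(term for term, _ in ranked[:size])


def index_terms(document, stoplist):
    """Return the terms of a document that survive stoplist filtering."""
    return [term for term in tokenize(document) if term not in stoplist]


def filter_query(query, stoplist):
    """Drop stopwords from a query, for systems that index stopwords
    but ignore them at query time."""
    return [term for term in tokenize(query) if term not in stoplist]


# Toy collection (hypothetical data, for illustration only).
documents = [
    "the cat sat on a mat",
    "a dog chased the cat",
    "the dog and a bird in the garden",
]

stoplist = build_stoplist(documents, size=2)
print(sorted(stoplist))                      # ['a', 'the']
print(index_terms(documents[0], stoplist))   # ['cat', 'sat', 'on', 'mat']
print(filter_query("The dog in a garden", stoplist))  # ['dog', 'in', 'garden']
```

In this toy collection only "the" and "a" are removed; a published stoplist would typically also contain words such as "on", "in", and "and". The right cutoff depends on the collection and the language, which is why the size is left as a parameter here.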

Key Points

Hans Peter Luhn, in pioneering work on automatic abstracting, put forward...

Recommended Reading

  1. Dialog online courses: glossary of search terms. Available at: http://training.dialog.com/onlinecourses/glossary/glossary_life.html.

  2. Flood BJ. Historical note: the start of a stop list at Biological Abstracts. J Am Soc Inf Sci. 1999;50(12):1066.

  3. Fox C. Lexical analysis and stoplists. In: Frakes WB, Baeza-Yates R, editors. Information retrieval: data structures and algorithms. Englewood Cliffs: Prentice-Hall; 1992. p. 102–30.

  4. Google Web Search Help Center. Search basics: use of common words. Available at: http://www.google.com/support/bin/answer.py?answer=981.

  5. Korfhage RR. Information storage and retrieval. Wiley: Wiley Computer Pub; 1997.

  6. Luhn HP. The automatic creation of literature abstracts. IBM J Res Dev. 1958;2(2):157–65.

  7. Luhn HP. Keyword-in-context index for technical literature. Am Doc. 1960;11(4):288–95.

  8. Manning CD, Raghavan P, Schütze H. Introduction to information retrieval. Cambridge: Cambridge University Press; 2008.

  9. Parkins PV. Approaches to vocabulary management in permuted-title indexing of Biological Abstracts. In: Proceedings of the 26th Annual Meeting on American Documentation Institute; 1963. p. 27–9.

  10. Witten IH, Moffat A, Bell TC. Managing gigabytes: compressing and indexing documents and images. 2nd ed. San Francisco: Morgan Kaufmann; 1999.

Author information

Corresponding author

Correspondence to Edie Rasmussen.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Rasmussen, E. (2018). Stoplists. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_955
