Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Lexical Analysis of Textual Data

  • Chris D. Paice
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_941


Synonyms

Lexical processing; Term processing


Lexical analysis refers to the association of meaning with explicitly specified textual strings, referred to here as lexical terms. These lexical terms are typically obtained from texts (whether natural or artificial) by a process called term extraction. The association of meaning with lexical terms involves a data structure known generically as a lexicon. The characteristic operation in using a lexicon is a lookup, where the input is a lexical term, and the output is a representation of one or more associated meanings. A lexicon consists of a collection of entries, each of which comprises an entry term and a meaning structure. Lookup entails finding any entries whose entry term matches the lexical term in question.
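The lexicon and its lookup operation can be illustrated with a minimal sketch. It is not taken from the entry itself: the names `Entry`, `Lexicon`, `add`, and `lookup` are hypothetical, meanings are represented as plain strings, and matching is assumed to ignore case and extra whitespace. A real lexicon may use a richer meaning structure and a different matching rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One lexicon entry: an entry term plus its meaning structure."""
    entry_term: str
    # Assumption: the meaning structure is simply a list of glosses.
    meanings: List[str] = field(default_factory=list)


class Lexicon:
    """A toy lexicon: a collection of entries indexed by normalized entry term."""

    def __init__(self) -> None:
        self._index: Dict[str, List[Entry]] = {}

    @staticmethod
    def _normalize(term: str) -> str:
        # Assumption: matching ignores case and surrounding or repeated whitespace.
        return " ".join(term.lower().split())

    def add(self, entry_term: str, *meanings: str) -> None:
        """Store a new entry under the normalized form of its entry term."""
        entry = Entry(entry_term, list(meanings))
        self._index.setdefault(self._normalize(entry_term), []).append(entry)

    def lookup(self, lexical_term: str) -> List[Entry]:
        """Return every entry whose entry term matches the lexical term."""
        return self._index.get(self._normalize(lexical_term), [])


lexicon = Lexicon()
lexicon.add("bank", "financial institution", "sloping land beside a river")
lexicon.add("data base", "organized collection of data")

# A single-word term and a word-group term are looked up the same way.
for term in ("Bank", "data  base", "morphology"):
    found = lexicon.lookup(term)
    print(term, "->", [entry.meanings for entry in found])
```

Here the input to `lookup` is a lexical term and the output is the list of matching entries, each carrying one or more associated meanings; a term with no matching entry yields an empty list.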

Here, the term lexical analysis is used to refer only to operations performed on complete words or word groups. Operations on the characters within words are the concern of morphology.

The use of text corpora for...



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Lancaster University, Lancaster, UK

Section editors and affiliations

  • Edie Rasmussen (1)
  1. Library, Archival & Inf. Studies, The Univ. of British Columbia, Vancouver, Canada