Synonyms
Lexical processing; Term processing
Definitions
Lexical analysis refers to the association of meaning with explicitly specified textual strings, referred to here as lexical terms. These lexical terms are typically obtained from texts (whether natural or artificial) by a process called term extraction. The association of meaning with lexical terms involves a data structure known generically as a lexicon. The characteristic operation in using a lexicon is a lookup, where the input is a lexical term, and the output is a representation of one or more associated meanings. A lexicon consists of a collection of entries, each of which comprises an entry term and a meaning structure. Lookup entails finding any entries whose entry term matches the lexical term in question.
Here, the term lexical analysis is used to refer only to operations performed on complete words or word groups. Operations on the characters within words are the concern of morphology.
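The lexicon and its lookup operation, as defined above, can be illustrated with a minimal Python sketch. Everything named here (`Entry`, `Lexicon`, `extract_terms`, the sample entries) is hypothetical and introduced only for illustration. The sketch also assumes that a "match" means equality after lowercasing and collapsing whitespace, and that a meaning structure is a plain gloss string; a real lexicon may use a looser matching policy and much richer meaning structures.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One lexicon entry: an entry term plus a meaning structure."""
    term: str
    meaning: str  # assumption: a plain gloss stands in for the meaning structure


class Lexicon:
    """A collection of entries indexed by normalized entry term."""

    def __init__(self, entries):
        self._index = {}
        for entry in entries:
            key = self._normalize(entry.term)
            self._index.setdefault(key, []).append(entry)

    @staticmethod
    def _normalize(term):
        # Assumed matching policy: ignore case and collapse runs of whitespace.
        return " ".join(term.lower().split())

    def lookup(self, lexical_term):
        """Return every entry whose entry term matches the lexical term."""
        return list(self._index.get(self._normalize(lexical_term), []))


def extract_terms(text):
    """Very simple term extraction: maximal runs of ASCII letters."""
    return re.findall(r"[A-Za-z]+", text)


lexicon = Lexicon([
    Entry("bank", "financial institution"),
    Entry("bank", "sloping land beside a river"),
    Entry("data base", "organized collection of data"),
])

for term in extract_terms("The bank, by the river."):
    meanings = [entry.meaning for entry in lexicon.lookup(term)]
    print(term, meanings)
```

In this sketch a single lookup can return several entries, which corresponds to a lexical term with more than one associated meaning, and a word group such as "data base" is stored and matched as one entry term.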
The use of text corpora for...
Recommended Reading
Chen H, Schatz B, Yim T, Fye D. Automatic thesaurus generation for an electronic community system. J Am Soc Inf Sci. 1995;46(3):175–93.
Chodorow M, Byrd R, Heidorn G. Extracting semantic hierarchies from a large on-line dictionary. In: Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics; 1985. p. 299–304.
Church K, Hanks P. Word association norms: mutual information and lexicography. Comput Linguist. 1990;16(1):22–9.
Clarke CLA, Cormack GV. On the use of regular expressions for searching text. ACM Trans Program Lang Syst. 1997;19(3):413–26.
Fellbaum C, editor. WordNet: an electronic lexical database. Cambridge, MA: MIT Press; 1998.
Frakes WB, Baeza-Yates R. Information retrieval: data structures & algorithms, Chapters III, VII, and IX. Englewood Cliffs: Prentice-Hall; 1992.
Grefenstette G. Explorations in automatic thesaurus discovery. Boston: Kluwer; 1994.
Grefenstette G. Tokenization. In: van Halteren H, editor. Syntactic wordclass tagging. The Netherlands: Kluwer; 1999. p. 117–33.
McEnery T, Wilson A. Corpus linguistics. Edinburgh: Edinburgh University Press; 2001.
Miller GA, Beckwith R, Fellbaum C, Gross D, Miller KJ. Introduction to WordNet: an on-line lexical database. Int J Lexicogr. 1990;3(4):235–44.
Mitkov R. The Oxford handbook of computational linguistics. Chapters III, XXI, XXIV, XXV, and XXXIII. Oxford: Oxford University Press; 2003.
Schneider JW, Borlund P. Introduction to bibliometrics for construction and maintenance of thesauri: methodical considerations. J Doc. 2004;60(5):524–49.
Voorhees EM. Using WordNet to disambiguate word senses for text retrieval. In: Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1993. p. 171–80.
Yokoi T. The EDR electronic dictionary. Commun ACM. 1995;38(11):42–4.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Paice, C.D. (2018). Lexical Analysis of Textual Data. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_941
DOI: https://doi.org/10.1007/978-1-4614-8265-9_941
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering