Automatic Indexing

Part of The Information Retrieval Series book series (INRE, volume 1)


Chapter 3 introduced the concept and objectives of indexing along with its history. This chapter focuses on the processes and algorithms used to perform indexing. The indexing process is a transformation of an item that extracts the semantics of the topics discussed in the item. The extracted information is used to create the processing tokens and the searchable data structure. The semantics of an item refer not only to the subjects discussed in it but also, in weighted systems, to the depth to which each subject is discussed. The index can be based on the full text of the item, on an automatically or manually generated subset of terms or phrases that represent the item, on a natural language representation of the item, or on an abstraction to the concepts in the item. The results of this process are stored in one of the data structures (typically an inverted data structure) described in Chapter 4. Distinctions, where appropriate, are made between what is logically kept in an index and what is physically stored.
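
The transformation from item to processing tokens to a searchable inverted structure can be illustrated with a minimal sketch. It assumes a simple regular-expression tokenizer and uses raw term frequency as the weight; neither choice comes from the chapter, and the names tokenize and build_inverted_index are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric processing tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(items):
    """Map each token to {item_id: term frequency of the token in that item}."""
    index = defaultdict(dict)
    for item_id, text in items.items():
        # Counter gives the number of occurrences of each token in this item,
        # which serves here as a stand-in for "depth of discussion".
        for token, count in Counter(tokenize(text)).items():
            index[token][item_id] = count
    return dict(index)


# Two made-up items, used only for illustration.
items = {
    "d1": "Indexing extracts terms from an item.",
    "d2": "Weighted indexing records how often terms occur; terms matter.",
}
index = build_inverted_index(items)
print(index["terms"])     # {'d1': 1, 'd2': 2}
print(index["indexing"])  # {'d1': 1, 'd2': 1}
```

In this sketch the logical index is the token-to-item mapping, while the nested dictionaries are just one possible physical representation; a real system would likely use a more compact posting-list layout and a more refined weighting scheme.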


Keywords: Natural Language Processing; Concept Class; Term Frequency; Inverse Document Frequency; Latent Semantic Indexing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Kluwer Academic Publishers 1997