Information Retrieval Tools for Literary Analysis

  • Abraham Bookstein
  • Shmuel T. Klein


The advent of the CD-ROM as a means of distributing massive bodies of textual data increases the importance of developing automatic techniques for textual analysis. To this end, we should be alert to existing techniques, perhaps developed for other purposes, that can be of value. We report here on observations, made while carrying out research on information storage and retrieval (IR), that promise to be helpful. Specifically, auxiliary information and data structures created incidentally in our IR investigations are rich in semantic content and can be useful in suggesting or confirming relations among concepts in text. Two examples are given: one based on a term weighting scheme for IR, the other on a tree structure for compressing bitmaps.


Keywords: Information Retrieval; Semantic Content; Retrieval Mechanism; Hebrew Word; Singleton Cluster





Copyright information

© Springer-Verlag/Wien 1990

Authors and Affiliations

  • Abraham Bookstein (1)
  • Shmuel T. Klein (1)
  1. Center for Information and Language Studies, University of Chicago, Chicago, USA
