Supporting full-text information retrieval with a persistent object store

  • Conference paper
Advances in Database Technology — EDBT '94 (EDBT 1994)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 779)

Abstract

The inverted file index common to many full-text information retrieval systems presents unusual and challenging data management requirements. These requirements are usually met with custom data management software. Rather than build this custom software, we would prefer to use an existing database management system. Attempts to do this with traditional (e.g., relational) database management systems have produced discouraging results. Instead, we have used a persistent object store, Mneme, to support the inverted file of a full-text information retrieval system, INQUERY. The result is an improvement in performance along with opportunities for INQUERY to take advantage of the standard data management services provided by Mneme. We describe our implementation, present performance results on a variety of document collections, and discuss the advantages of using a persistent object store to support information retrieval.
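
For readers unfamiliar with the data structure the abstract refers to, the following is a minimal sketch of an inverted file in Python. It is an illustration only, not the format used by INQUERY or Mneme: the function names, the whitespace tokenization, and the (document id, term frequency) posting layout are all assumptions made for the example. Each term maps to a list of postings, and these lists vary widely in length because a few terms occur in many documents while most occur in few.

```python
from collections import Counter, defaultdict


def build_inverted_file(docs):
    """Map each term to a postings list of (doc_id, term_frequency) pairs.

    `docs` is a dict of doc_id -> text. Tokenization is a plain lowercase
    whitespace split, an assumption made only for this sketch.
    """
    index = defaultdict(list)
    # Visit documents in id order so every postings list is sorted by doc_id.
    for doc_id in sorted(docs):
        counts = Counter(docs[doc_id].lower().split())
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    return dict(index)


def lookup(index, term):
    """Return the postings list for `term`, or an empty list if it is absent."""
    return index.get(term.lower(), [])


# Hypothetical three-document collection.
docs = {
    1: "persistent object store",
    2: "object store for text retrieval",
    3: "text retrieval with an inverted file",
}
index = build_inverted_file(docs)
```

Here `lookup(index, "object")` returns the postings for documents 1 and 2. In a system like the one described, each postings list would be stored and fetched as a record by the underlying data manager rather than held in an in-memory dictionary.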

This work is supported by the NSF Center for Intelligent Information Retrieval at the University of Massachusetts.

Editor information

Matthias Jarke, Janis Bubenko, Keith Jeffery

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brown, E.W., Callan, J.P., Croft, W.B., Moss, J.E.B. (1994). Supporting full-text information retrieval with a persistent object store. In: Jarke, M., Bubenko, J., Jeffery, K. (eds) Advances in Database Technology — EDBT '94. EDBT 1994. Lecture Notes in Computer Science, vol 779. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57818-8_64

  • DOI: https://doi.org/10.1007/3-540-57818-8_64

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-57818-5

  • Online ISBN: 978-3-540-48342-7

  • eBook Packages: Springer Book Archive
