The Effect of Database Size Distribution on Resource Selection Algorithms

  • Conference paper
Distributed Multimedia Information Retrieval (DIR 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2924))

Abstract

Resource selection is an important topic in distributed information retrieval research. It can be a component of a distributed information retrieval task and, together with resource representation, can also serve as an independent database recommendation application. There is a large body of valuable prior research on resource selection, but very little of it has studied the effects of different database size distributions on resource selection. In this paper, we propose extended versions of two well-known resource selection algorithms, CORI and KL divergence, that take database size distributions into account, and compare them with the recently proposed Relevant Document Distribution Estimation (ReDDE) resource selection algorithm. Experiments were conducted on four testbeds with different characteristics, and the ReDDE and extended KL divergence resource selection algorithms were shown to be more robust across these environments.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Si, L., Callan, J. (2004). The Effect of Database Size Distribution on Resource Selection Algorithms. In: Callan, J., Crestani, F., Sanderson, M. (eds) Distributed Multimedia Information Retrieval. DIR 2003. Lecture Notes in Computer Science, vol 2924. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24610-7_3

  • DOI: https://doi.org/10.1007/978-3-540-24610-7_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20875-4

  • Online ISBN: 978-3-540-24610-7

  • eBook Packages: Springer Book Archive
