Abstract
The effectiveness of metasearch data fusion procedures depends crucially on the properties of the distributions of common documents. Because we usually know neither how different search engines assign relevance scores nor how similar these assignments are, the documents common to the individual ranked lists are the only basis for combining search results. It is therefore important to study the properties of common-document distributions. One of these properties is the Overlap Property (OP) of documents retrieved by different search engines. According to the OP, the overlap between the relevant documents is usually greater than the overlap between the non-relevant ones. Although the OP has been repeatedly observed and discussed, no theoretical explanation of this empirical property has been elaborated. This paper presents a formal study of the properties of common-document distributions. In particular, a necessary and sufficient condition for the OP is derived, and it is proved that the OP should hold under practically arbitrary circumstances.
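As an illustration of what the Overlap Property compares, the following minimal Python sketch measures the overlap of relevant and of non-relevant documents between two result lists. It assumes a Dice-style overlap coefficient, 2·|A∩B| / (|A| + |B|), which is one common way to quantify overlap and is not necessarily the definition used in the paper. The function names, document identifiers, and relevance judgments are all hypothetical toy values invented for the example.

```python
def overlap(a, b):
    """Dice-style overlap of two collections: 2*|A & B| / (|A| + |B|).

    Assumed measure for illustration; returns 0.0 when both are empty.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def relevant_and_nonrelevant_overlap(run1, run2, relevant):
    """Split each result list by relevance and measure both overlaps."""
    relevant = set(relevant)
    rel1 = [d for d in run1 if d in relevant]
    rel2 = [d for d in run2 if d in relevant]
    non1 = [d for d in run1 if d not in relevant]
    non2 = [d for d in run2 if d not in relevant]
    return overlap(rel1, rel2), overlap(non1, non2)


# Hypothetical result lists from two search engines (toy data).
run1 = ["d1", "d2", "d3", "d4", "d5", "d6"]
run2 = ["d2", "d1", "d4", "d7", "d3", "d8"]
# Hypothetical relevance judgments for the query.
relevant = {"d1", "d2", "d3", "d5", "d10"}

r_overlap, n_overlap = relevant_and_nonrelevant_overlap(run1, run2, relevant)
print(f"relevant overlap:     {r_overlap:.3f}")
print(f"non-relevant overlap: {n_overlap:.3f}")
```

On this toy input the relevant overlap (6/7) exceeds the non-relevant overlap (2/5), which is the pattern the OP describes. The data were chosen to show that pattern and are not evidence for it.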
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Buzikashvili, N. (2002). Metasearch. Properties of Common Documents Distributions. In: Karagiannis, D., Reimer, U. (eds) Practical Aspects of Knowledge Management. PAKM 2002. Lecture Notes in Computer Science, vol 2569. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36277-0_21
DOI: https://doi.org/10.1007/3-540-36277-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00314-4
Online ISBN: 978-3-540-36277-7
eBook Packages: Springer Book Archive