A Ground-Truth Training Set for Hierarchical Clustering in Content-based Image Retrieval
Progress in Content-Based Image Retrieval (CBIR) is hampered by the absence of well-documented and validated test-sets that provide ground-truth for the performance evaluation of image indexing, retrieval and clustering tasks. For quick access to large digital image collections (tens of thousands or millions of images), a hierarchically structured indexing or browsing mechanism based on clusters of similar images at various coarse-to-fine levels is highly desirable. The Leiden 19th-Century Portrait Database (LCPD), which consists of over 16,000 scanned studio portraits (so-called Cartes de Visite, CdV), happens to have a clearly delineated set of clusters in the images of the studio logos on the backsides. Clusters of similar or semantically identical logos can also be formed at a number of levels that show a clear hierarchy. The Leiden Imaging and Multimedia Group is constructing a CD-ROM with a well-documented set of studio portraits and logos that can serve as ground-truth for feature performance evaluation in domains besides color indexing. The grey-level layout characteristics of its images are also described by various precalculated feature vector sets. For both portraits (near-copy pairs) and studio logos (clusters of identical logos), test-sets will be provided and described at various clustering levels. The statistically significant number of test-set images, embedded in a realistically large environment of narrow-domain images, is presented to the CBIR community to enable the selection of better indexing and retrieval approaches, as part of an internationally defined test-set that comprises test-sets specifically designed for the evaluation of color, texture and shape retrieval.
Keywords: Image Retrieval, Relevance Feedback, Cluster Member, Design Index, Goal Image