A Ground-Truth Training Set for Hierarchical Clustering in Content-based Image Retrieval

  • D. P. Huijsmans
  • N. Sebe
  • M. S. Lew
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1929)


Progress in Content-Based Image Retrieval (CBIR) is hampered by the absence of well-documented and validated test-sets that provide ground-truth for the performance evaluation of image indexing, retrieval and clustering tasks. For quick access to large digital image collections (tens of thousands or millions of images), a hierarchically structured indexing or browsing mechanism based on clusters of similar images at various coarse-to-fine levels is highly desirable. The Leiden 19th-Century Portrait Database (LCPD), which consists of over 16,000 scanned studio portraits (so-called Cartes de Visite, CdV), happens to have a clearly delineated set of clusters in the studio logo backside images. Clusters of similar or semantically identical logos can also be formed on a number of levels that show a clear hierarchy. The Leiden Imaging and Multimedia Group is constructing a CD-ROM with a well-documented set of studio portraits and logos that can serve as ground-truth for feature performance evaluation in domains besides color-indexing. Its grey-level image layout characteristics are also described by various precalculated feature vector sets. For both portraits (near-copy pairs) and studio logos (clusters of identical logos), test-sets will be provided and described at various clustering levels. The statistically significant number of test-set images, embedded in a realistically large environment of narrow-domain images, is presented to the CBIR community to enable the selection of better indexing and retrieval approaches, as part of an internationally defined test-set that comprises test-sets specifically designed for color, texture and shape retrieval evaluation.


Image Retrieval, Relevance Feedback, Cluster Member, Design Index, Goal Image
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • D. P. Huijsmans
    • 1
  • N. Sebe
    • 1
  • M. S. Lew
    • 1
  1. LIACS, Leiden University, The Netherlands
