Multimedia Tools and Applications, Volume 57, Issue 1, pp 5–27

Image similarity: from syntax to weak semantics

  • Jukka Perkiö
  • Antti Tuominen
  • Taneli Vähäkangas
  • Petri Myllymäki


Measuring image similarity is an important task for various multimedia applications. Similarity can be defined at two levels: the syntactic (lower, context-free) level and the semantic (higher, contextual) level. At the syntactic level, defining and measuring similarity is relatively straightforward, but at the semantic level the task becomes very difficult. We examine the use of simple, readily available syntactic image features combined with other multimodal features to derive a similarity measure that captures the weak semantics of an image. Weak semantics can be seen as an intermediate step between low-level image understanding and full semantic image understanding. We investigate the use of single modalities alone and examine how combining modalities affects the similarity measures. We also test the measure on a multimedia retrieval task using TV series data, although our main motivation is to understand how the different modalities relate to each other.
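
As a rough illustration of what combining modalities into one similarity measure can look like, the sketch below averages per-modality cosine similarities with fixed weights. This is not the method of the paper, which the abstract does not specify. The modality names ("visual", "text"), the feature vectors, the weights, and the function names are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fused_similarity(item_a, item_b, weights):
    """Weighted average of per-modality cosine similarities (assumed fusion rule)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * cosine(item_a[m], item_b[m]) for m, w in weights.items())
    return score / total


# Hypothetical feature vectors for two video frames, one vector per modality.
frame_a = {"visual": [0.2, 0.5, 0.3], "text": [1.0, 0.0, 2.0]}
frame_b = {"visual": [0.1, 0.6, 0.3], "text": [0.0, 1.0, 0.0]}
weights = {"visual": 0.5, "text": 0.5}

print(fused_similarity(frame_a, frame_b, weights))
```

Setting one weight to zero reduces the score to a single-modality similarity, which is one simple way to compare single modalities against their combination.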


Image similarity, Weak semantics, Image retrieval, Multimedia retrieval, Video retrieval



This work was supported in part by the IST Programme of the European Community under the PASCAL Network of Excellence and under the CLASS project, and by the Academy of Finland under projects VISCI and HPE, and by the Finnish Funding Agency for Technology and Innovation under the project MIFSAS.



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Jukka Perkiö (1), email author
  • Antti Tuominen (1)
  • Taneli Vähäkangas (1)
  • Petri Myllymäki (1)
  1. Helsinki Institute for Information Technology, Helsinki, Finland