Abstract
In this chapter we consider the problem of keyword focusing. The input data is a collection of images that are annotated with a given keyword, such as “car”, and the task is to attribute the annotation to specific parts of the images. Suitable input data for this data-mining problem is readily available: for instance, parts of the pictorial content of the World Wide Web could be considered together with the associated text. We propose an unsupervised approach to the problem. Our technique is based on automatic hierarchical segmentation of the images, followed by statistical correlation of the segments’ visual features, which are represented using multiple Self-Organising Maps. Feasibility experiments demonstrate the potential usefulness of the method. In most cases, the results of this data-driven approach agree with the manually defined ground truth for the keyword focusing task. In particular, the algorithm succeeds in selecting the appropriate level of hierarchy among the alternatives available in the segmentation results.
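The idea of correlating segment features with a keyword can be illustrated with a toy sketch. It is not the chapter's implementation: the fixed prototype vectors stand in for the units of a trained Self-Organising Map, the score (share of a unit among segments from keyword images minus its share among all segments) is one simple assumed choice of statistic, and all names and data below are hypothetical. Candidate segments from several levels of a segmentation hierarchy could be placed in the same per-image list.

```python
from collections import Counter

def nearest_unit(feature, prototypes):
    """Index of the prototype closest to the feature (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((f - p) ** 2 for f, p in zip(feature, prototypes[i])),
    )

def unit_scores(images, keyword, prototypes):
    """Score each unit by how over-represented it is among keyword-annotated images."""
    pos, total = Counter(), Counter()
    for img in images:
        for seg in img["segments"]:
            u = nearest_unit(seg["feature"], prototypes)
            total[u] += 1
            if keyword in img["keywords"]:
                pos[u] += 1
    n_pos, n_all = sum(pos.values()), sum(total.values())
    if n_pos == 0:
        raise ValueError("no image is annotated with the keyword")
    # Assumed statistic: share among keyword images minus share among all images.
    return {u: pos[u] / n_pos - total[u] / n_all for u in range(len(prototypes))}

def focus_keyword(image, scores, prototypes):
    """Name of the segment whose unit correlates most strongly with the keyword."""
    best = max(
        image["segments"],
        key=lambda s: scores[nearest_unit(s["feature"], prototypes)],
    )
    return best["name"]

# Hypothetical 2-D features and prototypes (stand-ins for SOM units).
prototypes = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
images = [
    {"keywords": {"car"}, "segments": [
        {"name": "sky", "feature": (0.1, 0.9)},
        {"name": "car", "feature": (0.9, 1.0)}]},
    {"keywords": {"car"}, "segments": [
        {"name": "road", "feature": (0.1, 0.1)},
        {"name": "car", "feature": (1.0, 0.9)}]},
    {"keywords": {"tree"}, "segments": [
        {"name": "sky", "feature": (0.0, 1.0)},
        {"name": "road", "feature": (0.0, 0.1)}]},
    {"keywords": {"tree"}, "segments": [
        {"name": "sky", "feature": (0.1, 1.0)},
        {"name": "grass", "feature": (0.2, 0.0)}]},
]

scores = unit_scores(images, "car", prototypes)
focused = [focus_keyword(img, scores, prototypes)
           for img in images if "car" in img["keywords"]]
print(focused)
```

In this toy data, sky and road segments occur in images with and without the keyword, so their units score below zero, while the unit that collects the car-like segments occurs only in images annotated with “car” and therefore receives the highest score.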
Copyright information
© 2007 Springer
About this chapter
Cite this chapter
Viitaniemi, V., Laaksonen, J. (2007). Focusing Keywords to Automatically Extracted Image Segments Using Self-Organising Maps. In: Nachtegael, M., Van der Weken, D., Kerre, E.E., Philips, W. (eds) Soft Computing in Image Processing. Studies in Fuzziness and Soft Computing, vol 210. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-38233-1_5
DOI: https://doi.org/10.1007/978-3-540-38233-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38232-4
Online ISBN: 978-3-540-38233-1
eBook Packages: Engineering, Engineering (R0)