Integrating Visual and Textual Cues for Image Classification
In this paper, we study computational models and techniques to merge textual and image features to classify images on the World Wide Web (WWW). A vector-based framework is used to index images on the basis of textual, pictorial and composite (textual-pictorial) information. The scheme makes use of weighted document terms and color invariant image features to obtain a high-dimensional image descriptor in vector form to be used as an index. Experiments are conducted on a representative set of more than 100,000 images downloaded from the WWW together with their associated text. Performance evaluations are reported on the accuracy of merging textual and pictorial information for image classification.
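As a rough illustration of the composite (textual-pictorial) descriptor described above, the following Python sketch concatenates a weighted term vector with a color feature vector. It is a minimal sketch under stated assumptions, not the method used in the paper: TF-IDF is assumed as the term weighting, a normalized rg-chromaticity histogram stands in for the color invariant features, and all function names and the `text_weight` parameter are hypothetical.

```python
import math
from collections import Counter


def tfidf_vector(doc_terms, vocabulary, doc_freq, n_docs):
    """Weight each vocabulary term by term frequency times inverse document frequency.

    Assumption: TF-IDF stands in for the paper's "weighted document terms".
    """
    counts = Counter(doc_terms)
    vec = []
    for term in vocabulary:
        df = doc_freq.get(term, 0)
        idf = math.log(n_docs / df) if df > 0 else 0.0
        vec.append(counts[term] * idf)
    return vec


def chromaticity_histogram(pixels, bins=4):
    """Histogram over normalized (r, g) chromaticity.

    Assumption: this is a simple stand-in for color invariant features.
    Dividing by r + g + b makes it insensitive to uniform intensity scaling.
    """
    hist = [0.0] * (bins * bins)
    for r, g, b in pixels:
        total = r + g + b
        if total == 0:
            continue  # black pixels carry no chromaticity information
        i = min(int(r / total * bins), bins - 1)
        j = min(int(g / total * bins), bins - 1)
        hist[i * bins + j] += 1.0
    return hist


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def composite_descriptor(text_vec, image_vec, text_weight=0.5):
    """Concatenate the normalized textual and pictorial parts into one vector.

    text_weight is a hypothetical knob balancing the two modalities.
    """
    text_part = [text_weight * x for x in l2_normalize(text_vec)]
    image_part = [(1.0 - text_weight) * x for x in l2_normalize(image_vec)]
    return text_part + image_part


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0
```

Normalizing each part before concatenation keeps either modality from dominating the index simply because its raw values are larger; descriptors built this way could then be compared with `cosine_similarity` or passed to any vector-based classifier.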
Keywords: Synthetic Image, Color Saturation, Color Ratio, Composite Information, Pictorial Information