Abstract
Image annotation aims to assign semantic concepts to images based on their visual content. It has received much attention recently as huge, dynamic collections of images and videos have become available on the Web. Most recent approaches employ supervised learning techniques, which require a large set of labeled training samples to learn effectively. Obtaining such a set is both tedious and time-consuming. This chapter explores the use of a bootstrapping framework to tackle this problem by employing three complementary strategies. First, we train two "view-independent" classifiers based on probabilistic SVM, using two orthogonal sets of content features, and incorporate the classifiers into a co-training framework to annotate regions. Second, at the image level, we employ two different segmentation methods to segment the image into different sets of possibly overlapping regions, and devise a contextual model to disambiguate the concepts learned from different regions. Third, we incorporate active learning to ensure that the framework scales to large image collections. Our experiments on a mid-sized image collection demonstrate that the combined bootstrapping and active learning framework is effective. Compared with the traditional supervised learning approach, it improves annotation accuracy by over 4% in F1 measure without active learning, and by over 18% when active learning is incorporated. Most importantly, the bootstrapping framework requires only a small set of training samples to kick-start the learning process, making it suitable for practical applications.
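To make the interplay of co-training and active learning more concrete, the following is a minimal sketch and not the chapter's implementation. It assumes each region is described by two feature views, uses a nearest-centroid scorer with a softmax over negative distances as a crude stand-in for the probabilistic SVM, and treats disagreement between the two views as the trigger for a manual label. The names `fit_centroids`, `predict_proba`, `co_train`, `oracle`, `rounds`, and `threshold` are all hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Return {label: mean feature vector} for a single feature view."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_proba(centroids, x):
    """Softmax over negative distances (assumed stand-in for a probabilistic SVM)."""
    scores = {y: math.exp(-math.dist(x, c)) for y, c in centroids.items()}
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}


def co_train(labeled, unlabeled, oracle, rounds=5, threshold=0.7):
    """Simplified co-training with one active-learning query per round.

    labeled:   list of ((view_a, view_b), label)
    unlabeled: list of (view_a, view_b)
    oracle:    callable returning the true label of a sample (the human annotator)
    """
    labeled, pool, queries = list(labeled), list(unlabeled), 0
    for _ in range(rounds):
        if not pool:
            break
        # Train one classifier per view on the current labeled set.
        models = [
            fit_centroids([s[v] for s, _ in labeled], [y for _, y in labeled])
            for v in (0, 1)
        ]
        undecided, disputed = [], []
        for sample in pool:
            votes = []
            for v in (0, 1):
                proba = predict_proba(models[v], sample[v])
                label = max(proba, key=proba.get)
                votes.append((label, proba[label]))
            (label_a, conf_a), (label_b, conf_b) = votes
            if label_a == label_b and max(conf_a, conf_b) >= threshold:
                labeled.append((sample, label_a))  # bootstrapped label
            elif label_a != label_b:
                disputed.append(sample)            # views contradict each other
            else:
                undecided.append(sample)           # agree, but not confidently
        # Active learning (assumed policy): ask the annotator about one
        # disputed sample per round, since those are the most informative.
        if disputed:
            query = disputed.pop(0)
            labeled.append((query, oracle(query)))
            queries += 1
        pool = undecided + disputed
    return labeled, pool, queries
```

Calling `co_train(seed_set, regions, ask_annotator)` returns the enlarged labeled set, the regions still unlabeled, and the number of manual labels requested. Limiting manual effort to samples on which the two views disagree is one plausible reading of how active learning keeps annotation cost low; the selection criterion used in the chapter itself may differ.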
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Chua, TS., Feng, H. (2005). A Scalable Bootstrapping Framework for Auto-Annotation of Large Image Collections. In: Tan, YP., Yap, K.H., Wang, L. (eds) Intelligent Multimedia Processing with Soft Computing. Studies in Fuzziness and Soft Computing, vol 168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32367-8_4
DOI: https://doi.org/10.1007/3-540-32367-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23053-3
Online ISBN: 978-3-540-32367-9
eBook Packages: Engineering, Engineering (R0)