Abstract
Relevant and irrelevant web images collected by tag-based image retrieval have been employed as loosely labeled training data for learning SVM classifiers for image categorization by only using the visual features. In this work, we propose a new image categorization method by incorporating the textual features extracted from the surrounding textual descriptions (tags, captions, categories, etc.) as privileged information and simultaneously coping with noise in the loose labels of training web images. When the training and test samples come from different datasets, our proposed method can be further extended to reduce the data distribution mismatch by adding a regularizer based on the Maximum Mean Discrepancy (MMD) criterion. Our comprehensive experiments on three benchmark datasets demonstrate the effectiveness of our proposed methods for image categorization and image retrieval by exploiting privileged information from web data.
Chapter PDF
Similar content being viewed by others
References
Andrews, S., Tsochantaridis, I., Hofmann, T.: Support vector machines for multiple-instance learning. In: NIPS (2003)
Baktashmotlagh, M., Harandi, M., Brian Lovell, M.S.: Unsupervised domain adaptation by domain invariant projection. In: ICCV (2013)
Bergamo, A., Torresani, L.: Exploiting weakly-labeled web images to improve object classification: a domain adaptation approach. In: NIPS (2010)
Bruzzone, L., Marconcini, M.: Domain adaptation problems: A DASVM classification technique and a circular validation strategy. T-PAMI 32(5), 770–787 (2010)
Bunescu, R.C., Mooney, R.J.: Multiple instance learning for sparse positive bags. In: ICML (2007)
Chen, X., Shrivastava, A., Gupta, A.: NEIL: Extracting visual knowledge from web data. In: ICCV (2013)
Chen, Y., Bi, J., Wang, J.Z.: MILES: Multiple-instance learning via embedded instance selection. T-PAMI 28(12), 1931–1947 (2006)
Chu, W.S., DelaTorre, F., Cohn, J.: Selective transfer machine for personalized facial action unit detection. In: CVPR (2013)
Chua, T.S., Tang, J., Hong, R., Li, H., Luo, Z., Zheng, Y.: NUS-WIDE: a real-world web image database from National University of Singapore. In: CIVR (2009)
Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T.: DeCAF: A deep convolutional activation feature for generic visual recognition. In: ICML (2014)
Duan, L., Li, W., Tsang, I.W., Xu, D.: Improving web image search by bag-based re-ranking. T-IP 20(11), 3280–3290 (2011)
Duan, L., Xu, D., Tsang, I.W.: Domain adaptation from multiple sources: A domain-dependent regularization approach. T-NNLS 23(3), 504–518 (2012)
Duan, L., Tsang, I.W., Xu, D.: Domain transfer multiple kernel learning. T-PAMI 34(3), 465–479 (2012)
Duan, L., Xu, D., Tsang, I.W., Luo, J.: Visual event recognition in videos by learning from web data. T-PAMI 34(9), 1667–1680 (2012)
Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: CVPR (2009)
Farquhar, J.D.R., Hardoon, D.R., Meng, H., Shawe-Taylor, J., Szedmak, S.: Two view learning: SVM-2K, theory and practice. In: NIPS (2005)
Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from Google’s image search. In: ICCV (2005)
Fernando, B., Habrard, A., Sebban, M., Tuytelaars, T.: Unsupervised visual domain adaptation using subspace alignment. In: ICCV (2013)
Ferrari, V., Zisserman, A.: Learning visual attributes. In: NIPS (2007)
Fouad, S., Tino, P., Raychaudhury, S., Schneider, P.: Incorporating privileged information through metric learning. T-NNLS 24(7), 1086–1098 (2013)
Gong, B., Shi, Y., Sha, F., Grauman, K.: Geodesic flow kernel for unsupervised domain adaptation. In: CVPR (2012)
Gopalan, R., Li, R., Chellappa, R.: Domain adaptation for object recognition: An unsupervised approach. In: ICCV (2011)
Gretton, A., KBorgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.: A kernel two-sample test. JMLR 13, 723–773 (2012)
Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Tech. rep., California Institute of Technology (2007)
Hardoon, D.R., Szedmak, S., Shawe-taylor, J.: Canonical correlation analysis: An overview with application to learning methods. Neural Computation 16(12), 2639–2664 (2004)
Huang, J., Smola, A., Gretton, A., Borgwardt, K., Scholkopf, B.: Correcting sample selection bias by unlabeled data. In: NIPS (2007)
Hwang, S.J., Grauman, K.: Learning the relative importance of objects from tagged images for retrieval and cross-modal search. IJCV 100(2), 134–153 (2012)
Krapac, J., Allan, M., Verbeek, J., Jurie, F.: Improving web image search results using query-relative classifier. In: CVPR (2010)
Kulis, B., Saenko, K., Darrell, T.: What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In: CVPR (2011)
Li, Q., Wu, J., Tu, Z.: Harvesting mid-level visual concepts from large-scale internet images. In: CVPR (2013)
Li, W., Duan, L., Tsang, I.W., Xu, D.: Batch mode adaptive multiple instance learning for computer vision tasks. In: CVPR, pp. 2368–2375 (2012)
Li, W., Duan, L., Tsang, I.W., Xu, D.: Co-labeling: A new multi-view learning approach for ambiguous problems. In: ICDM, pp. 419–428 (2012)
Li, W., Duan, L., Xu, D., Tsang, I.W.: Text-based image retrieval using progressive multi-instance learning. In: ICCV, pp. 2049–2055 (2011)
Li, W., Duan, L., Xu, D., Tsang, I.W.: Learning with augmented features for supervised and semi-supervised heterogeneous domain adaptation. T-PAMI 36(6), 1134–1148 (2014)
Liang, L., Cai, F., Cherkassky, V.: Predictive learning with structured (grouped) data. Neural Networks 22, 766–773 (2009)
Pan, S.J., Tsang, I.W., Kwok, J.T., Yang, Q.: Domain adaptation via transfer component analysis. T-NN 22(2), 199–210 (2011)
Schroff, F., Criminisi, A., Zisserman, A.: Harvesting image databases from the web. T-PAMI 33(4), 754–766 (2011)
Sharmanska, V., Quadrianto, N., Lampert, C.H.: Learning to rank using privileged information. In: ICCV (2013)
Torralba, A., Efros, A.A.: Unbiased look at dataset bias. In: CVPR (2011)
Torralba, A., Fergus, R., Freeman, W.T.: 80 million tiny images: A large data set for nonparametric object and scene recognition. T-PAMI 30(11), 1958–1970 (2008)
Torresani, L., Szummer, M., Fitzgibbon, A.: Efficient object category recognition using classemes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 776–789. Springer, Heidelberg (2010)
Vapnik, V., Vashist, A.: A new learning paradigm: Learning using privileged infromatin. Neural Networks 22, 544–557 (2009)
Vijayanarasimhan, S., Grauman, K.: Keywords to visual categories: Multiple-instance learning for weakly supervised object categorization. In: CVPR (2008)
Zhou, Z., Zhang, M.: Multi-instance multi-label learning with application to scene classification. In: NIPS (2006)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, W., Niu, L., Xu, D. (2014). Exploiting Privileged Information from Web Data for Image Categorization. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8693. Springer, Cham. https://doi.org/10.1007/978-3-319-10602-1_29
Download citation
DOI: https://doi.org/10.1007/978-3-319-10602-1_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10601-4
Online ISBN: 978-3-319-10602-1
eBook Packages: Computer ScienceComputer Science (R0)