Which Looks Like Which: Exploring Inter-class Relationships in Fine-Grained Visual Categorization
Abstract
Fine-grained visual categorization aims at classifying visual data at a subordinate level, e.g., identifying different species of birds. It is a highly challenging topic that has recently received significant research attention. Most existing work has focused on designing more discriminative feature representations to capture the subtle visual differences among categories. Far less effort has been spent on designing robust model learning algorithms. In this paper, we treat the training of each category classifier as a single learning task, and formulate a generic multiple task learning (MTL) framework to train multiple classifiers simultaneously. Unlike existing MTL methods, the proposed generic MTL algorithm imposes no structural assumptions and is thus more flexible in handling complex inter-class relationships. In particular, it is able to automatically discover both clusters of similar categories and outliers. We show that the objective of our generic MTL formulation can be solved using an iterative reweighted ℓ2 method. Through extensive experimental validation, we demonstrate that our method outperforms several state-of-the-art approaches.
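The abstract does not state the MTL objective itself, so the following is only a minimal sketch of the general iterative reweighted ℓ2 idea, applied to a stand-in problem: multi-task least squares with a row-wise ℓ2,1 penalty. The objective, the function names `reweighted_l2_multitask` and `objective`, and the parameters `lam`, `n_iter`, and `eps` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def objective(X, Y, W, lam):
    """Stand-in objective: squared loss plus a row-wise l2,1 penalty on W."""
    return np.sum((X @ W - Y) ** 2) + lam * np.sum(np.linalg.norm(W, axis=1))

def reweighted_l2_multitask(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Iteratively reweighted l2 solver for the stand-in objective above.

    X: (n_samples, n_features) shared inputs.
    Y: (n_samples, n_tasks) one target column per task (e.g. per classifier).
    Each iteration replaces the non-smooth penalty by a weighted squared
    l2 penalty, which leaves a linear system with a closed-form solution.
    """
    n_features = X.shape[1]
    XtX = X.T @ X
    XtY = X.T @ Y
    # Start from the plain ridge solution (all weights equal).
    W = np.linalg.solve(XtX + lam * np.eye(n_features), XtY)
    for _ in range(n_iter):
        # Reweighting step: rows of W with a small norm get a large weight,
        # which pushes them further towards zero in the next solve.
        row_norms = np.sqrt(np.sum(W ** 2, axis=1) + eps)
        D = np.diag(1.0 / (2.0 * row_norms))
        # Weighted ridge step: solve (X^T X + lam * D) W = X^T Y.
        W = np.linalg.solve(XtX + lam * D, XtY)
    return W
```

In this sketch the columns of `W` play the role of per-category models trained jointly; the penalty used here encourages them to share features, which is a simpler structure than the cluster and outlier discovery the abstract describes.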
Keywords
Fine-grained visual categorization, inter-class relationship, multiple task learning