In Defense of Active Part Selection for Fine-Grained Classification
Fine-grained classification is a recognition task in which only subtle differences distinguish the classes. Part-based classification methods are the most common approach to this problem. These methods learn to detect parts of the observed object and extract local features from the detected part regions. In this paper we show that not all extracted part features are always useful for classification. Furthermore, we estimate an upper bound on the fine-grained recognition performance that could be reached given a part selection algorithm that actively selects parts for classification. This upper bound lies well above current state-of-the-art recognition performance, which shows the need for such an active part selection method. Although we do not present an active part selection algorithm in this work, we propose a novel method that active part selection requires and that enables sequential part-based classification. The method uses a support vector machine (SVM) ensemble and can classify an image based on an arbitrary number of part features. Additionally, its training time does not increase with the number of possible part features. The SVM ensemble can therefore be extended with an active part selection component that operates on a large number of part feature proposals without suffering from increased training time.
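The idea of an upper bound under ideal part selection can be illustrated with a small sketch. The abstract does not describe how the bound is estimated, so everything here is an assumption made for illustration: each part is assumed to yield a vector of per-class scores, scores of the selected parts are fused by summation, and an "oracle" counts an image as correctly classified if at least one non-empty subset of its parts gives the right label. The function names and the toy scores are hypothetical.

```python
from itertools import combinations


def predict(part_scores, subset):
    """Sum per-class scores over the chosen parts and return the argmax class."""
    n_classes = len(part_scores[0])
    total = [sum(part_scores[p][c] for p in subset) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: total[c])


def oracle_correct(part_scores, label):
    """True if at least one non-empty subset of parts yields the right label."""
    parts = range(len(part_scores))
    return any(
        predict(part_scores, subset) == label
        for k in range(1, len(part_scores) + 1)
        for subset in combinations(parts, k)
    )


def accuracies(dataset):
    """Return (all-parts accuracy, oracle upper bound) for (part_scores, label) pairs."""
    n = len(dataset)
    all_parts = sum(predict(s, range(len(s))) == y for s, y in dataset)
    oracle = sum(oracle_correct(s, y) for s, y in dataset)
    return all_parts / n, oracle / n


# Toy data (hypothetical scores): 3 images, 3 parts each, 2 classes.
toy = [
    ([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]], 0),  # every part supports the true class
    ([[0.2, 0.8], [0.9, 0.1], [0.9, 0.1]], 1),  # only part 0 supports the true class
    ([[0.6, 0.4], [0.7, 0.3], [0.9, 0.1]], 1),  # no part supports the true class
]
baseline, upper = accuracies(toy)
print(f"all parts: {baseline:.3f}, oracle selection: {upper:.3f}")
```

On the second toy image, using all parts gives the wrong label while selecting only part 0 gives the right one, which is the kind of case in which extra part features hurt. The exhaustive subset search grows exponentially with the number of parts, so this sketch is only practical for a handful of parts per image.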
Keywords: fine-grained recognition, SVM ensemble, bagging