Weakly Supervised Object Localization with Stable Segmentations
Abstract
Multiple Instance Learning (MIL) provides a framework for training a discriminative classifier from data with ambiguous labels. This framework is well suited to learning object classifiers from weakly labeled image data, where only the presence of an object in an image is known, not its location. Recent work has explored the application of MIL algorithms to image categorization and natural scene classification. In this paper we extend these ideas into a framework that uses MIL to both recognize and localize objects in images. To achieve this we employ state-of-the-art image descriptors and multiple stable segmentations. These components, combined with a powerful MIL algorithm, form our object recognition system, called MILSS. We show highly competitive object categorization results on the Caltech dataset. To evaluate our algorithm further, we introduce the challenging Landmarks-18 dataset, a collection of photographs of famous landmarks from around the world. Results on this new dataset show the potential of the proposed algorithm.
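To make the weak-label setting concrete, the following is a minimal sketch of the general MIL idea, not the MILSS implementation. It assumes each image is a bag of segment feature vectors, scores each segment with a placeholder logistic model, and combines segment scores with a noisy-OR rule (one common MIL choice; the abstract does not say which rule MILSS uses). All function names, weights, and feature values are hypothetical.

```python
import math

def instance_probability(features, weights, bias):
    """Probability that one segment (instance) shows the object.
    The linear-logistic model is a stand-in for any instance classifier."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def bag_probability(instance_probs):
    """Noisy-OR combination (an assumption): the image (bag) is positive
    if at least one of its segments is positive."""
    prob_all_negative = 1.0
    for p in instance_probs:
        prob_all_negative *= (1.0 - p)
    return 1.0 - prob_all_negative

def classify_and_localize(bag, weights, bias):
    """Return the bag-level probability and the index of the top-scoring
    segment, which serves as the localization. Assumes a non-empty bag."""
    probs = [instance_probability(f, weights, bias) for f in bag]
    best_segment = max(range(len(probs)), key=probs.__getitem__)
    return bag_probability(probs), best_segment

# Toy data: 2-D features per segment; the values are made up for illustration.
weights = [2.0, -1.0]
bias = -1.0
image_with_object = [[0.1, 0.9], [2.0, 0.0], [0.2, 0.5]]
image_without_object = [[0.1, 0.9], [0.0, 1.0]]

p_pos, seg_pos = classify_and_localize(image_with_object, weights, bias)
p_neg, seg_neg = classify_and_localize(image_without_object, weights, bias)
```

In this sketch only the image-level label would be needed for training, while the highest-scoring segment gives a location estimate at test time.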
Keywords
Image Categorization, Object Categorization, Salient Region, Average Categorization Accuracy, Multiple Instance Learning