Abstract
Weakly-supervised image semantic segmentation aims to segment images into semantically consistent regions when only image-level labels are available, and is of great significance for fine-grained image analysis, retrieval and other applications. In this paper, we propose a Boosted Multi-Instance Multi-Label (BMIML) learning method to address this problem. The approach is built upon the following principles. We formulate the image semantic segmentation task as a MIML problem under the boosting framework, where the goal is to simultaneously split the superpixels obtained from over-segmented images into groups and train one classifier for each group. A loss function that uses the image-level labels as weakly-supervised constraints is employed to assign suitable semantic labels to these classifiers. At the same time, a contextual loss term is incorporated to reduce the ambiguities existing in the training data. In each boosting round, we introduce an “objectness” measure to jointly reweigh the instances, in order to overcome the disturbance from highly frequent background superpixels. We demonstrate that BMIML outperforms state-of-the-art methods for weakly-supervised semantic segmentation on two widely used datasets, i.e., MSRC and LabelMe.
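To make the weak-supervision idea concrete, the following is a minimal illustrative sketch and not the BMIML formulation itself. It assumes a noisy-OR aggregation, a common choice in multiple-instance boosting, to turn per-superpixel class probabilities into an image-level probability, and it assumes that the objectness reweighting is a simple multiplicative rescaling of the boosting weights. The function names `bag_probability`, `weak_label_loss` and `reweigh`, and all numeric values, are hypothetical.

```python
import math

def bag_probability(instance_probs):
    """Noisy-OR (assumed): the image carries a label if at least one superpixel does."""
    prod = 1.0
    for p in instance_probs:
        prod *= (1.0 - p)
    return 1.0 - prod

def weak_label_loss(instance_probs, image_has_label):
    """Negative log-likelihood of one image-level label for one class."""
    eps = 1e-12  # guards against log(0)
    p_bag = min(max(bag_probability(instance_probs), eps), 1.0 - eps)
    return -math.log(p_bag) if image_has_label else -math.log(1.0 - p_bag)

def reweigh(weights, objectness):
    """Scale per-superpixel weights by objectness (assumed rule), then renormalize."""
    scaled = [w * o for w, o in zip(weights, objectness)]
    total = sum(scaled)
    return [s / total for s in scaled]

# Toy image with three superpixels (made-up values).
probs = [0.1, 0.8, 0.2]        # per-superpixel probability of one class
objectness = [0.2, 0.9, 0.3]   # higher means more object-like
weights = reweigh([1 / 3, 1 / 3, 1 / 3], objectness)

print(bag_probability(probs))
print(weak_label_loss(probs, True), weak_label_loss(probs, False))
print(weights)
```

In this toy setting the loss is small when the image-level label agrees with at least one confident superpixel, and the reweighting shifts boosting weight away from low-objectness superpixels, which is the role the abstract attributes to the “objectness” measure with respect to background regions.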
Acknowledgments
This work was supported by the 973 Program (2010CB327905), the National Natural Science Foundation of China (61272329, 61070104, 61202325) and the Open Projects Program of the National Laboratory of Pattern Recognition.
Cite this article
Liu, Y., Li, Z., Liu, J. et al. Boosted MIML method for weakly-supervised image semantic segmentation. Multimed Tools Appl 74, 543–559 (2015). https://doi.org/10.1007/s11042-014-1967-5
DOI: https://doi.org/10.1007/s11042-014-1967-5