Abstract
In this paper, we present an object proposal generation method that applies energy optimization to superpixel merging in a multiscale framework in order to generate candidate object locations in an image. Because images in object detection datasets are highly diverse, we adopt two different energy functions at multiple scales. Our method thus combines the strength of global search, which locates salient objects well by considering the whole image at each merge iteration, with the strength of local search, which is more likely to recall non-salient instances. Moreover, unlike most superpixel merging algorithms, which rely on diversified segmentation results, our approach takes advantage of robust edge detection and segments each image only once, which greatly reduces the number of proposals. Experiments on the PASCAL VOC 2007 test set show that the proposed method outperforms most previous superpixel-merging-based methods and is competitive with state-of-the-art proposal generators.
Notes
Intersection-over-Union (IoU) measures the overlap between a candidate box and the ground-truth box as the ratio of the area of their intersection to the area of their union.
In practice we set 𝜖_e = 0.05.
In practice we set 𝜖_s = 0.1.
Here we use the Fast version of [41], which performs better than the Quality version while producing fewer proposals.
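The Intersection-over-Union note above can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) in continuous coordinates; the function name `iou` and the `Box` alias are hypothetical and not taken from the paper, and benchmark toolkits may use a slightly different pixel-inclusive convention.

```python
from typing import Tuple

# Assumed box layout (not specified in the paper): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return intersection area divided by union area for two boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    candidate = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(iou(candidate, ground_truth))
```

In this example the two boxes share a 1 × 1 region and each has area 4, so the ratio is 1 / (4 + 4 − 1) = 1/7.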
References
Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Susstrunk S (2012) SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Anal Mach Intell 34(11):2274–2282
Alexe B, Deselaers T, Ferrari V (2010) What is an object? IEEE Conference on Computer Vision and Pattern Recognition (CVPR):73–80
Alexe B, Deselaers T, Ferrari V (2012) Measuring the objectness of image windows. IEEE Trans Pattern Anal Mach Intell 34(11):2189–2202
Arbelaez P, Pont-Tuset J, Barron J, Marques F, Malik J (2014) Multiscale combinatorial grouping. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):328–335
Branson S, Beijbom O, Belongie S (2013) Efficient large-scale structured learning. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):1806–1813
Bruce N, Tsotsos J (2006) Saliency based on information maximization. Advances in Neural Information Processing Systems (NIPS):155–162
Carreira J, Sminchisescu C (2012) CPMC: Automatic object segmentation using constrained parametric min-cuts. IEEE Trans Pattern Anal Mach Intell 34(7):1312–1328
Cheng MM, Mitra NJ, Huang X, Torr PH, Hu SM (2015) Global contrast based salient region detection. IEEE Trans Pattern Anal Mach Intell 37(3):569–582
Cheng MM, Zhang Z, Lin WY, Torr PHS (2014) BING: Binarized Normed gradients for objectness estimation at 300fps. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3286–3293
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):886–893
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L Imagenet large scale visual recognition competition 2012 (ilsvrc2012). http://www.image-net.org/challenges/LSVRC/2012/
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):248–255
Dollár P, Zitnick CL (2013) Structured forests for fast edge detection. IEEE International Conference on Computer Vision (ICCV):1841–1848
Endres I, Hoiem D (2010) Category independent object proposals. pp 575–588
Endres I, Hoiem D (2014) Category-independent object proposals with diverse ranking. IEEE Trans Pattern Anal Mach Intell 36(2):222–234
Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell 32(9):1627–1645
Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comput Vis 59(2):167–181
Fidler S, Mottaghi R, Yuille A, Urtasun R (2013) Bottom-up segmentation for top-down detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3294–3301
Girshick R, Donahue J, Darrell T, Malik J (2016) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158
Gonzalez-Garcia A, Vezhnevets A, Ferrari V (2015) An active search strategy for efficient object class detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3022–3031
Han J, He S, Qian X, Wang D, Guo L, Liu T (2013) An object-oriented visual saliency detection framework based on sparse coding representations. IEEE Trans Circ Syst Video Technol 23(12):2009–2021
Han J, Zhang D, Hu X, Guo L, Ren J, Wu F (2015) Background prior-based salient object detection via deep reconstruction residual. IEEE Trans Circ Syst Video Technol 25(8):1309–1321
Han J, Zhang D, Wen S, Guo L, Liu T, Li X (2016) Two-stage learning to predict human eye fixations via SDAEs. IEEE Trans Cybern 46(2):487–498
Hare S, Golodetz S, Saffari A, Vineet V, Cheng MM, Hicks S, Torr P (2016) Struck: Structured output tracking with kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence
Hariharan B, Arbeláez P, Girshick R, Malik J (2014) Simultaneous detection and segmentation. pp 297–312
Hariharan B, Malik J, Ramanan D (2012) Discriminative decorrelation for clustering and classification. European Conference on Computer Vision (ECCV):459–472
Hosang J, Benenson R, Dollár P, Schiele B (2016) What makes for effective detection proposals? IEEE Trans Pattern Anal Mach Intell 38(4):814–830
Hosang J, Benenson R, Schiele B (2014) How good are detection proposals, really? British Machine Vision Conference (BMVC)
Humayun A, Li F, Rehg JM (2014) RIGOR: Reusing inference in graph cuts for generating object regions. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):336–343
Itti L, Koch C, Niebur E, et al. (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259
Karianakis N, Fuchs TJ, Soatto S (2015) Boosting convolutional features for robust object proposals. arXiv preprint arXiv:1503.06350
Krähenbühl P, Koltun V (2014) Geodesic object proposals. pp 725–739
Li N, Ye J, Ji Y, Ling H, Yu J (2014) Saliency detection on light field. pp 2806–2813
Li X, Lu H, Zhang L, Ruan X, Yang MH (2013) Saliency detection via dense and sparse reconstruction. pp 2976–2983
Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: Common objects in context. pp 740–755
Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-svms for object detection and beyond. IEEE International Conference on Computer Vision (ICCV):89–96
Manen S, Guillaumin M, Gool LV (2013) Prime object proposals with randomized prim’s algorithm. IEEE International Conference on Computer Vision (ICCV):2536–2543
Rantalankila P, Kannala J, Rahtu E (2014) Generating object segmentation proposals using global and local search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):2417–2424
Van de Sande KE, Uijlings JR, Gevers T, Smeulders AW (2011) Segmentation as selective search for object recognition. IEEE International Conference on Computer Vision (ICCV):1879–1886
Uijlings JR, Van de Sande KE, Gevers T, Smeulders AW (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171
Valenti R, Sebe N, Gevers T (2009) Image saliency by isocentric curvedness and color. IEEE International Conference on Computer Vision (ICCV):2185–2192
Wang L, Lu H, Ruan X, Yang MH (2015) Deep networks for saliency detection via local estimation and global search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3183–3192
Wei Y, Wen F, Zhu W, Sun J (2012) Geodesic saliency using background priors. pp 29–42
Yang C, Zhang L, Lu H, Ruan X, Yang MH (2013) Saliency detection via graph-based manifold ranking. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3166–3173
Zhang Z, Warrell J, Torr PH (2011) Proposal generation for object detection using cascaded ranking svms. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):1497–1504
Zhu W, Liang S, Wei Y, Sun J (2014) Saliency optimization from robust background detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):2814–2821
Zitnick CL, Dollár P (2014) Edge boxes: Locating object proposals from edges. pp 391–405
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61301238, 61201424), the China Scholarship Council (No. 201506205024), and the Natural Science Foundation of Tianjin, China (No. 14ZCDZGX00831).
Cite this article
Wang, C., Yang, J., Wang, K. et al. Multi-scale energy optimization for object proposal generation. Multimed Tools Appl 76, 10481–10499 (2017). https://doi.org/10.1007/s11042-016-3616-7
DOI: https://doi.org/10.1007/s11042-016-3616-7