Multimedia Tools and Applications, Volume 78, Issue 10, pp 13111–13130

Insights of object proposal evaluation

  • Yuantian Wang
  • Lei Huang
  • Tongwei Ren
  • Sheng-Hua Zhong
  • Han Gu
  • Yan Liu


Object proposal aims to locate category-independent objects in a given image using a limited number of object candidates indicated by bounding boxes, and it can serve as a foundation for various multimedia applications. Current evaluation criteria based on recall cannot reveal the real abilities of different object proposal methods in objectness measurement. In this paper, we propose a novel object proposal evaluation criterion to replace recall, named objectness measurement ability (OMA). We first analyze the probability of hitting an object by non-repetitive random sampling (HPRS) and provide an algorithm for calculating HPRS efficiently. Based on HPRS, we define OMA and extend three commonly used object proposal evaluation criteria by replacing recall with OMA. We evaluate six typical object proposal methods using recall-based criteria and OMA-based criteria on the test data of PASCAL VOC 2007 and PASCAL VOC 2012. The experimental results show that OMA-based criteria provide more stable evaluation results than recall-based ones in revealing objectness measurement ability.
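The abstract does not give the definition of HPRS, so the following is only a minimal sketch of the underlying idea under a stated assumption: if an image has a finite pool of candidate windows, some of which hit the object, then the chance that sampling without replacement hits the object at least once follows the hypergeometric distribution. The function name `hit_probability` and its parameters are hypothetical and are not taken from the paper, and the paper's efficient HPRS algorithm and its OMA definition may differ.

```python
from math import comb


def hit_probability(total_windows: int, hitting_windows: int, draws: int) -> float:
    """Chance that at least one of `draws` windows hits the object.

    Assumption (not from the paper): windows are drawn uniformly at random
    without replacement from `total_windows` candidates, of which
    `hitting_windows` overlap the object enough to count as a hit.
    """
    if not 0 <= hitting_windows <= total_windows:
        raise ValueError("hitting_windows must lie in [0, total_windows]")
    if not 0 <= draws <= total_windows:
        raise ValueError("draws must lie in [0, total_windows]")
    missing_windows = total_windows - hitting_windows
    # Probability that every draw misses: C(misses, draws) / C(total, draws).
    # math.comb returns 0 when draws > misses, so a hit is then certain.
    all_miss = comb(missing_windows, draws) / comb(total_windows, draws)
    return 1.0 - all_miss


if __name__ == "__main__":
    # Toy numbers chosen for illustration only.
    for n in (1, 2, 4):
        print(n, hit_probability(total_windows=10, hitting_windows=2, draws=n))
```

A baseline of this kind shows how much of a method's recall could be obtained by chance alone for a given object and proposal budget, which is the motivation the abstract gives for moving away from raw recall.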


  • Object proposal evaluation
  • Objectness measurement ability
  • Hit probability of random sampling



This work is supported by the National Science Foundation of China (61321491, 61202320), the Undergraduate Innovation Project of Nanjing University (X201610284039), and the Collaborative Innovation Center of Novel Software Technology and Industrialization.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
  2. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
  3. Computing Department, The Hong Kong Polytechnic University, Hong Kong, China
