Beyond Bounding-Boxes: Learning Object Shape by Model-Driven Grouping

Monroy, Antonio; Ommer, Björn

doi:10.1007/978-3-642-33712-3_42

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7574)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Visual recognition requires learning object models from training data. Commonly, training samples are annotated by marking only the bounding box of each object, since this appears to be the best trade-off between labeling information and effectiveness. However, objects are typically not box-shaped. Thus, the usual parametrization of object hypotheses by only their location, scale, and aspect ratio seems inappropriate, since the box contains a significant amount of background clutter. Most importantly, object shape becomes explicit only once objects are segregated from the background. Because segmentation is an ill-posed problem, we propose an approach that learns object models for detection while simultaneously learning to segregate objects from clutter and to extract their overall shape. For this purpose, we use only bounding-box-annotated training data. The approach groups fragmented object regions using the Multiple Instance Learning (MIL) framework to obtain a meaningful representation of object shape which, at the same time, crops away distracting background clutter to improve the appearance representation.
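
To make the MIL idea concrete, here is a minimal sketch of the standard multiple-instance assumption: a positive bag contains at least one positive instance, so a bag is scored by its best-scoring instance. This is not the authors' implementation. Treating each bounding box as a bag of candidate region groupings, the 2-D feature vectors, the weight vector `w`, and the helper names `instance_score` and `bag_score` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def instance_score(w: Sequence[float], x: Sequence[float], b: float = 0.0) -> float:
    """Linear score of one instance (here: one hypothetical region grouping)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def bag_score(
    w: Sequence[float], bag: List[Sequence[float]], b: float = 0.0
) -> Tuple[float, int]:
    """Score a bag by its best instance and return (score, index of that instance).

    Under the MIL assumption, a positive bag needs only one positive
    instance, so the maximum over instances decides the bag label.
    """
    scores = [instance_score(w, x, b) for x in bag]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


# Hypothetical data: each bag stands for one annotated bounding box,
# each instance for one candidate grouping of regions inside that box.
w = [1.0, -1.0]  # assumed, already-learned weights
positive_bag = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
negative_bag = [[0.1, 0.7], [0.3, 0.6]]

pos_score, pos_idx = bag_score(w, positive_bag)
neg_score, neg_idx = bag_score(w, negative_bag)

print("positive bag:", pos_score > 0, "selected instance", pos_idx)
print("negative bag:", neg_score > 0, "selected instance", neg_idx)
```

In this toy setting, the instance selected in a positive bag plays the role of the grouping that best explains the object, while every instance in a negative bag scores below zero.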

Author information

Authors and Affiliations

Interdisciplinary Center for Scientific Computing, University of Heidelberg, Germany
Antonio Monroy & Björn Ommer

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Monroy, A., Ommer, B. (2012). Beyond Bounding-Boxes: Learning Object Shape by Model-Driven Grouping. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33712-3_42

DOI: https://doi.org/10.1007/978-3-642-33712-3_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33711-6
Online ISBN: 978-3-642-33712-3
eBook Packages: Computer Science, Computer Science (R0)
