Abstract
A natural way to improve the detection of objects is to consider the contextual constraints imposed by the detection of additional objects in a given scene. In this work, we exploit the spatial relations between objects in order to improve detection capacity, as well as analyze various properties of the contextual object detection problem. To precisely calculate context-based probabilities of objects, we developed a model that examines the interactions between objects in an exact probabilistic setting, in contrast to previous methods that typically utilize approximations based on pairwise interactions. Such a scheme is facilitated by the realistic assumption that the existence of an object in any given location is influenced by only few informative locations in space. Based on this assumption, we suggest a method for identifying these relevant locations and integrating them into a mostly exact calculation of probability based on their raw detector responses. This scheme is shown to improve detection results and provides unique insights about the process of contextual inference for object detection. We show that it is generally difficult to learn that a particular object reduces the probability of another, and that in cases when the context and detector strongly disagree this learning becomes virtually impossible for the purposes of improving the results of an object detector. Finally, we demonstrate improved detection results through use of our approach as applied to the PASCAL VOC and COCO datasets.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Arbel, N., Avraham, T., Lindenbaum, M.: Inner-scene similarities as a contextual cue for object detection (2017). arXiv preprint
Bell, S., Lawrence Zitnick, C., Bala, K., Girshick, R.: Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks. In: CVPR, pp. 2874–2883 (2016)
Chechetka, A., Guestrin, C.: Evidence-specific structures for rich tractable CRFs. In: Advances in Neural Information Processing Systems, pp. 352–360 (2010)
Cinbis, R.G., Sclaroff, S.: Contextual object detection using set-based classification. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 43–57. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3_4
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, pp. 886–893 (2005)
Desai, C., Ramanan, D., Fowlkes, C.C.: Discriminative models for multi-class object layout. Int. J. Comput. Vis. 95(1), 1–12 (2011)
Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: CVPR, pp. 1–8 (2008)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)
Galleguillos, C., Rabinovich, A., Belongie, S.: Object categorization using co-occurrence, location and appearance. In: CVPR, pp. 1–8 (2008)
Girshick, R.: Fast R-CNN. In: ICCV, pp. 1440–1448 (2015)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR (2014)
Heitz, G., Koller, D.: Learning spatial context: using stuff to find things. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 30–43. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88682-2_4
Hoiem, D., Efros, A.A., Hebert, M.: Putting objects in perspective. Int. J. Comput. Vis. 80(1), 3–15 (2008)
Kohli, P., Rother, C.: Higher-order models in computer vision. In: Image Processing and Analysing with Graphs: Theory and Practice, Chap. 3, pp. 65–92. CRC Press (2012)
Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge (2009)
Komodakis, N., Tziritas, G.: Image completion using efficient belief propagation via priority scheduling and dynamic pruning. IEEE Trans. Pattern Anal. Mach. Intell. 16(11), 2649–2661 (2007)
Li, J., Wei, Y., Liang, X., Dong, J., Xu, T., Feng, J., Yan, S.: Attentive contexts for object detection. IEEE Trans. Multimed. 19(5), 944–954 (2017)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part V. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Luo, W., Li, Y., Urtasun, R., Zemel, R.: Understanding the effective receptive field in deep convolutional neural networks. In: NIPS, pp. 4898–4906 (2016)
Mairon, R., Ben-Shahar, O.: A closer look at context: from coxels to the contextual emergence of object saliency. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part V. LNCS, vol. 8693, pp. 708–724. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_46
Mottaghi, R., et al.: The role of context for object detection and semantic segmentation in the wild. In: CVPR, pp. 891–898 (2014)
Oramas, J., Tuytelaars, T.: Recovering hard-to-find object instances by sampling context-based object proposals. CVIU 152, 118–130 (2016)
José Oramas, M., De Raedt, L., Tuytelaars, T.: Allocentric pose estimation. In: ICCV (2013)
José Oramas, M., De Raedt, L., Tuytelaars, T.: Towards cautious collective inference for object verification. In: WACV (2014)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco (1988)
Perko, R., Leonardis, A.: A framework for visual-context-aware object detection in still images. CVIU 114(6), 700–711 (2010)
Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., Belongie, S.: Objects in context. In: ICCV, pp. 1–8. IEEE (2007)
Ren, S., He, K., Girshick, R., Sun, J.: Fast R-CNN: towards real-time object detection with region proposal networks. In: NIPS, pp. 91–99 (2015)
Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge (2014)
Torralba, A., Murphy, K.P., Freeman, W.T.: Contextual models for object detection using boosted random fields. In: NIPS, pp. 1401–1408 (2004)
Torralba, A., Sinha, P.: Statistical context priming for object detection. In: ICCV, vol. 1, pp. 763–770. IEEE (2001)
Wolf, L., Bileschi, S.: A critical view of context. Int. J. Comput. Vis. 69(2), 251–261 (2006)
Yu, R., Chen, X., Morariu, V.I., Davis, L.S.: The role of context selection in object detection. British Machine Vision Conference abs/1609.02948 (2016)
Acknowledgments
This research was supported in part by Israel Ministry of Science, Technology and Space (MOST Grant 54178). We also thank the Frankel Fund and the Helmsley Charitable Trust through the ABC Robotics Initiative, both at Ben-Gurion University of the Negev, for their generous support.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Barnea, E., Ben-Shahar, O. (2019). Contextual Object Detection with a Few Relevant Neighbors. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11362. Springer, Cham. https://doi.org/10.1007/978-3-030-20890-5_31
Download citation
DOI: https://doi.org/10.1007/978-3-030-20890-5_31
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20889-9
Online ISBN: 978-3-030-20890-5
eBook Packages: Computer ScienceComputer Science (R0)