Nested Pictorial Structures
Abstract
We propose a theoretical construct, the nested pictorial structure, that represents an object by parts that are recursively nested. We introduce three ideas. First, the nested pictorial structure finds a part configuration whose geometric arrangement may deform while remaining topologically nested. Second, we define nested features, which allow a more detailed accounting of the pixel data cost and describe occlusion in a principled way. Third, we develop the constrained distance transform, a variation of the generalized distance transform, to guarantee the topological nesting relations and to further enforce that parts do not overlap. We show that optimally matching a nested pictorial structure of K parts to an image of N pixels takes O(NK) time using dynamic programming and the constrained distance transform. In our MATLAB/C++ implementation, globally optimal matching takes less than 0.1 seconds when K = 10 and N = 400 × 400. We demonstrate the usefulness of nested pictorial structures for matching objects with nested patterns, objects under occlusion, and objects that appear in a context.
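For orientation, the generalized distance transform that the constrained variant builds on can be sketched in one dimension as below. This is a minimal illustration of the standard lower-envelope algorithm for a squared-distance deformation cost, not the constrained distance transform itself, whose details are not given in this abstract. The function name `distance_transform_1d` and the choice of a squared cost are assumptions made for the example.

```python
import math


def distance_transform_1d(f):
    """Return d with d[p] = min over q of ((p - q)**2 + f[q]).

    Standard 1D generalized distance transform via the lower envelope
    of parabolas; runs in time linear in len(f).
    Assumes every value in f is finite (use a large number, not inf).
    """
    n = len(f)
    d = [0.0] * n
    v = [0] * n            # grid positions of parabolas in the lower envelope
    z = [0.0] * (n + 1)    # boundaries between consecutive envelope parabolas
    k = 0                  # index of the rightmost parabola in the envelope
    z[0], z[1] = -math.inf, math.inf

    for q in range(1, n):
        while True:
            p = v[k]
            # Intersection of the parabola rooted at q with the one rooted at p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1     # parabola at p is hidden by the new one; discard it
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf

    # Read the envelope back out at every grid position.
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d


def brute_force(f):
    """Quadratic-time reference used only to check the fast version."""
    n = len(f)
    return [min((p - q) ** 2 + f[q] for q in range(n)) for p in range(n)]
```

Applying a transform of this kind along rows and then columns costs time linear in the number of pixels N, and repeating that once per part in a dynamic program is consistent with the O(NK) bound stated above.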
Keywords
Nest Pattern; Pictorial Structure; Maximal Part; Nest Relation; Nest Feature