Abstract
In this chapter we present a principled Bayesian method for detecting and segmenting instances of a particular object category within an image, providing a coherent methodology for combining top-down and bottom-up cues. The work draws together two powerful formulations, pictorial structures (PS) and Markov random fields (MRFs), both of which have efficient algorithms for their solution. The resulting combination, which we call the object category specific MRF, suggests a solution to a problem that has long dogged MRFs, namely that they provide a poor prior for specific shapes. In contrast, our model uses the PS to provide a prior that is global across the image plane. We develop an efficient method, ObjCut, to obtain segmentations using this model. Novel aspects of this method include an efficient algorithm for sampling the PS model, and the observation that the expected log likelihood of the model can be increased by a single graph cut. Results are presented on two object categories, cows and horses. We compare our method to the state of the art in object category specific image segmentation and demonstrate significant improvements.
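To make the idea of combining a shape prior with an MRF concrete, the following is a minimal sketch and not the ObjCut method itself. It assumes a binary labelling problem on a five-pixel row, where each pixel's unary cost is the sum of a bottom-up appearance cost and a top-down shape cost, and neighbouring pixels pay a Potts penalty for disagreeing. The function names `min_cut_labels` and `energy`, and all cost values, are hypothetical and chosen only for illustration. The sampling of the PS model and the expected log likelihood computation mentioned in the abstract are not shown. The energy is minimised with a single s-t minimum cut, computed here with a plain Edmonds-Karp max-flow.

```python
from collections import deque


def energy(labels, unary_fg, unary_bg, edges, smooth):
    """Energy of a binary labelling (1 = foreground, 0 = background)."""
    unary = sum(unary_fg[i] if m else unary_bg[i] for i, m in enumerate(labels))
    pairwise = sum(smooth for i, j in edges if labels[i] != labels[j])
    return unary + pairwise


def min_cut_labels(unary_fg, unary_bg, edges, smooth):
    """Minimise the energy above with one s-t min cut (costs must be >= 0)."""
    n = len(unary_fg)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = unary_bg[i]  # cut if pixel i ends up on the sink side
        cap[i][t] = unary_fg[i]  # cut if pixel i stays on the source side
    for i, j in edges:
        cap[i][j] += smooth      # cut if i and j take different labels
        cap[j][i] += smooth

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push flow.
        flow = float("inf")
        v = t
        while v != s:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= flow
            cap[v][parent[v]] += flow
            v = parent[v]

    # Pixels still reachable from the source are labelled foreground.
    return [1 if parent[i] != -1 else 0 for i in range(n)]


# Hypothetical costs for a five-pixel row (lower cost = preferred label).
appearance_fg = [4, 1, 3, 1, 4]   # bottom-up cost of calling a pixel foreground
appearance_bg = [1, 3, 2, 3, 1]   # bottom-up cost of calling it background
shape_fg = [3, 0, 0, 0, 3]        # top-down prior: object expected in the middle
shape_bg = [0, 2, 2, 2, 0]

unary_fg = [a + b for a, b in zip(appearance_fg, shape_fg)]
unary_bg = [a + b for a, b in zip(appearance_bg, shape_bg)]
edges = [(i, i + 1) for i in range(4)]

labels = min_cut_labels(unary_fg, unary_bg, edges, smooth=1)
print(labels)  # -> [0, 1, 1, 1, 0]
```

In this toy setting the appearance costs alone favour background at the middle pixel, and the shape term tips it to foreground, which illustrates how a top-down prior can resolve ambiguous bottom-up evidence within the same cut.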
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kumar, M.P., Torr, P.H.S., Zisserman, A. (2006). An Object Category Specific MRF for Segmentation. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_30
DOI: https://doi.org/10.1007/11957959_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science, Computer Science (R0)