Abstract
We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by (1) solving for correspondences between points on the two shapes, and (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; regularized thin-plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. We also demonstrate that shape contexts can be used to quickly prune a search for similar shapes. We present two algorithms for rapid shape retrieval: representative shape contexts, performing comparisons based on a small number of shape contexts, and shapemes, using vector quantization in the space of shape contexts to obtain prototypical shape pieces. Results are presented for silhouettes, handwritten digits and visual CAPTCHAs.
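The correspondence step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The log-polar binning (5 radial by 12 angular bins, radii from 0.125 to 2 after normalizing by the mean pairwise distance), the chi-square histogram cost, and all function names are assumptions chosen for illustration. The brute-force search over permutations stands in for an efficient optimal assignment solver and is only practical for a handful of points. The thin-plate spline alignment step is not covered.

```python
import math
from itertools import permutations

def mean_pairwise_distance(points):
    """Mean distance over all point pairs, used to normalize for scale."""
    dists = [math.hypot(p[0] - q[0], p[1] - q[1])
             for i, p in enumerate(points) for q in points[i + 1:]]
    return math.fsum(dists) / len(dists)

def shape_contexts(points, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """One normalized log-polar histogram per point (bin layout is an assumption)."""
    scale = mean_pairwise_distance(points)
    log_span = math.log(r_max / r_min)
    contexts = []
    for i, (px, py) in enumerate(points):
        hist = [0.0] * (n_r * n_theta)
        for j, (x, y) in enumerate(points):
            if j == i:
                continue
            dx, dy = x - px, y - py
            r = math.hypot(dx, dy) / scale
            if r >= r_max:
                continue  # beyond the outer ring: not counted
            # Log-spaced radial bins; anything closer than r_min goes to bin 0.
            if r <= r_min:
                r_bin = 0
            else:
                r_bin = min(n_r - 1, int(n_r * math.log(r / r_min) / log_span))
            theta = math.atan2(dy, dx) % (2.0 * math.pi)
            t_bin = min(n_theta - 1, int(theta / (2.0 * math.pi) * n_theta))
            hist[r_bin * n_theta + t_bin] += 1.0
        total = sum(hist)
        contexts.append([h / total for h in hist] if total else hist)
    return contexts

def chi2_cost(g, h):
    """Chi-square distance between two normalized histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(g, h) if a + b > 0)

def match_shapes(points_a, points_b):
    """Return (mapping, total_cost), where mapping[i] indexes into points_b.

    Brute force over all permutations: a stand-in for a proper
    assignment solver, usable only for very small point sets.
    """
    if len(points_a) != len(points_b):
        raise ValueError("this sketch assumes equally sized point sets")
    ctx_a, ctx_b = shape_contexts(points_a), shape_contexts(points_b)
    cost = [[chi2_cost(g, h) for h in ctx_b] for g in ctx_a]
    n = len(points_a)

    def total(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return list(best), total(best)
```

Calling `match_shapes(a, b)` on two equally sized lists of `(x, y)` tuples returns the assignment with the lowest summed histogram cost. Because the histograms are built from offsets divided by the mean pairwise distance, a translated and uniformly scaled copy of a shape should match the original at near-zero cost. Rotation invariance is not provided by this sketch.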
Copyright information
© 2006 Birkhäuser Boston
About this chapter
Cite this chapter
Belongie, S., Mori, G., Malik, J. (2006). Matching with Shape Contexts. In: Krim, H., Yezzi, A. (eds) Statistics and Analysis of Shapes. Modeling and Simulation in Science, Engineering and Technology. Birkhäuser Boston. https://doi.org/10.1007/0-8176-4481-4_4
DOI: https://doi.org/10.1007/0-8176-4481-4_4
Publisher Name: Birkhäuser Boston
Print ISBN: 978-0-8176-4376-8
Online ISBN: 978-0-8176-4481-9
eBook Packages: Mathematics and Statistics (R0)