Unsupervised Learning of Models for Recognition
Abstract
We present a method to learn object class models from unlabeled and unsegmented cluttered scenes for the purpose of visual object recognition. We focus on a particular type of model where objects are represented as flexible constellations of rigid parts (features). The variability within a class is represented by a joint probability density function (pdf) on the shape of the constellation and the output of part detectors. In a first stage, the method automatically identifies distinctive parts in the training set by applying a clustering algorithm to patterns selected by an interest operator. It then learns the statistical shape model using expectation maximization. The method achieves very good classification results on human faces and rear views of cars.
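The first stage described above (selecting candidate patterns with an interest operator, then clustering them into part prototypes) can be illustrated with a minimal sketch. The abstract does not say which interest operator or clustering algorithm is used, so the choices here are assumptions made purely for illustration: local intensity variance on a non-overlapping grid stands in for the interest operator, and plain k-means stands in for the clustering step. The function names `select_patterns` and `kmeans` are hypothetical. The second stage, learning the shape model with expectation maximization, is not shown.

```python
import numpy as np


def select_patterns(image, size=5, n_patterns=10):
    """Stand-in interest operator (an assumption, not the paper's operator):
    return the n_patterns size-by-size patches with the highest variance."""
    h, w = image.shape
    patches, scores = [], []
    # Scan a non-overlapping grid of patches over the image.
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            patch = image[r:r + size, c:c + size]
            patches.append(patch.ravel())
            scores.append(patch.var())
    # Keep the highest-variance patches, flattened to row vectors.
    order = np.argsort(scores)[::-1][:n_patterns]
    return np.array(patches)[order]


def kmeans(patterns, k, n_iter=20, seed=0):
    """Plain k-means on row vectors (an assumed clustering choice).
    Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers from k distinct training patterns.
    idx = rng.choice(len(patterns), size=k, replace=False)
    centers = patterns[idx].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every pattern to every center.
        dists = ((patterns[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = patterns[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Recompute assignments against the final centers.
    dists = ((patterns[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.argmin(axis=1)


if __name__ == "__main__":
    # Synthetic stand-in for an unlabeled training set of cluttered scenes.
    rng = np.random.default_rng(0)
    images = [rng.random((40, 40)) for _ in range(5)]
    patterns = np.vstack([select_patterns(img) for img in images])
    centers, labels = kmeans(patterns, k=4)
    print(centers.shape, np.bincount(labels, minlength=4))
```

Each row of `centers` plays the role of a candidate part prototype. In a fuller pipeline, detections of these prototypes in the training images would supply the part locations from which the shape density is estimated.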
Keywords
Expectation Maximization, Training Image, Object Class, Unsupervised Learning, Expectation Maximization Algorithm