Learning the Topology of Object Views

  • Jan Wieghardt
  • Rolf P. Würtz
  • Christoph von der Malsburg
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2353)


A visual representation of an object must meet at least three basic requirements. First, it must allow identification of the object in the presence of slight but unpredictable changes in its visual appearance. Second, it must account for larger changes in appearance due to variations in the object’s fundamental degrees of freedom, such as changes in pose. Third, any object representation must be derivable from visual input alone, i.e., it must be learnable.

Here we construct such a representation by deriving transformations between the different views of a given object, so that they can be parameterized in terms of the object’s physical degrees of freedom. Our method automatically derives the appearance representations of an object, together with their linear deformation model, from example images. These are subsequently used to provide linear charts of the entire appearance manifold of a three-dimensional object. In contrast to approaches aiming at mere dimensionality reduction, the local linear charts of the object’s appearance manifold are estimated on a strictly local basis, avoiding any reference to a metric embedding space for all views. In this way, a real understanding of the object’s appearance in terms of its physical degrees of freedom is learned from single views alone.
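
The idea of a local linear chart can be illustrated with a minimal sketch. It is not the authors' method: it assumes that each view is already given as a small feature vector, that the neighboring views of a reference view are known (for example, adjacent frames of a sequence), and that a single degree of freedom is present. The names `fit_local_mode`, `chart_coordinate` and `toy_view` are hypothetical, and the leading deformation mode is found with a plain power iteration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_local_mode(reference, neighbors, iterations=50):
    """Return the leading linear deformation mode (unit vector) around `reference`.

    Only differences between the reference view and its given neighbors are
    used, so the estimate is local. Hypothetical helper, not from the paper.
    """
    diffs = [[n - r for n, r in zip(view, reference)] for view in neighbors]
    # Start the power iteration from the largest observed deformation.
    mode = max(diffs, key=lambda d: dot(d, d))
    for _ in range(iterations):
        # Apply the scatter matrix S = sum_i d_i d_i^T without forming it.
        new = [0.0] * len(reference)
        for d in diffs:
            w = dot(d, mode)
            new = [acc + w * x for acc, x in zip(new, d)]
        norm = math.sqrt(dot(new, new))
        if norm == 0.0:
            raise ValueError("neighbors do not differ from the reference view")
        mode = [x / norm for x in new]
    # Fix the sign so that the largest component is positive.
    if max(mode, key=abs) < 0:
        mode = [-x for x in mode]
    return mode

def chart_coordinate(view, reference, mode):
    """Coordinate of `view` in the one-dimensional chart around `reference`."""
    return dot([v - r for v, r in zip(view, reference)], mode)

def toy_view(theta):
    # Assumed toy data: an "object" whose appearance traces a circle
    # as its single degree of freedom theta (e.g. pose angle) changes.
    return [math.cos(theta), math.sin(theta), 0.0]

reference = toy_view(0.0)
neighbors = [toy_view(t) for t in (-0.2, -0.1, 0.1, 0.2)]
mode = fit_local_mode(reference, neighbors)
coord = chart_coordinate(toy_view(0.15), reference, mode)
print(mode, coord)
```

On this toy data the chart coordinate of a nearby view stays close to its pose angle, which is the sense in which a local chart parameterizes appearance by a physical degree of freedom. The approximation degrades as views move away from the reference, which is why several charts would be needed to cover a whole manifold.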


Keywords: Object recognition, pose estimation, view sphere, correspondence maps, learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Jan Wieghardt (1)
  • Rolf P. Würtz (2)
  • Christoph von der Malsburg (2, 3)
  1. SIEMENS AG, CT SE 1, München, Germany
  2. Institut für Neuroinformatik, Ruhr-Universität Bochum, Bochum, Germany
  3. Laboratory for Computational and Biological Vision, University of Southern California, USA
