Shape Recognition in Mind, Brain, and Machine

  • Irving Biederman
  • John E. Hummel
  • Eric E. Cooper
  • Peter C. Gerhardstein
Conference paper
Part of the Research Notes in Neural Computing book series (NEURALCOMPUTING, volume 4)


We present an overview of our recent work on object recognition. One issue concerns what aspects of performance should be modeled. We have focused on real-time activation of a representation of entry-level classes from line drawings. Three striking and fundamental characteristics of such recognition are its invariance with viewpoint in depth (including scale), its ability to operate on unfamiliar objects, and its robustness with the actual contours present in an image (as long as the same convex parts [geons] can be activated). These characteristics are expressed in an implemented neural network model (Hummel & Biederman, 1992) that takes a line drawing of an object as input and generates a structural description of geons and their relations which is then used for object classification. The model’s capacity for structural description derives from its solution to the dynamic binding problem of neural networks: Independent units representing an object’s parts (in terms of their shape attributes and interrelations) are bound temporarily when those attributes occur in conjunction in the system’s input. Temporary conjunctions of attributes are represented by synchronized activity among the units representing those attributes. Specifically, the model induces temporal correlation in the firing of activated units to: a) parse images into their constituent parts; b) bind together the attributes of a part; and c) determine the relations among the parts and bind them to the parts to which they apply. Because it conjoins independent units temporarily, dynamic binding allows tremendous economy of representation, and permits the representation to reflect an object’s attribute structure. The model’s recognition performance conforms well to recent results from shape priming experiments. Moreover, the manner in which the model’s performance degrades due to accidental synchrony produced by an excess of phase sets suggests a basis for a theory of visual attention.
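The binding-by-synchrony idea described above can be illustrated with a toy sketch. This is not the Hummel & Biederman (1992) network: the attribute names, the discrete firing slots, and the `n_slots` cap are all assumptions introduced here for illustration. Units that fire in the same slot are treated as bound into one part, and when there are more parts than slots, attributes of different parts fall into the same slot and are bound by accident.

```python
from collections import defaultdict

def assign_phases(parts, n_slots):
    """Give every attribute unit of part i the same firing slot, i mod n_slots.

    n_slots is a hypothetical cap on distinguishable phases, used only to
    illustrate what happens when there are more parts than slots.
    """
    return [(unit, i % n_slots) for i, part in enumerate(parts) for unit in sorted(part)]

def bind_by_synchrony(events):
    """Treat units that fire in the same slot as temporarily bound together."""
    groups = defaultdict(set)
    for unit, slot in events:
        groups[slot].add(unit)
    return [frozenset(groups[slot]) for slot in sorted(groups)]

# Illustrative two-part object; the attribute labels are made up for this sketch.
parts = [
    {"cross-section: curved", "sides: nonparallel", "relation: above"},
    {"cross-section: straight", "sides: parallel", "relation: below"},
]

# Enough slots: each part's attributes are bound separately.
separate = bind_by_synchrony(assign_phases(parts, n_slots=2))

# Too few slots: accidental synchrony merges the two parts into one set.
merged = bind_by_synchrony(assign_phases(parts, n_slots=1))
```

Note that the sketch needs only one unit per attribute and reuses those units across parts, which is the economy of representation the abstract attributes to dynamic binding; a static scheme would need a dedicated unit for every attribute conjunction.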


Line Drawing, Structural Description, Complementary Image, Motor Interaction, Strong Invariance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115–147.
  2. Biederman, I., & Cooper, E. E. (1991a). Priming contour-deleted images: Evidence for intermediate representations in visual object recognition. Cognitive Psychology, 23, 393–419.
  3. Biederman, I., & Cooper, E. E. (1991b). Evidence for complete translational and reflectional invariance in visual object priming. Perception, in press.
  4. Biederman, I., & Cooper, E. E. (1992). Size invariance in visual object priming. Journal of Experimental Psychology: Human Perception and Performance, 18, 121–133.
  5. Cooper, E. E., Biederman, I., & Hummel, J. E. (1992). Metric invariance: A review and further evidence. Canadian Journal of Psychology, in press.
  6. Crick, F. H. C. (1984). Function of the thalamic reticular complex: The searchlight hypothesis. Proceedings of the National Academy of Sciences, USA, 81, 4586–4590.
  7. Gerhardstein, P. C., & Biederman, I. (1991). 3D orientation invariance in visual object recognition. Paper presented at the Annual Meeting of the Association for Research in Vision and Ophthalmology, Sarasota, FL, May.
  8. Hummel, J. E., & Biederman, I. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review, in press.
  9. Lowe, D. G. (1987). The viewpoint consistency constraint. International Journal of Computer Vision, 1, 57–72.
  10. Mishkin, M., & Appenzeller, T. (1987). The anatomy of memory. Scientific American, 256, 80–89.
  11. Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782–784.
  12. Ullman, S. (1989). Aligning pictorial descriptions: An approach to object recognition. Cognition, 32, 193–254.
  13. Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
  14. von der Malsburg, C. (1987). Synaptic plasticity as a basis of brain organization. In J.-P. Changeux & M. Konishi (Eds.), The Neural and Molecular Bases of Learning (pp. 411–432). John Wiley & Sons Limited.

Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Irving Biederman (1)
  • John E. Hummel (2)
  • Eric E. Cooper (3)
  • Peter C. Gerhardstein (3)

  1. Department of Psychology, Hedco Neuroscience Bldg., University of Southern California, USA
  2. Department of Psychology, Franz Hall, University of California at Los Angeles, USA
  3. Department of Psychology, Elliott Hall, University of Minnesota, Minneapolis, USA
