Visual Information Processing: The Structure and Creation of Visual Representations

  • David Marr
Conference paper
Part of the Lecture Notes in Biomathematics book series (LNBM, volume 44)

Summary

For human vision to be explained by a computational theory, the first question is plain: What are the problems the brain solves when we see? It is argued that vision is the construction of efficient symbolic descriptions from images of the world. An important aspect of vision is therefore the choice of representations for the different kinds of information in a visual scene. An overall framework is suggested for extracting shape information from images, in which the analysis proceeds through three representations: (1) the primal sketch, which makes explicit the intensity changes and local two-dimensional geometry of an image; (2) the 2½-D sketch, which is a viewer-centred representation of the depth, orientation and discontinuities of the visible surfaces; and (3) the 3-D model representation, which allows an object-centred description of the three-dimensional structure and organization of a viewed shape. The critical act in formulating computational theories for processes capable of constructing these representations is the discovery of valid constraints on the way the world behaves that provide sufficient additional information to allow recovery of the desired characteristic. Finally, once a computational theory for a process has been formulated, algorithms for implementing it may be designed, and their performance compared with that of the human visual processor.
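
As a rough illustration of what "making intensity changes explicit" could mean for the first representation, the following minimal sketch locates intensity changes in a one-dimensional signal by smoothing it with a Gaussian and marking sign changes of the second difference. It is not taken from the paper: the function names, the one-dimensional simplification, the border clamping and the default `sigma` are all assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights (hypothetical helper; radius of 3*sigma is an assumption)."""
    radius = int(3 * sigma)
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Convolve the signal with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def find_intensity_changes(signal, sigma=2.0):
    """Return indices i such that an intensity change lies between samples i and i + 1.

    A change is marked where the second difference of the smoothed signal
    switches sign between neighbouring samples. Crossings that pass through
    an exact zero sample are not reported by this simple test.
    """
    s = smooth(signal, gaussian_kernel(sigma))
    # second[k] is the second difference centred on sample k + 1
    second = [s[i - 1] - 2 * s[i] + s[i + 1] for i in range(1, len(s) - 1)]
    changes = []
    for k in range(len(second) - 1):
        if second[k] * second[k + 1] < 0:
            changes.append(k + 1)  # shift back to signal coordinates
    return changes


if __name__ == "__main__":
    step = [0.0] * 20 + [10.0] * 20  # a single step edge between samples 19 and 20
    print(find_intensity_changes(step))
```

A larger `sigma` would smooth away finer detail and report only coarser changes; the single value used here is only a placeholder.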

Copyright information

© Springer-Verlag Berlin Heidelberg 1982

Authors and Affiliations

  • David Marr (1)
  1. M.I.T. Artificial Intelligence Laboratory and Department of Psychology, Cambridge, USA
