Summary
For human vision to be explained by a computational theory, the first question is plain: what are the problems the brain solves when we see? It is argued that vision is the construction of efficient symbolic descriptions from images of the world. An important aspect of vision is therefore the choice of representations for the different kinds of information in a visual scene. An overall framework is suggested for extracting shape information from images, in which the analysis proceeds through three representations: (1) the primal sketch, which makes explicit the intensity changes and local two-dimensional geometry of an image; (2) the 2 1/2-D sketch, which is a viewer-centred representation of the depth, orientation and discontinuities of the visible surfaces; and (3) the 3-D model representation, which allows an object-centred description of the three-dimensional structure and organization of a viewed shape. The critical act in formulating computational theories for processes capable of constructing these representations is the discovery of valid constraints on the way the world behaves that provide sufficient additional information to allow recovery of the desired characteristic. Finally, once a computational theory for a process has been formulated, algorithms for implementing it may be designed and their performance compared with that of the human visual processor.
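As a rough illustration of the three-stage framework, the representations can be sketched as plain data types. This is only a minimal sketch under stated assumptions: all class, field and function names below are hypothetical, the encoding of orientation as a pair of numbers and of 3-D organization as a list of parts are assumptions, and `toy_primal_sketch` uses a simple thresholded horizontal difference as a stand-in, not the chapter's method for finding intensity changes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical image type: grey-level intensities stored row by row.
Image = List[List[float]]


@dataclass
class PrimalSketch:
    """Stage 1: makes intensity changes in the image explicit."""
    # Each entry is (row, column, signed intensity change).
    intensity_changes: List[Tuple[int, int, float]] = field(default_factory=list)


@dataclass
class SurfacePatch:
    """One viewer-centred surface element (assumed encoding)."""
    depth: float                      # distance from the viewer
    orientation: Tuple[float, float]  # assumed two-number encoding
    is_discontinuity: bool = False    # marks a break in the visible surface


@dataclass
class TwoAndHalfDSketch:
    """Stage 2: depth, orientation and discontinuities of visible surfaces."""
    patches: List[SurfacePatch] = field(default_factory=list)


@dataclass
class Model3D:
    """Stage 3: object-centred description; the parts list is an assumption."""
    name: str
    parts: List["Model3D"] = field(default_factory=list)


def toy_primal_sketch(image: Image, threshold: float = 0.5) -> PrimalSketch:
    """Record horizontal neighbour differences whose size meets the threshold.

    A stand-in for a real intensity-change detector, for illustration only.
    """
    sketch = PrimalSketch()
    for r, row in enumerate(image):
        for c in range(len(row) - 1):
            change = row[c + 1] - row[c]
            if abs(change) >= threshold:
                sketch.intensity_changes.append((r, c, change))
    return sketch
```

For example, calling `toy_primal_sketch` on a small image with a dark left half and a bright right half records one intensity change per row, at the column where the brightness steps up. The later stages are left as empty containers because the summary does not specify how they are computed.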
Copyright information
© 1982 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marr, D. (1982). Visual Information Processing: The Structure and Creation of Visual Representations. In: Albrecht, D.G. (eds) Recognition of Pattern and Form. Lecture Notes in Biomathematics, vol 44. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-93199-4_4
DOI: https://doi.org/10.1007/978-3-642-93199-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-11206-8
Online ISBN: 978-3-642-93199-4
eBook Packages: Springer Book Archive