Object Perception: Generative Image Models and Bayesian Inference

  • Daniel Kersten
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2525)


Humans perceive object properties such as shape and material quickly and reliably despite the complexity and objective ambiguities of natural images. The visual system does this by integrating prior object knowledge with critical image features appropriate for each of a discrete number of tasks. Bayesian decision theory prescribes how to use knowledge optimally for a task, and this prescription can guide the development of possibly sub-optimal models of human vision. However, formulating optimal theories for realistic vision problems is non-trivial, and we can gain insight into visual inference by first characterizing the causal structure of image features, that is, the generative model. I describe some experimental results that apply generative models and Bayesian decision theory to investigate human object perception.


Bayesian Inference; Image Measurement; Object Perception; Ideal Observer; Bayesian Decision Theory
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Daniel Kersten, Psychology Department, University of Minnesota, Minneapolis, USA
