Scene-Centered Description from Spatial Envelope Properties

  • Aude Oliva
  • Antonio Torralba
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2525)


In this paper, we propose a scene-centered representation that provides a meaningful description of real-world images at multiple levels of categorization, from superordinate to subordinate. The representation is based on estimating spatial envelope properties that describe the shape of a scene (e.g., size, perspective, mean depth) and the nature of its content. The approach is holistic: it requires no segmentation phase, grouping mechanisms, 3D construction, or object-centered analysis.


Keywords (added by machine, not by the authors): Large Space, Human Observer, Natural Scene, Scene Image, Verbal Label





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Aude Oliva (1)
  • Antonio Torralba (2)
  1. Department of Psychology and Cognitive Science Program, Michigan State University, East Lansing, USA
  2. Artificial Intelligence Laboratory, MIT, Cambridge, USA
