Photobook: Content-Based Manipulation of Image Databases

  • A. Pentland
  • R. W. Picard
  • S. Sclaroff
Part of The Kluwer International Series in Engineering and Computer Science book series (SECS, volume 359)


We describe the Photobook system, which is a set of interactive tools for browsing and searching images and image sequences. These query tools differ from those used in standard image databases in that they make direct use of the image content rather than relying on text annotations. Direct search on image content is made possible by use of semantics-preserving image compression, which reduces images to a small set of perceptually-significant coefficients. We discuss three types of Photobook descriptions in detail: one that allows search based on appearance, one that uses 2-D shape, and a third that allows search based on textural properties. These image content descriptions can be combined with each other and with text-based descriptions to provide a sophisticated browsing and search capability. In this paper we demonstrate Photobook on databases containing images of people, video keyframes, hand tools, fish, texture swatches, and 3-D medical data.
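The idea of searching on a small set of coefficients rather than on raw pixels can be illustrated with a minimal sketch. The abstract does not specify how the compression is computed, so the example below assumes a plain principal-component projection as a stand-in and ranks database images by Euclidean distance between coefficient vectors. The function names (`fit_basis`, `encode`, `search`), the synthetic data, and the choice of 10 coefficients are all hypothetical and are not taken from Photobook itself.

```python
import numpy as np

def fit_basis(images, k):
    """Return the mean image and the top-k principal directions.

    images: (n, d) array, one flattened image per row (assumed layout).
    """
    mean = images.mean(axis=0)
    # SVD of the centered data; rows of vt are orthonormal directions
    # ordered by decreasing variance.
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, vt[:k]

def encode(images, mean, basis):
    """Project images onto the basis, giving k coefficients per image."""
    return (images - mean) @ basis.T

def search(query_coeffs, db_coeffs, top=3):
    """Return indices of the `top` database entries nearest to the query."""
    dists = np.linalg.norm(db_coeffs - query_coeffs, axis=1)
    return np.argsort(dists)[:top]

# Synthetic stand-in data: 50 random 8x8 "images", flattened to 64 values.
rng = np.random.default_rng(0)
database = rng.random((50, 64))

mean, basis = fit_basis(database, k=10)
db_coeffs = encode(database, mean, basis)   # 64 pixels -> 10 coefficients

# Query with a slightly perturbed copy of database image 7.
query = database[7] + 0.01 * rng.standard_normal(64)
query_coeffs = encode(query[None, :], mean, basis)[0]
ranked = search(query_coeffs, db_coeffs)
print(ranked)
```

In this sketch the database is encoded once, and each query costs one projection plus a distance computation over short coefficient vectors, which is what makes interactive browsing plausible. Shape-based or texture-based descriptions would need different encodings; only the search step would carry over unchanged.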







Copyright information

© Kluwer Academic Publishers 1996

Authors and Affiliations

  • A. Pentland (1)
  • R. W. Picard (1)
  • S. Sclaroff (1, 2)
  1. Perceptual Computing Section, The Media Laboratory, Massachusetts Institute of Technology, Cambridge, USA
  2. Computer Science Department, Boston University, USA
