
Tools for Browsing a TV Situation Comedy Based on Content Specific Attributes

Published in: Multimedia Tools and Applications

Abstract

This paper presents general-purpose video analysis and annotation tools, which combine high-level and low-level information, and which learn through user interaction and feedback. The use of these tools is illustrated through the construction of two video browsers, which allow a user to fast forward (or rewind) to frames, shots, or scenes containing a particular character, characters, or other labeled content. The two browsers developed in this work are: (1) a basic video browser, which exploits relations between high-level scripting information and closed captions, and (2) an advanced video browser, which augments the basic browser with annotations gained from applying machine learning. The learner helps the system adapt to different people's labelings by accepting positive and negative examples of labeled content from a user, and relating these to low-level color and texture features extracted from the digitized video. This learning happens interactively, and is used to infer labels on data the user has not yet seen. The labeled data may then be browsed or retrieved from the database in real time. An evaluation of the learning performance shows that a combination of low-level color signal features outperforms several other combinations of signal features in learning character labels in an episode of the TV situation comedy, Seinfeld. We discuss several issues that arise in the combination of low-level and high-level information, and illustrate solutions to these issues within the context of browsing television sitcoms.
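The interactive-labeling idea described in the abstract can be sketched as follows: user-supplied positive and negative examples of a label (e.g. a character) are related to low-level color features, and a classifier then infers the label on frames the user has not yet seen, so the browser can fast forward to them. The sketch below is illustrative only, using a coarse RGB histogram and a nearest-neighbor rule; the feature sets and learner in the actual system differ, and all names here are hypothetical.

```python
from collections import Counter

def color_histogram(frame, bins=4):
    """Coarse normalized RGB histogram; frame is a list of (r, g, b) pixels in 0-255."""
    hist = Counter()
    for r, g, b in frame:
        hist[(r * bins // 256, g * bins // 256, b * bins // 256)] += 1
    total = float(len(frame))
    return {k: v / total for k, v in hist.items()}

def distance(h1, h2):
    """L1 distance between two sparse histograms."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)

def infer_label(frame, examples):
    """Nearest neighbor over user-labeled (frame, is_positive) examples."""
    h = color_histogram(frame)
    best = min(examples, key=lambda ex: distance(h, color_histogram(ex[0])))
    return best[1]

def fast_forward(frames, start, examples):
    """Index of the next frame after `start` inferred to contain the label, else None."""
    for i in range(start + 1, len(frames)):
        if infer_label(frames[i], examples):
            return i
    return None
```

Each new positive or negative example the user supplies is simply appended to `examples`, which is one way the system could adapt interactively to an individual user's labeling.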




Cite this article

Wachman, J.S., Picard, R.W. Tools for Browsing a TV Situation Comedy Based on Content Specific Attributes. Multimedia Tools and Applications 13, 255–284 (2001). https://doi.org/10.1023/A:1009681230513
