Multimedia Tools and Applications, Volume 24, Issue 3, pp 253–272

A Formal Model for Video Shot Segmentation and its Application via Animate Vision

  • Massimiliano Albanese
  • Angelo Chianese
  • Vincenzo Moscato
  • Lucio Sansone


The first step in a video indexing process is the segmentation of videos into meaningful parts called shots. In this paper we present a formal model of the video shot segmentation process. Starting from a mathematical characterization of the most common transition effects, we propose a video segmentation algorithm capable of detecting both abrupt and gradual transitions. The algorithm is based on the computation of an arbitrary similarity measure between consecutive frames of a video. It has been tested with a similarity metric based on Animate Vision theory, and the results are reported.
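The idea of comparing consecutive frames with a pluggable similarity measure can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes grayscale frames given as flat sequences of intensities, uses histogram intersection as a stand-in for the Animate Vision metric, and flags only abrupt cuts with a fixed threshold. The names `histogram`, `histogram_intersection`, `detect_cuts`, and the threshold value of 0.5 are hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence

# Assumption: a frame is a non-empty flat sequence of grayscale values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return the normalized intensity histogram of a frame."""
    counts = [0] * bins
    for value in frame:
        # Map an intensity in 0..255 to one of `bins` equal-width bins.
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_intersection(a: Frame, b: Frame) -> float:
    """Similarity in [0, 1]; 1.0 means the two histograms are identical."""
    return sum(min(x, y) for x, y in zip(histogram(a), histogram(b)))


def detect_cuts(
    frames: Sequence[Frame],
    similarity: Callable[[Frame, Frame], float] = histogram_intersection,
    threshold: float = 0.5,  # illustrative value, not taken from the paper
) -> List[int]:
    """Return each index i where frame i is too dissimilar from frame i - 1."""
    return [
        i
        for i in range(1, len(frames))
        if similarity(frames[i - 1], frames[i]) < threshold
    ]


# Toy video: three dark frames followed by two bright frames.
dark = [10] * 16
bright = [240] * 16
video = [dark, dark, dark, bright, bright]
print(detect_cuts(video))  # prints [3]
```

Because `similarity` is a parameter, any other frame-to-frame measure can be substituted without changing the detection loop. Gradual transitions such as fades and dissolves spread the change over many frames, so a single per-pair threshold like this one would typically miss them; handling them requires a transition model of the kind the paper develops.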

Keywords: video indexing, video segmentation, shot transitions, animate vision





Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Massimiliano Albanese (1)
  • Angelo Chianese (1)
  • Vincenzo Moscato (1)
  • Lucio Sansone (1)

  1. Dipartimento di Informatica e Sistemistica, Università di Napoli “Federico II”, Italy
