Hybrid Rule-Based/Neural Approach for Segmentation of MPEG Compressed Video

Koprinska, Irena; Carrato, Sergio

doi:10.1023/A:1019940716250


Published: December 2002

Volume 18, pages 187–212, (2002)

Multimedia Tools and Applications

Irena Koprinska¹ &
Sergio Carrato²


Abstract

An approach is presented for segmenting video into shots and sub-shots directly in the MPEG compressed domain. It relies only on the macroblock coding modes and motion vectors of P and B frames. The system follows a two-pass scheme and has a hybrid rule-based/neural structure. A rough scan over the P frames locates potential shot boundaries, and a precise scan over the B frames in the respective neighborhoods then refines the solution. The rule-based module recognizes the “simpler” boundaries, while the neural part refines the decisions for the “complex” ones. The neural part is also used to distinguish dissolves from object and camera motion and to divide shots further into sub-shots. The experiments demonstrate high speed and classification accuracy without computationally expensive calculations or the need for many thresholds.
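The two-pass idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Frame` record, the function names, the thresholds (0.5, 0.8, 0.2) and the `classifier` callback standing in for the neural part are all hypothetical, chosen only for illustration. The sketch also uses macroblock coding-mode counts alone and ignores motion vectors. It assumes frames are given in display order and that a P frame with many intra-coded macroblocks hints at a boundary in the preceding run of B frames.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Frame:
    """Hypothetical per-frame summary of macroblock coding modes."""
    kind: str      # "I", "P", or "B"
    intra: int     # number of intra-coded macroblocks
    forward: int   # number of forward-predicted macroblocks
    backward: int  # number of backward-predicted macroblocks


def rough_scan(frames: List[Frame], intra_ratio: float = 0.5) -> List[int]:
    """Pass 1: return indices of P frames with a large share of intra-coded
    macroblocks (assumed here to signal a potential shot boundary)."""
    candidates = []
    for i, f in enumerate(frames):
        total = f.intra + f.forward + f.backward
        if f.kind == "P" and total > 0 and f.intra / total >= intra_ratio:
            candidates.append(i)
    return candidates


def refine(frames: List[Frame], p_index: int,
           classifier: Optional[Callable[[Frame], bool]] = None,
           clear: float = 0.8, unclear: float = 0.2) -> int:
    """Pass 2: scan the B frames just before the candidate P frame and
    return the index of the first frame assumed to belong to the new shot."""
    start = p_index
    while start > 0 and frames[start - 1].kind == "B":
        start -= 1
    for i in range(start, p_index):
        b = frames[i]
        predicted = b.forward + b.backward
        # Share of backward prediction; 0.5 (ambiguous) if nothing is predicted.
        ratio = b.backward / predicted if predicted else 0.5
        if ratio >= clear:
            return i  # simple case: decided by the rule alone
        if ratio > unclear and classifier is not None and classifier(b):
            return i  # ambiguous case: delegated to the classifier stand-in
    return p_index  # no B frame switched reference: boundary at the P frame


def segment(frames: List[Frame],
            classifier: Optional[Callable[[Frame], bool]] = None) -> List[int]:
    """Run both passes and return the estimated boundary indices."""
    return [refine(frames, p, classifier) for p in rough_scan(frames)]
```

Calling `segment(frames)` on a list of `Frame` records returns one index per candidate found in the rough scan. Only ambiguous B frames reach the classifier, which mirrors the division of labor between the rule-based and neural modules described above.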



Author information

Authors and Affiliations

School of Information Technologies, University of Sydney, Sydney, NSW, 2006, Australia
Irena Koprinska
Department of Electrical Engineering and Computer Science (D.E.E.I.), Image Processing Laboratory, University of Trieste, via Valerio 10, 34127, Trieste, Italy
Sergio Carrato



About this article

Cite this article

Koprinska, I., Carrato, S. Hybrid Rule-Based/Neural Approach for Segmentation of MPEG Compressed Video. Multimedia Tools and Applications 18, 187–212 (2002). https://doi.org/10.1023/A:1019940716250


Issue Date: December 2002
DOI: https://doi.org/10.1023/A:1019940716250
