
Hybrid Rule-Based/Neural Approach for Segmentation of MPEG Compressed Video

Multimedia Tools and Applications

Abstract

An approach for segmenting video into shots and sub-shots that works directly in the MPEG compressed domain is presented. It relies only on the macroblock coding modes and motion vectors of P and B frames. The system follows a two-pass scheme and has a hybrid rule-based/neural structure. A rough scan over the P frames locates the potential shot boundaries, and the solution is then refined by a precise scan over the B frames of the respective neighborhoods. The “simpler” boundaries are recognized by the rule-based module, while the decisions for the “complex” ones are refined by the neural part. The latter is also used to distinguish dissolves from object and camera motions and to further divide shots into sub-shots. The experiments demonstrate high speed and classification accuracy without computationally expensive calculations or the need for many thresholds.
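
The two-pass idea can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only abrupt cuts, omits the neural stage entirely, and assumes that per-frame macroblock counts have already been extracted from the bitstream. The `Frame` fields, the 0.6 threshold, and both decision rules are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Per-frame macroblock counts in display order (hypothetical layout)."""
    kind: str          # 'I', 'P' or 'B'
    intra: int = 0     # intra-coded macroblocks
    forward: int = 0   # forward-predicted macroblocks
    backward: int = 0  # backward-predicted macroblocks


def intra_ratio(f):
    total = f.intra + f.forward + f.backward
    return f.intra / total if total else 0.0


def rough_scan(frames, threshold=0.6):
    """Pass 1: flag P frames that are mostly intra-coded.

    Returns (previous_anchor_index, p_frame_index) pairs; the cut lies
    somewhere after the previous anchor, up to and including the P frame.
    """
    candidates = []
    prev_anchor = None
    for i, f in enumerate(frames):
        if f.kind == 'B':
            continue
        if f.kind == 'P' and prev_anchor is not None and intra_ratio(f) > threshold:
            candidates.append((prev_anchor, i))
        prev_anchor = i  # I and P frames both serve as anchors
    return candidates


def refine(frames, prev_anchor, p_index):
    """Pass 2: inspect the B frames between the two anchors.

    A B frame predicted mainly from the later anchor is assumed to belong
    to the new shot; the first such frame is reported as the cut position.
    """
    for i in range(prev_anchor + 1, p_index):
        f = frames[i]
        if f.kind == 'B' and f.backward > f.forward:
            return i
    return p_index  # no B frame switched: the new shot starts at the P frame


def detect_cuts(frames, threshold=0.6):
    return [refine(frames, a, p) for a, p in rough_scan(frames, threshold)]


# Made-up counts: the new shot starts at frame 5.
frames = [
    Frame('I', intra=100),
    Frame('B', forward=90, backward=10),
    Frame('B', forward=85, backward=15),
    Frame('P', intra=5, forward=95),
    Frame('B', forward=92, backward=8),
    Frame('B', forward=6, backward=94),
    Frame('P', intra=90, forward=10),
]
print(detect_cuts(frames))  # prints [5]
```

In this sketch only the anchor frames are examined on the first pass, so the B frames are read solely in the neighborhoods that pass 1 flags.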




Cite this article

Koprinska, I., Carrato, S. Hybrid Rule-Based/Neural Approach for Segmentation of MPEG Compressed Video. Multimedia Tools and Applications 18, 187–212 (2002). https://doi.org/10.1023/A:1019940716250


  • DOI: https://doi.org/10.1023/A:1019940716250
