Abstract
The explosive growth of audiovisual information in the last few years has made the development of advanced video modeling and management tools an urgent task. In this research, we investigate the use of a stratification approach to model the contextual information of video content as multi-layered strata. By judiciously choosing appropriate levels and types of strata, we are able to automate a major part of the strata extraction process. Using the strata as the basis, we have developed advanced functionalities to support flexible retrieval and content-based browsing of video. A prototype has been developed to support the whole process of video management, from strata extraction to indexing, retrieval, and browsing. The prototype was tested in the domain of news video, and the system was found to be effective.
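As a rough illustration of the idea of multi-layered strata, the sketch below treats each stratum as a labeled time interval on a named layer and answers a query by intersecting intervals across layers. It is a minimal sketch under stated assumptions, not the prototype's design: the class names, the layer names ("speaker", "location", "topic"), the labels, and the timings are all hypothetical and do not come from the article.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


@dataclass(frozen=True)
class Stratum:
    """One annotation on one layer, valid over [start, end) in seconds."""
    layer: str   # hypothetical layer name, e.g. "speaker"
    label: str   # hypothetical value on that layer, e.g. "anchor"
    start: float
    end: float


def _intersect(xs: List[Interval], ys: List[Interval]) -> List[Interval]:
    """Return the non-empty pairwise overlaps of two interval lists."""
    out = []
    for s1, e1 in xs:
        for s2, e2 in ys:
            s, e = max(s1, s2), min(e1, e2)
            if s < e:
                out.append((s, e))
    return sorted(out)


@dataclass
class StratifiedVideo:
    """A video described by overlapping strata rather than fixed segments."""
    strata: List[Stratum] = field(default_factory=list)

    def add(self, layer: str, label: str, start: float, end: float) -> None:
        if end <= start:
            raise ValueError("a stratum must have positive duration")
        self.strata.append(Stratum(layer, label, start, end))

    def context_at(self, t: float) -> Dict[str, List[str]]:
        """Collect the labels of every stratum covering time t, by layer."""
        ctx: Dict[str, List[str]] = {}
        for s in self.strata:
            if s.start <= t < s.end:
                ctx.setdefault(s.layer, []).append(s.label)
        return ctx

    def find(self, **criteria: str) -> List[Interval]:
        """Intervals where every layer=label criterion holds at once."""
        result = None
        for layer, label in criteria.items():
            spans = [(s.start, s.end) for s in self.strata
                     if s.layer == layer and s.label == label]
            result = spans if result is None else _intersect(result, spans)
        return sorted(result or [])


# Hypothetical annotations for a 90-second news clip.
video = StratifiedVideo()
video.add("speaker", "anchor", 0, 30)
video.add("speaker", "reporter", 30, 90)
video.add("location", "studio", 0, 40)
video.add("topic", "finance", 20, 90)

print(video.context_at(25))
print(video.find(location="studio", topic="finance"))
```

Because strata on different layers may overlap freely, the context of any moment is the union of the strata that cover it, and a multi-criterion query reduces to interval intersection. A real system would likely use an interval index instead of the quadratic scan used here.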
Cite this article
Chua, TS., Chen, L. & Wang, J. Stratification Approach to Modeling Video. Multimedia Tools and Applications 16, 79–97 (2002). https://doi.org/10.1023/A:1013293702591