Abstract
We present an effective technique for the automatic extraction, representation, and classification of digital video, together with a visual language for formulating queries over the semantic information that video contains. The extraction algorithm recovers motion information from a video sequence and is a low-cost extension to the motion compensation component of the MPEG compression algorithm. The visual language, called VEVA, is designed for querying multimedia information in general and video semantic information in particular. Unlike many other proposals, which concentrate on browsing the data, VEVA offers a complete set of capabilities for specifying relationships between image components and for formulating queries that search for objects, their motions, and their other associated characteristics. VEVA has been shown to be very expressive in this context, mainly because many types of multimedia information are inherently visual in nature.
Golshani, F., Dimitrova, N. A Language for Content-Based Video Retrieval. Multimedia Tools and Applications 6, 289–312 (1998). https://doi.org/10.1023/A:1009612532460