
Accessing Video Contents through Key Objects over IP


Abstract

To support content-based video database access over the Internet Protocol (IP), it is important to achieve the following objectives: (i) video query by a representative object (key object) or by some statistical characterization of the target contents, (ii) bandwidth-efficient browsing over IP, and (iii) scalable and user-centric video transmission over a heterogeneous, variable-bandwidth network. We present a video object extraction and scalable coding system designed to meet these objectives. In our system, key objects that are meaningful to video database users are generated via a human-computer interaction procedure and are tracked across frames. Given a key object, an algorithm classifies a subset of its video object planes (VOPs) as key VOPs. This subset forms the basis of a highly bandwidth-efficient base layer that supports activities such as browsing and query refinement. On top of the base layer, a number of enhancement layers can be defined to progressively increase the spatial and temporal resolutions of the retrieved video. Heterogeneous users can subscribe to different numbers of enhancement layers according to their own conditions, such as access authorization, available connection bandwidth, and quality preference.
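The last point, layer subscription driven by access authorization, available connection bandwidth, and quality preference, can be illustrated with a minimal sketch in Python. The example is not taken from the paper: the layer names, per-layer bitrates, authorization levels, and the greedy selection rule are all hypothetical, and are shown only as one plausible way a client might decide how many enhancement layers to subscribe to on top of the key-VOP base layer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str            # e.g. "base (key VOPs)", "enh-1 (temporal)"; illustrative only
    bitrate_kbps: float  # assumed per-layer bitrate, not specified in the paper
    min_auth_level: int  # assumed access-authorization requirement, hypothetical


def select_layers(layers: List[Layer], bandwidth_kbps: float, auth_level: int) -> List[Layer]:
    """Greedily subscribe to the base layer plus as many enhancement layers
    as the user's bandwidth budget and authorization level allow.
    Layers are assumed to be ordered: base layer first, then enhancements."""
    chosen: List[Layer] = []
    used = 0.0
    for layer in layers:
        if auth_level < layer.min_auth_level:
            break  # this and higher layers require authorization the user lacks
        if used + layer.bitrate_kbps > bandwidth_kbps:
            break  # the next layer would exceed the available bandwidth
        chosen.append(layer)
        used += layer.bitrate_kbps
    return chosen


if __name__ == "__main__":
    # Hypothetical layer stack: key-VOP base layer plus two enhancement layers.
    layers = [
        Layer("base (key VOPs)", 64, 0),
        Layer("enh-1 (temporal)", 128, 0),
        Layer("enh-2 (spatial)", 256, 1),
    ]
    # A low-bandwidth browsing client with no special authorization.
    for layer in select_layers(layers, bandwidth_kbps=200, auth_level=0):
        print(layer.name)
```

With a 200 kbps budget and no special authorization, the sketch subscribes to the key-VOP base layer and one temporal enhancement layer; a better-provisioned or more privileged client would simply take more layers from the same stack.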




Cite this article

Fan, J., Zhu, X., Najarian, K. et al. Accessing Video Contents through Key Objects over IP. Multimedia Tools and Applications 21, 75–96 (2003). https://doi.org/10.1023/A:1025086200838

