Multimedia Tools and Applications, Volume 20, Issue 2, pp 135–158

Framework for Synthesizing Semantic-Level Indices

  • Ankush Mittal
  • Loong-Fah Cheong


Extraction of syntactic features is a well-defined problem, which is why most content-based retrieval systems rely on them exclusively. However, semantic-level indices are more appealing to users because they are closer to the user's personal space. Most work at the semantic level is confined to a limited domain, as the features developed and employed there apply satisfactorily only to that particular domain. Scaling up such systems would inevitably result in a large number of features. Currently, no framework exists that can effectively integrate these features and furnish semantic-level indices.

The objective of this paper is to highlight some of the issues in the design of such a framework and to report on the status of its development. In our framework, a high-level index is constructed by synthesizing its large set of elemental features. From this large collection of features, an image/video class is characterized by automatically selecting only a few principal features. By properly mapping the constrained multi-dimensional feature space constituted by these principal features to the semantics of the data, it is feasible to construct high-level indices. The problem remains, however, of automatically identifying the principal, or meaningful, subset of features. This is done by means of a Bayesian Network that separates the data into cliques by training on pre-classified data. During training, the Bayesian Network associates each clique of data points in the multi-dimensional feature space with one of the classes; this association can later be used to evaluate the most probable class to which that partition of the feature space belongs. The framework requires neither normalization of the different features nor the aid of an expert knowledge base. It enables a stronger coupling between feature extraction and meaningful high-level indices, yet the coupling is sufficiently domain independent, as shown by the experiments. The experiments were conducted on real video consisting of seven diverse classes, and the results show the framework's superiority over some standard classification tools.

Keywords: content-based retrieval, syntactic features, Bayesian Network, semantic-level indices, meaningful-feature selection
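The pipeline outlined in the abstract (select a few principal features, then let a probabilistic model map regions of the feature space to classes) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a naive Bayes classifier, the simplest Bayesian Network, in place of the network described in the paper, and it assumes mutual information with the class label as the feature-selection criterion. The three features (motion, brightness, a noise channel), the bin edges, and the two class names are hypothetical. Each feature is discretized with its own bin edges, so features on different scales need no common normalization.

```python
import math
from collections import Counter, defaultdict


def discretize(value, edges):
    """Return the index of the bin that `value` falls into."""
    return sum(value >= e for e in edges)


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def select_features(rows, labels, k):
    """Rank features by mutual information with the class; keep the top k."""
    scores = [mutual_information([r[j] for r in rows], labels)
              for j in range(len(rows[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


class NaiveBayes:
    """Naive Bayes over discretized features (assumed stand-in model)."""

    def fit(self, rows, labels, features):
        self.features = features
        self.class_counts = Counter(labels)
        self.counts = defaultdict(Counter)  # (feature, class) -> bin counts
        self.values = defaultdict(set)      # feature -> bins seen in training
        for row, label in zip(rows, labels):
            for j in features:
                self.counts[(j, label)][row[j]] += 1
                self.values[j].add(row[j])
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / total)  # log prior
            for j in self.features:
                # Laplace-smoothed log likelihood of the observed bin.
                num = self.counts[(j, label)][row[j]] + 1
                den = n_label + len(self.values[j])
                score += math.log(num / den)
            if score > best_score:
                best, best_score = label, score
        return best


# Hypothetical per-feature bin edges: motion (0-100), brightness (0-1), noise.
EDGES = [[50.0], [0.5], [3.0]]


def to_bins(raw):
    """Discretize one raw feature vector using the per-feature edges."""
    return [discretize(v, e) for v, e in zip(raw, EDGES)]


# Toy pre-classified training clips (made-up values for illustration only).
clips = [(80, 0.2, 5), (90, 0.8, 1), (75, 0.3, 1), (85, 0.7, 5),
         (10, 0.2, 5), (20, 0.8, 1), (15, 0.3, 1), (5, 0.7, 5)]
labels = ["sports"] * 4 + ["news"] * 4

rows = [to_bins(c) for c in clips]
selected = select_features(rows, labels, k=1)
model = NaiveBayes().fit(rows, labels, selected)
prediction = model.predict(to_bins((70, 0.9, 2)))
```

On this toy data only the motion feature carries information about the class, so `selected` is `[0]` and the unseen high-motion clip is assigned to `"sports"`. A full Bayesian Network would additionally model dependencies between features, which this sketch deliberately ignores.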





Copyright information

© Kluwer Academic Publishers 2003
