Abstract
The ever-increasing amount of multimedia available online necessitates new techniques and methods that can overcome the semantic gap problem. This problem, which arises from major disparities between the inherent representational characteristics of multimedia and the semantic content sought by the user, has been a prominent research direction, addressed by a great number of semantic augmentation approaches originating from areas such as machine learning, statistics, and natural language processing. In this paper, we review several recently developed techniques that bring together the low-level representation of multimedia and its semantics in order to improve the efficiency of access and retrieval. We also present a distance-based discriminant analysis (DDA) method that defines the design of a basic building-block classifier for distinguishing among a selected number of semantic categories. In addition, we demonstrate how a set of DDA classifiers can be grouped into a hierarchical ensemble for prediction of an arbitrary set of semantic classes.
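The idea of grouping binary building-block classifiers into a hierarchical ensemble can be illustrated with a minimal Python sketch. The abstract does not specify how DDA works, so the sketch substitutes a simple nearest-centroid rule for each building block; this stand-in, the class name `HierarchyNode`, the halving strategy for splitting the class list, and the toy data are all assumptions made for illustration and are not the method of the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


class HierarchyNode:
    """One node of a binary class hierarchy (hypothetical illustration).

    Each internal node holds a two-way classifier that separates the
    left group of classes from the right group. A nearest-centroid rule
    is used here as a stand-in; it is NOT the DDA classifier.
    """

    def __init__(self, classes, data):
        self.classes = classes
        self.left = self.right = None
        if len(classes) > 1:
            mid = len(classes) // 2  # assumed split: halve the class list
            left_cls, right_cls = classes[:mid], classes[mid:]
            self.left_centroid = centroid(
                [p for c in left_cls for p in data[c]])
            self.right_centroid = centroid(
                [p for c in right_cls for p in data[c]])
            self.left = HierarchyNode(left_cls, data)
            self.right = HierarchyNode(right_cls, data)

    def predict(self, x):
        """Descend the hierarchy until a single class remains."""
        if self.left is None:
            return self.classes[0]
        go_left = dist(x, self.left_centroid) <= dist(x, self.right_centroid)
        return (self.left if go_left else self.right).predict(x)


# Toy 2-D feature vectors for three made-up semantic classes.
training_data = {
    "sky": [(0, 0), (1, 0), (0, 1)],
    "grass": [(10, 0), (11, 0), (10, 1)],
    "sand": [(10, 10), (11, 10), (10, 11)],
}
ensemble = HierarchyNode(["sky", "grass", "sand"], training_data)
print(ensemble.predict((9, 1)))
```

With K classes, a hierarchy of this kind needs K - 1 two-way classifiers, and a prediction consults only those on the path from the root to a leaf. How the classes are actually grouped in the paper is not stated in the abstract.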
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kosinov, S., Marchand-Maillet, S. (2004). Overview of Approaches to Semantic Augmentation of Multimedia Databases for Efficient Access and Content Retrieval. In: Nürnberger, A., Detyniecki, M. (eds) Adaptive Multimedia Retrieval. AMR 2003. Lecture Notes in Computer Science, vol 3094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25981-7_2
DOI: https://doi.org/10.1007/978-3-540-25981-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22163-0
Online ISBN: 978-3-540-25981-7
eBook Packages: Springer Book Archive