Abstract
Exploiting relevant textual information to improve visual content understanding and representation is an effective way to understand web image content in depth. However, image descriptions are usually imprecise at the semantic level because of noisy and redundant information in both the textual aspect (such as the surrounding text in HTML pages) and the visual aspect (such as intra-class diversity). This paper approaches the problem through association analysis of image content and presents a Bidirectional-Isomorphic Manifold Learning strategy that optimizes both the visual feature space and the textual space, in order to achieve a more accurate comprehension of image semantics and relationships. To carry out this optimization across two different models, Bidirectional-Isomorphic Manifold Learning uses a novel algorithm that unifies the adjustments in both models into one topological structure, a process called reversed manifold mapping. We also demonstrate its correctness and convergence from a mathematical perspective. The method is applied to image annotation and keyword correlation analysis. Two groups of experiments are conducted. The first is carried out on the Corel 5000 image database to validate the method's effectiveness by comparison with the state-of-the-art Generalized Manifold Ranking Based Image Retrieval and SVM. The second is carried out on a web-downloaded Flickr dataset with over 6,000 images to test the method's effectiveness in a real-world application. The results show that our model attains a significant improvement over state-of-the-art algorithms.
Acknowledgement
The work was supported in part by the National Science Foundation of China No. 61071180, and Key Program Grant of National Science Foundation of China No. 61133003.
Cite this article
Liu, X., Yao, H., Ji, R. et al. Bidirectional-isomorphic manifold learning at image semantic understanding & representation. Multimed Tools Appl 64, 53–76 (2013). https://doi.org/10.1007/s11042-011-0947-2
DOI: https://doi.org/10.1007/s11042-011-0947-2