Abstract
With the bag-of-visual-words image representation, text analysis methods such as pLSA and LDA can be applied to visual object clustering and classification. However, previous work used only a fixed visual vocabulary, formed by vector-quantizing SIFT-like region descriptors, so the learned visual topic models were tied to that fixed vocabulary. This paper presents a novel approach that clusters visual objects incrementally. Given a new batch of images, we first expand the visual vocabulary to include the new visual words, then adjust the object clustering model to absorb these words, and finally output the clustering result. We achieve this by adapting the incremental pLSA model, previously used for text analysis, to the visual domain. Experiments on images from seven categories in a dynamic environment demonstrate the feasibility and stability of the growing vocabulary tree, as well as the clustering performance.
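The three steps named in the abstract (expand the vocabulary, adjust the model, output clusters) can be illustrated with a minimal sketch. It is not the authors' incremental pLSA update: it assumes plain pLSA fitted by EM and simply warm-starts a re-fit after new word columns are appended. The growing vocabulary tree is not modeled; the number of new visual words is taken as given. All names (`plsa_em`, `expand_vocabulary`, `init_mass`) and the toy count matrices are hypothetical.

```python
import numpy as np

def plsa_em(counts, p_w_z, p_z_d, n_iter=50, eps=1e-12):
    """EM for pLSA on a (n_docs, n_words) count matrix.

    p_w_z: (n_topics, n_words), rows sum to 1, i.e. P(w|z)
    p_z_d: (n_docs, n_topics), rows sum to 1, i.e. P(z|d)
    """
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both tables from expected counts n(d,w) * P(z|d,w)
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_w_z, p_z_d

def expand_vocabulary(p_w_z, n_new_words, init_mass=0.01):
    """Append columns for new visual words (assumed initialization scheme).

    Each topic hands a small share `init_mass` of its probability to the
    new words, split evenly, so every row still sums to 1.
    """
    if n_new_words == 0:
        return p_w_z.copy()
    n_topics = p_w_z.shape[0]
    new_cols = np.full((n_topics, n_new_words), init_mass / n_new_words)
    expanded = np.hstack([p_w_z * (1.0 - init_mass), new_cols])
    return expanded / expanded.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_topics = 2

# Batch 1: six images described by histograms over four visual words (toy data).
batch1 = np.array([[5, 4, 0, 0], [6, 5, 0, 0], [4, 6, 0, 0],
                   [0, 0, 5, 5], [0, 0, 6, 4], [0, 0, 4, 6]], dtype=float)
p_w_z = rng.dirichlet(np.ones(batch1.shape[1]), size=n_topics)
p_z_d = rng.dirichlet(np.ones(n_topics), size=batch1.shape[0])
p_w_z, p_z_d = plsa_em(batch1, p_w_z, p_z_d)

# Batch 2: three new images whose histograms use two previously unseen words.
batch2 = np.array([[3, 0, 0, 0, 4, 5],
                   [0, 0, 2, 0, 5, 4],
                   [0, 1, 0, 0, 6, 3]], dtype=float)
n_new = batch2.shape[1] - batch1.shape[1]

# Step 1: expand the vocabulary (old histograms get zero counts for new words).
counts = np.vstack([np.pad(batch1, ((0, 0), (0, n_new))), batch2])
p_w_z = expand_vocabulary(p_w_z, n_new)

# Step 2: adjust the model, warm-starting from the previous parameters.
p_z_d = np.vstack([p_z_d, np.full((batch2.shape[0], n_topics), 1.0 / n_topics)])
p_w_z, p_z_d = plsa_em(counts, p_w_z, p_z_d)

# Step 3: assign each image to its most probable topic.
labels = p_z_d.argmax(axis=1)
print(labels)
```

Warm-starting keeps the topics learned from earlier batches as the starting point, which is the intuition behind an incremental update; a true incremental scheme would avoid revisiting all earlier images on every batch.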
Acknowledgements
This work was supported by the National High Technology Research and Development Program of China (No. 2008AA02Z310), the Shanghai Committee of Science and Technology (No. 08411951200, No. 08JG05002), the 973 Program (2009CB320901), and NLPR (09-4-1).
Cite this article
Fu, Z., Lu, H. & Li, W. Incremental visual objects clustering with the growing vocabulary tree. Multimed Tools Appl 56, 535–552 (2012). https://doi.org/10.1007/s11042-010-0616-x
DOI: https://doi.org/10.1007/s11042-010-0616-x