Abstract
The key problems in visual object classification are: learning discriminative features to distinguish between two or more visually similar categories (e.g. dogs and cats), modeling the variation of visual appearance among instances of the same class (e.g. Dalmatian and Chihuahua in the same category of dogs), and tolerating imaging distortion (e.g. 3D pose). In machine learning terminology these amount to within-class and between-class variance, but recent works have shown that such additional pieces of information, i.e. latent dependencies, are beneficial for the learning process. A latent attribute space was recently proposed and shown to capture the latent correlations between classes. Attributes can be annotated manually, but it is more tempting to extract them in an unsupervised manner. Clustering is one of the popular unsupervised approaches, and the recent literature introduces similarity measures that help to discover visual attributes by clustering. However, the latent attribute structure in real life is multi-relational, e.g. two different sports cars in different poses versus a sports car and a family car in the same pose: which attribute should dominate the similarity? Instead of clustering, a network (graph) containing multiple connections is a natural way to represent such multi-relational attributes between images. In light of this, we introduce an unsupervised framework for network construction based on pairwise visual similarities and experimentally demonstrate that the constructed network can be used to automatically discover multiple discrete (e.g. sub-classes) and continuous (e.g. pose change) latent attributes. Illustrative examples on public benchmark datasets verify that the proposed network effectively captures multiple relations between images in an unsupervised manner.
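The idea of turning pairwise visual similarities into a network can be illustrated with a minimal sketch. It is not the authors' construction method, which the abstract does not specify; it assumes a precomputed symmetric similarity matrix and uses a simple k-nearest-neighbour linking rule as a stand-in. The function name `build_similarity_network`, the parameter `k`, and the toy similarity values are all hypothetical.

```python
import numpy as np


def build_similarity_network(similarity, k=2):
    """Link every image to its k most similar images (assumed k-NN rule).

    `similarity` is an (n, n) matrix of pairwise similarity scores, where
    larger values mean more similar. Returns a symmetric boolean adjacency
    matrix of the resulting undirected network.
    """
    sim = np.asarray(similarity, dtype=float)
    n = sim.shape[0]
    adjacency = np.zeros((n, n), dtype=bool)
    for i in range(n):
        scores = sim[i].copy()
        scores[i] = -np.inf  # never link an image to itself
        neighbours = np.argsort(scores)[::-1][:k]  # indices of the k best scores
        adjacency[i, neighbours] = True
    # An edge exists if either endpoint selected the other.
    return adjacency | adjacency.T


# Toy similarities (hypothetical): images 0 and 1 look alike, as do 2 and 3.
similarity = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])
network = build_similarity_network(similarity, k=1)
```

Unlike a hard clustering, an image in such a network may keep links to several neighbourhoods at once, for example one neighbour sharing its sub-class and another sharing its pose, which is the multi-relational property the abstract argues for.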
Notes
- 1. The code can be downloaded from: https://bitbucket.org/kamarainen/imgalign/code.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Shokrollahi Yancheshmeh, F., Kämäräinen, JK., Chen, K. (2015). Discovering Multi-relational Latent Attributes by Visual Similarity Networks. In: Jawahar, C., Shan, S. (eds) Computer Vision - ACCV 2014 Workshops. ACCV 2014. Lecture Notes in Computer Science, vol 9010. Springer, Cham. https://doi.org/10.1007/978-3-319-16634-6_1
DOI: https://doi.org/10.1007/978-3-319-16634-6_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16633-9
Online ISBN: 978-3-319-16634-6
eBook Packages: Computer Science, Computer Science (R0)