Abstract
Encoding the essence of an object in terms of self-similarities between its parts is becoming a popular strategy in Computer Vision. This paper proposes a new similarity-based descriptor, dubbed the Structural Similarity Cross-Covariance Tensor, which encodes relations among different regions of an image in terms of cross-covariance matrices. These matrices are computed between low-level feature vectors extracted from pairs of regions. The new descriptor retains the advantages of the widely used covariance matrix descriptors [1] and extends their expressiveness from local similarities inside a region to structural similarities across multiple regions. Applied on top of HOG, the descriptor is tested on object and scene classification tasks with three datasets. The proposed method outperforms baseline HOG on all three datasets and yields a significant improvement over a recently proposed self-similarity descriptor on the two most challenging ones.
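To make the central building block concrete, the following is a minimal Python sketch of a sample cross-covariance matrix between the feature vectors of two regions, and of collecting such matrices over all region pairs. It is not the authors' implementation. The function names (`mean_vector`, `cross_covariance`, `similarity_tensor`), the pairing of samples by position in two equally sized regions, and the unbiased 1/(n-1) normalisation are assumptions made for illustration; the descriptor layout used in the paper may differ.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Matrix = List[List[float]]


def mean_vector(samples: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[k] for s in samples) / n for k in range(dim)]


def cross_covariance(region_a: Sequence[Vector],
                     region_b: Sequence[Vector]) -> Matrix:
    """Unbiased sample cross-covariance between two paired sets of vectors.

    Assumption (not from the paper): region_a[i] and region_b[i] are feature
    vectors taken at corresponding positions of two same-sized regions.
    """
    if len(region_a) != len(region_b):
        raise ValueError("regions must contain the same number of samples")
    n = len(region_a)
    if n < 2:
        raise ValueError("at least two samples are required")
    mu_a = mean_vector(region_a)
    mu_b = mean_vector(region_b)
    d_a, d_b = len(mu_a), len(mu_b)
    cov = [[0.0] * d_b for _ in range(d_a)]
    # Accumulate outer products of the centred feature vectors.
    for x, y in zip(region_a, region_b):
        for p in range(d_a):
            dx = x[p] - mu_a[p]
            for q in range(d_b):
                cov[p][q] += dx * (y[q] - mu_b[q])
    return [[c / (n - 1) for c in row] for row in cov]


def similarity_tensor(
    regions: Sequence[Sequence[Vector]],
) -> Dict[Tuple[int, int], Matrix]:
    """Cross-covariance matrix for every unordered pair of distinct regions."""
    return {
        (i, j): cross_covariance(regions[i], regions[j])
        for i in range(len(regions))
        for j in range(i + 1, len(regions))
    }
```

In this sketch each region is a list of low-level feature vectors (for example, per-cell HOG responses), and `similarity_tensor(regions)` returns one matrix per region pair, keyed by the pair of region indices. Calling `cross_covariance` on a region and itself reduces to the ordinary covariance matrix of that region, which is the local descriptor that the cross-region version generalises.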
References
1. Tuzel, O., Porikli, F., Meer, P.: Pedestrian detection via classification on Riemannian manifolds. IEEE Trans. PAMI, 1713–1727 (2008)
2. Tosato, D., Spera, M., Cristani, M., Murino, V.: Characterizing humans on Riemannian manifolds. IEEE Trans. PAMI, 2–15 (2013)
3. San Biagio, M., Crocco, M., Cristani, M., Martelli, S., Murino, V.: Heterogeneous Auto-Similarities of Characteristics (HASC): Exploiting relational information for classification. In: Proc. ICCV (2013)
4. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proc. ICCV, vol. 2, pp. 1150–1157 (1999)
5. Wang, X., Han, T.X., Yan, S.: An HOG-LBP human detector with partial occlusion handling. In: Proc. ICCV, pp. 32–39 (2009)
6. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proc. CVPR, vol. 1, pp. 886–893 (2005)
7. Martelli, S., Cristani, M., Bazzani, L., Tosato, D., Murino, V.: Joining feature-based and similarity-based pattern description paradigms for object detection. In: Proc. ICPR (2012)
8. Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. CVIU 106(1), 59–70 (2007)
9. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Tech. Rep. 7694, California Institute of Technology (2007)
10. Perina, A., Jojic, N.: Spring lattice counting grids: Scene recognition using deformable positional constraints. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 837–851. Springer, Heidelberg (2012)
11. Chang, C.-C., Lin, C.-J.: LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2, 27 (2011), http://www.csie.ntu.edu.tw/~cjlin/libsvm
12. Vedaldi, A., Gulshan, V., Varma, M., Zisserman, A.: Multiple kernels for object detection. In: Proc. ICCV, pp. 606–613 (2009)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
San Biagio, M., Martelli, S., Crocco, M., Cristani, M., Murino, V. (2013). Encoding Classes of Unaligned Objects Using Structural Similarity Cross-Covariance Tensors. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2013. Lecture Notes in Computer Science, vol 8258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41822-8_17
DOI: https://doi.org/10.1007/978-3-642-41822-8_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41821-1
Online ISBN: 978-3-642-41822-8
eBook Packages: Computer Science, Computer Science (R0)