Efficient Feature Coding Based on Auto-encoder Network for Image Classification

Xie, Guo-Sen; Zhang, Xu-Yao; Liu, Cheng-Lin

doi:10.1007/978-3-319-16865-4_41

Guo-Sen Xie⁵,
Xu-Yao Zhang⁵ &
Cheng-Lin Liu⁵

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 9003))

Included in the following conference series:

Asian Conference on Computer Vision

2171 Accesses
7 Citations

Abstract

Local descriptor coding is one crucial step in traditional Bag of Words (BoW) framework for image categorization. However, the slow coding speed of previous methods is one limitation for applications in large scale problems. Recently, neural network based models have been widely applied in various classification tasks. Using neural network models for descriptor coding is straightforward and efficient due to their fast forward propagation. In this paper, we propose to use the Auto-Encoder (AE) network as a local descriptor coding block, and further embed AE network in the BoW framework for the purpose of image classification. To make the hidden activities of AE network to be both selective and sparse, we add an efficient and effective regularization term into the learning process of AE network, which can promote sparsity of the hidden layer for each input descriptor as well as the selectivity for each hidden node. By incorporating the AE network coding with the BoW framework, we can achieve better results and faster speeds than other state-of-the-art feature coding methods on Caltech101, Scene15 and UIUC 8-Sports databases.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
We utilized lib-linear toolkit [32] in this paper for SVM training.

References

Csurka, G., Bray, C., Dance, C., Fan, L.: Visual categorization with bags of keypoints. In: ECCV, Workshop on Statistical Learning in Computer Vision, pp. 1–22 (2004)
Google Scholar
Yang, J., Yu, K., Gong, Y., Huang, T.S.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR, pp. 1794–1801 (2009)
Google Scholar
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T.S., Gong, Y.: Locality-constrained linear coding for image classification. In: CVPR, pp. 3360–3367 (2010)
Google Scholar
Liu, L., Lei, W., Liu, X.: In defense of soft-assignment coding. In: ICCV, pp. 2486–2493 (2011)
Google Scholar
Huang, Y., Huang, K., Yu, Y., Tan, T.: Salient coding for image classification. In: CVPR, pp. 1753–1760 (2011)
Google Scholar
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 91–110 (2004)
Article Google Scholar
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, vol. 1, pp. 886–893 (2005)
Google Scholar
Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part I. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006)
Chapter Google Scholar
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR, pp. 2169–2178 (2006)
Google Scholar
Huang, Y., Wu, Z., Wang, L., Tan, T.: Feature coding in image classification: A comprehensive study. IEEE Trans. Pattern Anal. Mach. Intell. 36, 493–506 (2014)
Article Google Scholar
van Gemert, J.C., Geusebroek, J.-M., Veenman, C.J., Smeulders, A.W.M.: Kernel codebooks for scene categorization. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part III. LNCS, vol. 5304, pp. 696–709. Springer, Heidelberg (2008)
Chapter Google Scholar
Gao, S., Tsang, I.W.H., Chia, L.T.: Laplacian sparse coding, hypergraph laplacian sparse coding, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 35, 92–104 (2013)
Article Google Scholar
Galleguillos, C., Rabinovich, A., Belongie, S.: Object categorization using co-occurrence, location and appearance. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)
Google Scholar
Morioka, N., Satoh, S.: Compact correlation coding for visual object categorization. In: ICCV, pp. 1639–1646 (2011)
Google Scholar
Su, Y., Jurie, F.: Visual word disambiguation by semantic contexts. In: ICCV, pp. 311–318 (2011)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1106–1114 (2012)
Google Scholar
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006)
Article MathSciNet Google Scholar
Goh, H., Thome, N., Cord, M., Lim, J.-H.: Unsupervised and supervised visual codes with restricted boltzmann machines. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part V. LNCS, vol. 7576, pp. 298–311. Springer, Heidelberg (2012)
Chapter Google Scholar
Sohn, K., Jung, D.Y., Lee, H., Hero, A.O.: Efficient learning of sparse, distributed, convolutional feature representations for object recognition. In: ICCV, pp. 2643–2650 (2011)
Google Scholar
Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Comput. 14, 1771–1800 (2002)
Article Google Scholar
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A.: Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408 (2010)
MathSciNet MATH Google Scholar
Rumelhart, D., Hintont, G., Williams, R.: Learning representations by back-propagating errors. Nature 323, 533–536 (1986)
Article Google Scholar
Ngiam, J., Koh, P.W., Chen, Z., Bhaskar, S.A., Ng, A.Y.: Sparse filtering. In: NIPS, pp. 1125–1133 (2011)
Google Scholar
Wu, Z., Huang, Y., Wang, L., Tan, T.: Group encoding of local features in image classification. In: ICPR, pp. 1505–1508 (2012)
Google Scholar
Perronnin, F., Sánchez, J., Mensink, T.: Improving the fisher kernel for large-scale image classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)
Chapter Google Scholar
Perronnin, F., Dance, C.R.: Fisher kernels on visual vocabularies for image categorization. In: CVPR (2007)
Google Scholar
Zhou, X., Yu, K., Zhang, T., Huang, T.S.: Image classification using super-vector coding of local image descriptors. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 141–154. Springer, Heidelberg (2010)
Chapter Google Scholar
Yu, K., Zhang, T., Gong, Y.: Nonlinear learning using local coordinate coding. In: NIPS, pp. 2223–2231 (2009)
Google Scholar
Lehky, S.R., Sejnowski, T.J., Desimone, R.: Selectivity and sparseness in the responses of striate complex cells. Vis. Res. 45, 57–73 (2005)
Article Google Scholar
Ng, A.: Sparse autoencoder. CS294 A Lecture notes (2011)
Google Scholar
Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer-Verlag New York Inc., New York (1995)
Book Google Scholar
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: Liblinear: A library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)
MATH Google Scholar
Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples an incremental bayesian approach tested on 101 object categories. In: Proceedings of the Workshop on Generative-Model Based Vision (2004)
Google Scholar
Li, L.J., Li, F.F.: What, where and who? classifying events by scene and object recognition. In: ICCV, pp. 1–8 (2007)
Google Scholar
Jiang, Z., Lin, Z., Davis, L.S.: Label consistent k-svd: Learning a discriminative dictionary for recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35, 2651–2664 (2013)
Article Google Scholar

Download references

Acknowledgement

This work has been supported in part by the National Basic Research Program of China (973 Program) Grant 2012CB316302 and the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant XDA06040102).

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Guo-Sen Xie, Xu-Yao Zhang & Cheng-Lin Liu

Authors

Guo-Sen Xie
View author publications
You can also search for this author in PubMed Google Scholar
Xu-Yao Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Cheng-Lin Liu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Guo-Sen Xie .

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Xie, GS., Zhang, XY., Liu, CL. (2015). Efficient Feature Coding Based on Auto-encoder Network for Image Classification. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science(), vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_41

Download citation

DOI: https://doi.org/10.1007/978-3-319-16865-4_41
Published: 16 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics