Fast Approximations to Structured Sparse Coding and Applications to Object Classification
Abstract
We describe a method for fast approximation of sparse coding. A given input vector is passed through a binary tree. Each leaf of the tree contains a subset of dictionary elements. Only the coefficients corresponding to these dictionary elements are allowed to be nonzero, and their values are calculated quickly by multiplication with a precomputed pseudoinverse. The tree parameters, the dictionary, and the subsets of the dictionary corresponding to each leaf are learned. In the process of describing this algorithm, we discuss the more general problem of learning the groups in group structured sparse modeling. We show that our method creates good sparse representations by using it in the object recognition framework of [1,2]. With our own fast implementation of the SIFT descriptor, the whole system runs at 20 frames per second on 321 × 481 images on a laptop with a quad-core CPU, while sacrificing very little accuracy on the Caltech 101, Caltech 256, and 15 scenes benchmarks.
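The encoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how each tree node routes its input, so the hyperplane test `w @ x - b` used here is an assumption, and the names `Node`, `Leaf`, and `encode` are hypothetical. The dictionary, the tree, and the leaf subsets are random or hand-picked below, whereas in the paper all three are learned.

```python
import numpy as np


class Node:
    """Internal node. Routing by the sign of w @ x - b is an assumed split rule."""

    def __init__(self, w, b, left, right):
        self.w, self.b, self.left, self.right = w, b, left, right


class Leaf:
    """Leaf holding the indices of its active atoms and their pseudoinverse."""

    def __init__(self, D, idx):
        self.idx = np.asarray(idx)
        # Precomputed once, so encoding needs a single small matrix-vector product.
        self.pinv = np.linalg.pinv(D[:, self.idx])  # shape (len(idx), d)


def encode(root, x, n_atoms):
    """Descend the tree, then solve least squares on the leaf's atoms only."""
    node = root
    while isinstance(node, Node):
        node = node.left if node.w @ x - node.b < 0 else node.right
    z = np.zeros(n_atoms)
    z[node.idx] = node.pinv @ x  # all other coefficients stay exactly zero
    return z


# Toy setup: random unit-norm dictionary and a depth-1 tree (illustrative only).
rng = np.random.default_rng(0)
d, K = 8, 16
D = rng.standard_normal((d, K))
D /= np.linalg.norm(D, axis=0)
root = Node(w=np.eye(d)[0], b=0.0, left=Leaf(D, [0, 1, 2]), right=Leaf(D, [3, 4, 5]))

x = rng.standard_normal(d)
z = encode(root, x, K)
print("active atoms:", np.flatnonzero(z))
print("reconstruction error:", np.linalg.norm(x - D @ z))
```

In this sketch the cost of encoding is one inner product per tree level plus one multiplication by a matrix with as many rows as the leaf has atoms, which is the reason a precomputed pseudoinverse makes the approximation fast.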
Keywords
Sparse Code, Dictionary Learn, Sift Descriptor, Fast Approximation, Group Lasso
References
- 1. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR 2006 (2006)
- 2. Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR 2009 (2009)
- 3. Olshausen, B., Field, D.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)
- 4. Aharon, M., Elad, M., Bruckstein, A.: K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing 54, 4311–4322 (2006)
- 5. Kavukcuoglu, K., Ranzato, M., LeCun, Y.: Fast inference in sparse coding algorithms with applications to object recognition. Technical Report CBLL-TR-2008-12-01, Computational and Biological Learning Lab, Courant Institute, NYU (2008)
- 6. Yang, J., Yu, K., Huang, T.: Efficient Highly Over-Complete Sparse Coding Using a Mixture Model. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 113–126. Springer, Heidelberg (2010)
- 7. Boureau, Y., Bach, F., LeCun, Y., Ponce, J.: Learning mid-level features for recognition. In: Proc. International Conference on Computer Vision and Pattern Recognition (CVPR 2010). IEEE (2010)
- 8. Boureau, Y., Roux, N.L., Bach, F., Ponce, J., LeCun, Y.: Ask the locals: multi-way local pooling for image recognition. In: International Conference on Computer Vision (2011)
- 9. Pati, Y.C., Rezaiifar, R., Krishnaprasad, P.S.: Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In: Proceedings of the 27th Annual Asilomar Conference on Signals, Systems, and Computers, pp. 40–44 (1993)
- 10. Jenatton, R., Mairal, J., Obozinski, G., Bach, F.: Proximal methods for sparse hierarchical dictionary learning. In: International Conference on Machine Learning, ICML (2010)
- 11. Kim, S., Xing, E.P.: Tree-guided group lasso for multi-task regression with structured sparsity. In: ICML, pp. 543–550 (2010)
- 12. Jacob, L., Obozinski, G., Vert, J.P.: Group lasso with overlap and graph lasso. In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, pp. 433–440. ACM, New York (2009)
- 13. Baraniuk, R.G., Cevher, V., Duarte, M.F., Hegde, C.: Model-Based Compressive Sensing (2009)
- 14. Lloyd, S.P.: Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 129–137 (1982)
- 15. Gilbert, A.C., Strauss, M.J., Tropp, J.A.: Simultaneous Sparse Approximation via Greedy Pursuit. IEEE Trans. Acoust. Speech Signal Process. 5, 721–724 (2005)
- 16. Tropp, J.: Topics in Sparse Approximation. PhD thesis, University of Texas at Austin, Computational and Applied Mathematics (2004)
- 17. Ostrovsky, R., Rabani, Y., Schulman, L., Swamy, C.: The effectiveness of Lloyd-type methods for the k-means problem. In: FOCS 2006 (2006)
- 18. Dasgupta, S., Freund, Y.: Random projection trees and low dimensional manifolds. In: STOC 2008 (2008)
- 19. Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing, pp. 518–529 (1999)
- 20. Huang, J., Zhang, T., Metaxas, D.N.: Learning with structured sparsity. In: ICML, p. 53 (2009)
- 21. Vidal, R.: Subspace clustering. IEEE Signal Processing Magazine 28, 52–68 (2011)
- 22. Wang, F., Lee, N., Sun, J., Hu, J., Ebadollahi, S.: In: Burgard, W., Roth, D. (eds.) AAAI. AAAI Press (2011)
- 23. Ramírez, I., Sprechmann, P., Sapiro, G.: Classification and clustering via dictionary learning with structured incoherence and shared features. In: CVPR, pp. 3501–3508 (2010)
- 24. Allard, W., Chen, G., Maggioni, M.: Multiscale geometric methods for data sets II: Geometric multi-resolution analysis. Applied and Computational Harmonic Analysis 32, 435–462 (2012)
- 25. Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Non-local sparse models for image restoration. In: International Conference on Computer Vision (2009)
- 26. Gregor, K., LeCun, Y.: Learning fast approximations of sparse coding. In: International Conference on Machine Learning, ICML (2010)
- 27. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research 9, 1871–1874 (2008)
- 28. Fei-Fei, L., Fergus, R., Perona, P.: Caltech 101, http://www.vision.caltech.edu/Image_Datasets/Caltech101/
- 29. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology (2007)
- 30. Lazebnik, S., Schmid, C., Ponce, J., Li, F., Oliva, A.: 15 scenes, http://www-cvr.ai.uiuc.edu/ponce_grp/data/
- 31. Mairal, J.: SPAMS sparse coding toolbox, http://www.di.ens.fr/willow/SPAMS/
- 32. Gao, S., Tsang, I.W.-H., Chia, L.-T., Zhao, P.: Local features are not lonely - Laplacian sparse coding for image classification. In: CVPR 2010 (2010)