Learning Sparse Features with an Auto-Associator

Part of the book series: Studies in Computational Intelligence (SCI, volume 557)

Abstract

A major issue in statistical machine learning is the design of a representation, or feature space, that facilitates the learning task at hand. Sparse representations are particularly well suited to discriminant learning: on the one hand, they are robust to noise; on the other hand, they disentangle the factors of variation that are mixed up in dense representations, favoring the separability and interpretation of data. This chapter focuses on auto-associators (AAs), i.e., multi-layer neural networks trained to encode and decode the data, thereby de facto defining a feature space. AAs, first investigated in the 1980s, have recently been reconsidered as building blocks for deep neural networks. This chapter surveys related work on building sparse representations and presents a new non-linear explicit sparse representation method, referred to as the Sparse Auto-Associator (SAA), which integrates a sparsity objective within the standard auto-associator learning criterion. A comparative empirical validation of SAAs on state-of-the-art handwritten digit recognition benchmarks shows that SAAs outperform standard auto-associators in terms of classification performance and yield results similar to those of denoising auto-associators. Furthermore, SAAs make it possible to control the representation size to some extent, through a conservative pruning of the feature space.
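
For illustration, the sketch below shows one simple way an auto-associator can be trained with a sparsity term added to its reconstruction objective. It is a minimal NumPy sketch assuming a squared reconstruction error plus an L1 penalty on the hidden code, sigmoid units, untied weights, and plain gradient descent; the chapter's actual SAA criterion and sparsification scheme may differ in their details.

```python
# Minimal sketch of an auto-associator with a sparsity term in its training
# criterion (illustrative assumptions: squared reconstruction error, L1 penalty
# on the hidden code, sigmoid units, plain stochastic gradient descent).
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

class SparseAutoAssociator:
    def __init__(self, n_in, n_hidden, sparsity_weight=0.1, lr=0.1):
        self.W1 = rng.normal(0.0, 0.01, (n_in, n_hidden))   # encoder weights
        self.W2 = rng.normal(0.0, 0.01, (n_hidden, n_in))   # decoder weights
        self.b1 = np.zeros(n_hidden)
        self.b2 = np.zeros(n_in)
        self.lam = sparsity_weight
        self.lr = lr

    def encode(self, x):
        # Hidden code, i.e. the learned feature vector.
        return sigmoid(x @ self.W1 + self.b1)

    def train_step(self, x):
        # Forward pass: encode the input, then decode it back.
        h = self.encode(x)
        y = sigmoid(h @ self.W2 + self.b2)
        # Training criterion: reconstruction error + sparsity penalty on h.
        loss = np.mean((y - x) ** 2) + self.lam * np.mean(np.abs(h))
        # Backward pass: gradients of the criterion, sigmoid derivatives included.
        d_out = (2.0 * (y - x) / x.size) * y * (1.0 - y)
        d_hid = (d_out @ self.W2.T + self.lam * np.sign(h) / h.size) * h * (1.0 - h)
        self.W2 -= self.lr * np.outer(h, d_out)
        self.b2 -= self.lr * d_out
        self.W1 -= self.lr * np.outer(x, d_hid)
        self.b1 -= self.lr * d_hid
        return loss
```

Training then amounts to repeatedly calling train_step on input vectors (e.g., digit images scaled to [0, 1]); after training, encode(x) yields the sparse feature vector on which a classifier can be trained.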

Notes

  1. Original MNIST database: http://yann.lecun.com/exdb/mnist/.

  2. MNIST variants site: http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Public/MnistVariations.

  3. The probabilistic sparsification heuristic was also experimented with and found to yield similar results (omitted for the sake of brevity).

  4. All statistical tests are heteroscedastic (unequal-variance) two-sided t-tests. A difference is considered significant if the p-value is less than 0.001; an illustrative sketch of such a test is given after these notes.

  5. Complementary experiments, varying the pruning threshold in a range around 0, yield the same performance (results omitted for brevity).
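
As a purely illustrative sketch of the test protocol in note 4, the snippet below runs an unequal-variance (Welch) two-sided t-test with SciPy; the per-run accuracy values are hypothetical placeholders, not results from the chapter.

```python
# Illustrative only: Welch's two-sided t-test (heteroscedastic), as in note 4.
# The accuracy lists below are hypothetical placeholders, not chapter results.
from scipy import stats

acc_model_a = [0.972, 0.969, 0.974, 0.971, 0.970]  # hypothetical per-run accuracies
acc_model_b = [0.958, 0.961, 0.957, 0.960, 0.959]  # hypothetical per-run accuracies

t_stat, p_value = stats.ttest_ind(acc_model_a, acc_model_b, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.2e}, significant: {p_value < 0.001}")
```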

Acknowledgments

This work was supported by ANR (the French National Research Agency) as part of the ASAP project under grant ANR_09_EMER_001_04.

Author information

Corresponding author

Correspondence to Hélène Paugam-Moisy.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rebecchi, S., Paugam-Moisy, H., Sebag, M. (2014). Learning Sparse Features with an Auto-Associator. In: Kowaliw, T., Bredeche, N., Doursat, R. (eds) Growing Adaptive Machines. Studies in Computational Intelligence, vol 557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55337-0_4

  • DOI: https://doi.org/10.1007/978-3-642-55337-0_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55336-3

  • Online ISBN: 978-3-642-55337-0

  • eBook Packages: Engineering (R0)
