Abstract
The problem of embedding arises in many machine learning applications under the assumption that a small number of latent factors of variation suffices to capture the "semantics" of the original high-dimensional data. Most existing embedding algorithms are designed to preserve locality. In this study, inspired by the remarkable success of representation learning and deep learning, we propose a framework of embedding with autoencoder regularization (EAER for short), which naturally combines embedding with autoencoding. In this framework, the original data are embedded into a lower-dimensional space, represented by the output of the autoencoder's hidden layer, so that the resulting embeddings not only preserve locality but can also be easily mapped back to their original forms. This is guaranteed by jointly minimizing the embedding loss and the autoencoder reconstruction error. It is worth noting that, rather than operating in batch mode as most previous embedding algorithms do, the proposed framework learns an inductive embedding model and thus supports incremental embedding efficiently. To show the effectiveness of EAER, we adapt this joint learning framework to three canonical embedding algorithms and apply them to both synthetic and real-world data sets. The experimental results show that each EAER adaptation outperforms its original counterpart. Moreover, compared with existing incremental embedding algorithms, EAER performs incremental embedding more efficiently and effectively.
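The abstract describes the core objective as a joint minimization of a locality-preserving embedding loss and an autoencoder reconstruction error, with the hidden-layer output serving as the embedding. Below is a minimal PyTorch sketch of such a joint objective; the Laplacian-eigenmaps-style affinity term, the network shapes, and the trade-off weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of a joint objective of the form
#   L = embedding_loss + lam * reconstruction_error,
# as described in the abstract. All names and hyperparameters here are
# illustrative assumptions, not the paper's exact method.
import torch
import torch.nn as nn

class EAERSketch(nn.Module):
    def __init__(self, d_in, d_hid):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hid)   # hidden-layer output = embedding
        self.decoder = nn.Linear(d_hid, d_in)   # maps embeddings back to inputs

    def forward(self, x):
        z = torch.sigmoid(self.encoder(x))      # low-dimensional embedding
        x_hat = self.decoder(z)                 # reconstruction of the input
        return z, x_hat

def eaer_loss(x, z, x_hat, W, lam=0.1):
    """Locality-preserving embedding term plus weighted reconstruction error.

    W is an (n x n) pairwise affinity matrix (e.g., a k-NN heat kernel),
    so the first term pulls the embeddings of similar inputs together,
    in the spirit of Laplacian eigenmaps."""
    sq_dists = torch.cdist(z, z) ** 2           # pairwise ||z_i - z_j||^2
    embed_loss = (W * sq_dists).sum() / W.numel()
    recon_loss = ((x - x_hat) ** 2).mean()
    return embed_loss + lam * recon_loss

# Usage: one gradient step on a toy batch.
x = torch.randn(32, 100)                        # 32 points in 100 dimensions
W = torch.exp(-torch.cdist(x, x) ** 2)          # toy heat-kernel affinities
model = EAERSketch(d_in=100, d_hid=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

z, x_hat = model(x)
loss = eaer_loss(x, z, x_hat, W)
opt.zero_grad()
loss.backward()
opt.step()
```

Because the encoder is a parametric mapping, a new point can be embedded by a single forward pass; this is what makes the inductive, incremental embedding the abstract mentions cheap relative to batch spectral methods, which must re-solve an eigenproblem when data arrive.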
Cite this paper
Yu, W., Zeng, G., Luo, P., Zhuang, F., He, Q., Shi, Z. (2013). Embedding with Autoencoder Regularization. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2013. Lecture Notes in Computer Science, vol. 8190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40994-3_14
Print ISBN: 978-3-642-40993-6
Online ISBN: 978-3-642-40994-3