C. Aggarwal. Outlier analysis. Springer, 2017.
Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. NIPS Conference, pp. 153–160, 2007.
Y. Bengio, N. Le Roux, P. Vincent, O. Delalleau, and P. Marcotte. Convex neural networks. NIPS Conference, pp. 123–130, 2005.
Y. Bengio, J. Louradour, R. Collobert, and J. Weston. Curriculum learning. ICML Conference, 2009.
Y. Bengio, L. Yao, G. Alain, and P. Vincent. Generalized denoising auto-encoders as generative models. NIPS Conference, pp. 899–907, 2013.
C. M. Bishop. Training with noise is equivalent to Tikhonov regularization. Neural Computation, 7(1), pp. 108–116, 1995.
L. Breiman. Bagging predictors. Machine Learning, 24(2), pp. 123–140, 1996.
P. Bühlmann and B. Yu. Analyzing bagging. Annals of Statistics, pp. 927–961, 2002.
Y. Burda, R. Grosse, and R. Salakhutdinov. Importance weighted autoencoders. arXiv:1509.00519, 2015. https://arxiv.org/abs/1509.00519
N. Chawla, K. Bowyer, L. Hall, and W. Kegelmeyer. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, pp. 321–357, 2002.
J. Chen, S. Sathe, C. Aggarwal, and D. Turaga. Outlier detection with autoencoder ensembles. SIAM Conference on Data Mining, 2017.
Y. Chen and M. Zaki. KATE: K-Competitive Autoencoder for Text. ACM KDD Conference, 2017.
C. Doersch. Tutorial on variational autoencoders. arXiv:1606.05908, 2016. https://arxiv.org/abs/1606.05908
H. Drucker and Y. LeCun. Improving generalization performance using double backpropagation. IEEE Transactions on Neural Networks, 3(6), pp. 991–997, 1992.
J. Elman. Learning and development in neural networks: The importance of starting small. Cognition, 48(1), pp. 71–99, 1993.
D. Erhan, Y. Bengio, A. Courville, P. Manzagol, P. Vincent, and S. Bengio. Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 11, pp. 625–660, 2010.
Y. Freund and R. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Computational Learning Theory, pp. 23–37, 1995.
K. Greff, R. K. Srivastava, and J. Schmidhuber. Highway and residual networks learn unrolled iterative estimation. arXiv:1612.07771, 2016. https://arxiv.org/abs/1612.07771
L. K. Hansen and P. Salamon. Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), pp. 993–1001, 1990.
B. Hassibi and D. Stork. Second order derivatives for network pruning: Optimal brain surgeon. NIPS Conference, 1993.
T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning. Springer, 2009.
T. Hastie, R. Tibshirani, and M. Wainwright. Statistical learning with sparsity: the lasso and generalizations. CRC Press, 2015.
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
G. Hinton. To recognize shapes, first learn to generate images. Progress in Brain Research, 165, pp. 535–547, 2007.
G. Hinton, S. Osindero, and Y. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18(7), pp. 1527–1554, 2006.
G. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580, 2012. https://arxiv.org/abs/1207.0580
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8), pp. 1735–1780, 1997.
F. Khan, B. Mutlu, and X. Zhu. How do humans teach: On curriculum learning and teaching dimension. NIPS Conference, pp. 1449–1457, 2011.
D. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv:1312.6114, 2013. https://arxiv.org/abs/1312.6114
S. Kirkpatrick, C. Gelatt, and M. Vecchi. Optimization by simulated annealing. Science, 220, pp. 671–680, 1983.
R. Kohavi and D. Wolpert. Bias plus variance decomposition for zero-one loss functions. ICML Conference, 1996.
E. Kong and T. Dietterich. Error-correcting output coding corrects bias and variance. ICML Conference, pp. 313–321, 1995.
A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep convolutional neural networks. NIPS Conference, pp. 1097–1105, 2012.
Q. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, and A. Ng. On optimization methods for deep learning. ICML Conference, pp. 265–272, 2011.
Q. Le, W. Zou, S. Yeung, and A. Ng. Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. CVPR Conference, 2011.
Y. LeCun, J. Denker, and S. Solla. Optimal brain damage. NIPS Conference, pp. 598–605, 1990.
H. Lee, C. Ekanadham, and A. Ng. Sparse deep belief net model for visual area V2. NIPS Conference, 2008.
J. Ma, R. P. Sheridan, A. Liaw, G. E. Dahl, and V. Svetnik. Deep neural nets as a method for quantitative structure-activity relationships. Journal of Chemical Information and Modeling, 55(2), pp. 263–274, 2015.
A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey. Adversarial autoencoders. arXiv:1511.05644, 2015. https://arxiv.org/abs/1511.05644
H. Mobahi and J. Fisher. A theoretical analysis of optimization by Gaussian continuation. AAAI Conference, 2015.
A. Ng. Sparse autoencoder. CS294A Lecture notes, 2011. https://nlp.stanford.edu/~socherr/sparseAutoencoder_2011new.pdf (also available at https://web.stanford.edu/class/cs294a/sparseAutoencoder_2011new.pdf)
S. Nowlan and G. Hinton. Simplifying neural networks by soft weight-sharing. Neural Computation, 4(4), pp. 473–493, 1992.
B. Poole, J. Sohl-Dickstein, and S. Ganguli. Analyzing noise in autoencoders and deep networks. arXiv:1406.1831, 2014. https://arxiv.org/abs/1406.1831
M.'A. Ranzato, Y.-L. Boureau, and Y. LeCun. Sparse feature learning for deep belief networks. NIPS Conference, pp. 1185–1192, 2008.
A. Rasmus, M. Berglund, M. Honkala, H. Valpola, and T. Raiko. Semi-supervised learning with ladder networks. NIPS Conference, pp. 3546–3554, 2015.
S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative adversarial text to image synthesis. ICML Conference, pp. 1060–1069, 2016.
S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y. Bengio. Contractive auto-encoders: Explicit invariance during feature extraction. ICML Conference, pp. 833–840, 2011.
S. Rifai, Y. Dauphin, P. Vincent, Y. Bengio, and X. Muller. The manifold tangent classifier. NIPS Conference, pp. 2294–2302, 2011.
D. Rezende, S. Mohamed, and D. Wierstra. Stochastic backpropagation and approximate inference in deep generative models. arXiv:1401.4082, 2014. https://arxiv.org/abs/1401.4082
T. Sanger. Neural network learning control of robot manipulators using gradually increasing task difficulty. IEEE Transactions on Robotics and Automation, 10(3), 1994.
H. Schwenk and Y. Bengio. Boosting neural networks. Neural Computation, 12(8), pp. 1869–1887, 2000.
G. Seni and J. Elder. Ensemble methods in data mining: Improving accuracy through combining predictions. Morgan and Claypool, 2010.
J. Sietsma and R. Dow. Creating artificial neural networks that generalize. Neural Networks, 4(1), pp. 67–79, 1991.
K. Sohn, H. Lee, and X. Yan. Learning structured output representation using deep conditional generative models. NIPS Conference, 2015.
R. Solomonoff. A system for incremental learning based on algorithmic probability. Sixth Israeli Conference on Artificial Intelligence, Computer Vision and Pattern Recognition, pp. 515–527, 1994.
N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), pp. 1929–1958, 2014.
R. K. Srivastava, K. Greff, and J. Schmidhuber. Highway networks. arXiv:1505.00387, 2015. https://arxiv.org/abs/1505.00387
F. Strub and J. Mary. Collaborative filtering with stacked denoising autoencoders and sparse inputs. NIPS Workshop on Machine Learning for eCommerce, 2015.
A. Tikhonov and V. Arsenin. Solutions of ill-posed problems. Winston and Sons, 1977.
H. Valpola. From neural PCA to deep unsupervised learning. Advances in Independent Component Analysis and Learning Machines, pp. 143–171, Elsevier, 2015.
P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol. Extracting and composing robust features with denoising autoencoders. ICML Conference, pp. 1096–1103, 2008.
J. Walker, C. Doersch, A. Gupta, and M. Hebert. An uncertain future: Forecasting from static images using variational autoencoders. European Conference on Computer Vision, pp. 835–851, 2016.
L. Wan, M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus. Regularization of neural networks using dropconnect. ICML Conference, pp. 1058–1066, 2013.
S. Wang, C. Aggarwal, and H. Liu. Using a random forest to inspire a neural network and improving on it. SIAM Conference on Data Mining, 2017.
Y. Wu, C. DuBois, A. Zheng, and M. Ester. Collaborative denoising auto-encoders for top-n recommender systems. Web Search and Data Mining, pp. 153–162, 2016.
Z. Wu. Global continuation for distance geometry problems. SIAM Journal on Optimization, 7, pp. 814–836, 1997.
C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals. Understanding deep learning requires rethinking generalization. arXiv:1611.03530, 2016. https://arxiv.org/abs/1611.03530
Z.-H. Zhou. Ensemble methods: Foundations and algorithms. CRC Press, 2012.
Z.-H. Zhou, J. Wu, and W. Tang. Ensembling neural networks: many could be better than all. Artificial Intelligence, 137(1–2), pp. 239–263, 2002.
http://scikit-learn.org/
https://github.com/caglar/autoencoders
https://github.com/y0ast
https://github.com/fastforwardlabs/vae-tf/tree/master
https://archive.ics.uci.edu/ml/datasets.html
https://github.com/wiseodd/generative-models