Advanced Model Initialization Techniques

Yu, Dong; Deng, Li

doi:10.1007/978-1-4471-5779-3_5

Part of the book series: Signals and Communication Technology ((SCT))

Abstract

In this chapter, we introduce several advanced deep neural network (DNN) model initialization, or pretraining, techniques. These techniques played important roles in the early days of deep learning research and remain useful under many conditions. We focus on the following pretraining topics: the restricted Boltzmann machine (RBM), which is an interesting generative model in its own right; the deep belief network (DBN); the denoising autoencoder; and discriminative pretraining.

Author information

Authors and Affiliations

Microsoft Research, Bothell, USA
Dong Yu
Microsoft Research, Redmond, WA, USA
Li Deng

Corresponding author

Correspondence to Dong Yu.

About this chapter

Cite this chapter

Yu, D., Deng, L. (2015). Advanced Model Initialization Techniques. In: Automatic Speech Recognition. Signals and Communication Technology. Springer, London. https://doi.org/10.1007/978-1-4471-5779-3_5

DOI: https://doi.org/10.1007/978-1-4471-5779-3_5
Published: 12 November 2014
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5778-6
Online ISBN: 978-1-4471-5779-3
eBook Packages: Engineering, Engineering (R0)
