Fast Structure Learning for Deep Feedforward Networks via Tree Skeleton Expansion

Chen, Zhourong; Li, Xiaopeng; Tian, Zhiliang; Zhang, Nevin L.

doi:10.1007/978-3-030-29765-7_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11726)

Included in the following conference series:

European Conference on Symbolic and Quantitative Approaches with Uncertainty

Abstract

Despite the popularity of deep learning, structure learning for deep models remains a relatively under-explored area. In contrast, structure learning has been studied extensively for probabilistic graphical models (PGMs). In particular, an efficient algorithm has been developed for learning a class of tree-structured PGMs called hierarchical latent tree models (HLTMs), where there is a layer of observed variables at the bottom and multiple layers of latent variables on top. In this paper, we propose a simple unsupervised method for learning the structures of feedforward neural networks (FNNs) based on HLTMs. The idea is to expand the connections in the tree skeletons from HLTMs and to use the resulting structures for FNNs. Our method is very fast and it yields deep structures of virtually the same quality as those produced by the very time-consuming grid search method.
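
The expansion idea can be pictured with a toy sketch. This is only an illustration, not the authors' procedure: the abstract does not say how connections are chosen, so the cyclic rule in `expand_mask`, the function names, and the `extra` parameter are all hypothetical. One layer of a tree skeleton is encoded as a parent index per lower-layer unit, turned into a binary connectivity mask, and then widened with additional links.

```python
from typing import List

Mask = List[List[int]]


def tree_mask(parents: List[int], n_upper: int) -> Mask:
    """mask[i][j] == 1 iff lower unit i is linked to upper unit j in the tree."""
    mask = [[0] * n_upper for _ in parents]
    for child, parent in enumerate(parents):
        mask[child][parent] = 1
    return mask


def expand_mask(mask: Mask, extra: int) -> Mask:
    """Add `extra` links per lower unit.

    ASSUMPTION: the extra upper units are picked cyclically after the tree
    parent. This rule is a placeholder chosen for illustration only.
    """
    n_upper = len(mask[0])
    expanded = [row[:] for row in mask]
    for row_in, row_out in zip(mask, expanded):
        parent = row_in.index(1)  # each lower unit has exactly one tree parent
        for step in range(1, extra + 1):
            row_out[(parent + step) % n_upper] = 1
    return expanded


# Toy skeleton: 6 observed variables under 3 latent units.
parents = [0, 0, 1, 1, 2, 2]
skeleton = tree_mask(parents, n_upper=3)
expanded = expand_mask(skeleton, extra=1)
```

In a feedforward layer, such a mask would typically be multiplied elementwise with the weight matrix so that only the selected connections carry signal; stacking one mask per layer of the skeleton gives a sparse deep structure.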

Acknowledgments

Research on this article was supported by the Hong Kong Research Grants Council under grant 16212516.

Author information

Authors and Affiliations

The Hong Kong University of Science and Technology, Hong Kong, China
Zhourong Chen, Xiaopeng Li, Zhiliang Tian & Nevin L. Zhang

Corresponding authors

Correspondence to Zhourong Chen or Nevin L. Zhang.

Editor information

Editors and Affiliations

Technische Universität Dortmund, Dortmund, Germany
Gabriele Kern-Isberner
Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade, Serbia
Zoran Ognjanović

About this paper

Cite this paper

Chen, Z., Li, X., Tian, Z., Zhang, N.L. (2019). Fast Structure Learning for Deep Feedforward Networks via Tree Skeleton Expansion. In: Kern-Isberner, G., Ognjanović, Z. (eds) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2019. Lecture Notes in Computer Science, vol 11726. Springer, Cham. https://doi.org/10.1007/978-3-030-29765-7_23

DOI: https://doi.org/10.1007/978-3-030-29765-7_23
Published: 04 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29764-0
Online ISBN: 978-3-030-29765-7
eBook Packages: Computer Science, Computer Science (R0)
