Estimating and Factoring the Dropout Induced Distribution with Gaussian Mixture Model

  • Conference paper
  • In: Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation (ICANN 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11727)

Abstract

An analytical method is proposed to capture the dropout-induced distribution of a neural network's forward output as a Gaussian mixture model (GMM). In a dropout Bayesian DNN, when the network is trained with dropout and a test sample is dropout-forwarded at inference time, the output, usually approximated as a single-mode Gaussian, becomes a posterior whose variance quantifies the uncertainty of the inference [1]. The proposed method captures this distribution analytically and with high accuracy, without Monte Carlo (MC) sampling, for any network equipped with dropout and fully connected (FC) layers, even when the distribution is arbitrary in shape. It is therefore applicable to general non-Gaussian posteriors, yielding a better uncertainty estimate. The method also provides a multimodal analysis of the distribution by factoring it into components, tunable via a user-defined expressibility parameter, whereas an MC estimate provides only a "flat" picture. This helps to show how an FC layer encodes dropout-injected, highly multimodal data into a single-mode Gaussian, while unknown data produces a complicated distribution.
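
To make the abstract's setting concrete, below is a minimal sketch contrasting the two quantities it refers to: a Monte Carlo (MC) estimate of the dropout-induced output distribution, and the closed-form single-mode Gaussian moment matching of one dropout-plus-FC layer, in the spirit of [1, 6]. This illustrates the baseline the paper improves on, not the proposed GMM factoring; the function names and toy dimensions are hypothetical, and the moment formulas assume independent Bernoulli keep-masks applied to the layer input.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_fc(x, W, b, p_keep, rng):
    """One stochastic dropout-then-FC forward pass:
    each input unit is kept independently with probability p_keep."""
    mask = rng.random(x.shape) < p_keep
    return W @ (x * mask) + b

def mc_estimate(x, W, b, p_keep, n_samples, rng):
    """MC estimate of the dropout-induced output distribution:
    empirical mean and variance over repeated stochastic passes."""
    samples = np.stack([dropout_fc(x, W, b, p_keep, rng)
                        for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)

def gaussian_moments(x, W, b, p_keep):
    """Closed-form single-Gaussian moment matching of the same layer:
      E[y]   = p * (W @ x) + b
      Var[y] = p * (1 - p) * ((W**2) @ (x**2))
    assuming independent Bernoulli masks on the input units."""
    mean = p_keep * (W @ x) + b
    var = p_keep * (1.0 - p_keep) * ((W ** 2) @ (x ** 2))
    return mean, var

# Toy check: the analytical moments should agree with the MC estimate.
d_in, d_out, p_keep = 64, 8, 0.5
x = rng.normal(size=d_in)
W = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)
b = np.zeros(d_out)

mc_mean, mc_var = mc_estimate(x, W, b, p_keep, 20_000, rng)
an_mean, an_var = gaussian_moments(x, W, b, p_keep)
print("max |mean gap|:", np.abs(mc_mean - an_mean).max())
print("max |var  gap|:", np.abs(mc_var - an_var).max())
```

A single Gaussian matches the first two moments of one such layer, but it cannot represent the multimodal shapes that arise after further dropout layers and nonlinearities; capturing those analytically, as a factored GMM, is what the paper proposes in place of the "flat" MC histogram.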


References

  1. Gal, Y.: Uncertainty in Deep Learning. PhD thesis, University of Cambridge (2016)

  2. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

  3. Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of the 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 1050–1059. PMLR, New York (2016)

  4. Leibig, C., Allken, V., Berens, P., Wahl, S.: Leveraging uncertainty information from deep neural networks for disease detection. bioRxiv (2016)

  5. Louizos, C., Welling, M.: Multiplicative normalizing flows for variational Bayesian neural networks. In: Proceedings of the 34th International Conference on Machine Learning, ICML 2017, vol. 70, pp. 2218–2227. JMLR.org (2017)

  6. Wang, S.I., Manning, C.D.: Fast dropout training. In: Proceedings of the 30th International Conference on International Conference on Machine Learning, ICML 2013, vol. 28, pp. II-118-II-126. JMLR.org (2013)

  7. Tahir, M.H., Ghazali, S.S.A., Gilani, G.M.: On the variance of the sample mean from finite population, Approach III (2005)

  8. Wikipedia. Rectified Gaussian distribution – Wikipedia, the free encyclopedia. https://en.wikipedia.org/wiki/Rectified_Gaussian_distribution. Accessed 01 Jul 2019

  9. Manjunath, B.G., Wilhelm, S.: Moments calculation for the double truncated multivariate normal density. SSRN Electron. J. (2009)

  10. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, NIPS 2012, vol. 1, pp. 1097–1105. Curran Associates Inc., USA (2012)

  11. ImageNet. http://www.image-net.org/

  12. BVLC caffe AlexNet. https://github.com/BVLC/caffe/tree/master/models/bvlc_reference_caffenet

  13. The MNIST Database. http://yann.lecun.com/exdb/mnist/

  14. NotMNIST Dataset. https://www.kaggle.com/lubaroli/notmnist/

  15. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006)

  16. Hershey, J.R., Olsen, P.A.: Approximating the Kullback-Leibler divergence between Gaussian mixture models. In: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007). IEEE (2007)

  17. Daunizeau, J.: Semi-analytical approximations to statistical moments of sigmoid and softmax mappings of normal variables (2017)


Author information

Correspondence to Jingo Adachi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Adachi, J. (2019). Estimating and Factoring the Dropout Induced Distribution with Gaussian Mixture Model. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation. ICANN 2019. Lecture Notes in Computer Science, vol. 11727. Springer, Cham. https://doi.org/10.1007/978-3-030-30487-4_60

  • DOI: https://doi.org/10.1007/978-3-030-30487-4_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30486-7

  • Online ISBN: 978-3-030-30487-4

  • eBook Packages: Computer Science, Computer Science (R0)
