
Post-training Quantization of Deep Neural Network Weights

  • Conference paper
  • First Online:
Advances in Neural Computation, Machine Learning, and Cognitive Research III (NEUROINFORMATICS 2019)

Part of the book series: Studies in Computational Intelligence (SCI, volume 856)

Included in the conference series: NEUROINFORMATICS

Abstract

The paper considers weight quantization as a tool for reducing the size of an already trained neural network without retraining. We examine methods based on uniform and exponential weight quantization and compare the results. In addition, we demonstrate the quantization algorithm on three neural networks: VGG16, VGG19 and ResNet50.
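
For orientation (the full text is behind the access wall below), here is a minimal sketch of the two schemes the abstract names, uniform and exponential (logarithmic) weight quantization, applied post-training to a single weight array. The bit width, the clipping range taken from the observed min/max, and the sign handling are illustrative assumptions, not the authors' exact procedure.

import numpy as np

def uniform_quantize(w, bits=8):
    """Snap weights to 2**bits evenly spaced levels spanning [min(w), max(w)]."""
    lo, hi = float(w.min()), float(w.max())
    step = (hi - lo) / (2 ** bits - 1)
    codes = np.round((w - lo) / step)      # integer code per weight
    return codes * step + lo               # dequantized approximation

def exponential_quantize(w, bits=8):
    """Snap |w| to a logarithmic (base-2) grid and restore the sign."""
    sign = np.sign(w)
    mag = np.abs(w)
    mag = np.where(mag == 0, np.finfo(np.float32).tiny, mag)  # avoid log(0)
    log_lo, log_hi = np.log2(mag.min()), np.log2(mag.max())
    step = (log_hi - log_lo) / (2 ** bits - 1)
    codes = np.round((np.log2(mag) - log_lo) / step)
    return sign * np.exp2(codes * step + log_lo)

# Toy check on a random "layer": compare the reconstruction error of both schemes.
w = (0.05 * np.random.randn(10000)).astype(np.float32)
for name, fn in [("uniform", uniform_quantize), ("exponential", exponential_quantize)]:
    w_hat = fn(w, bits=4)
    print(f"{name:12s} mean abs error: {np.abs(w - w_hat).mean():.6f}")

Because trained weight distributions are usually concentrated near zero, a logarithmic grid places more levels where most weights lie, which is the usual motivation for comparing it against a uniform grid at low bit widths.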



Author information

Corresponding author

Correspondence to M. Yu. Malsagov.


Ethics declarations

This work was financially supported by the State Program of SRISA RAS No. 0065-2019-0003 (AAA-A19-119011590090-2).


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Khayrov, E.M., Malsagov, M.Y., Karandashev, I.M. (2020). Post-training Quantization of Deep Neural Network Weights. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research III. NEUROINFORMATICS 2019. Studies in Computational Intelligence, vol 856. Springer, Cham. https://doi.org/10.1007/978-3-030-30425-6_27
