CEG4N: Counter-Example Guided Neural Network Quantization Refinement

  • Conference paper
Software Verification and Formal Methods for ML-Enabled Autonomous Systems (NSV 2022, FoMLAS 2022)

Abstract

Neural networks are essential components of learning-based software systems. However, deploying neural networks in low-resource domains is challenging because of their high computing, memory, and power requirements. For this reason, neural networks are often quantized before deployment, but existing quantization techniques tend to degrade network accuracy. We propose Counter-Example Guided Neural Network Quantization Refinement (CEG4N). This technique combines search-based quantization and equivalence verification: the former minimizes the computational requirements, while the latter guarantees that the network’s output does not change after quantization. We evaluate CEG4N on a diverse set of benchmarks, including large and small networks. Our technique successfully quantizes the networks in our evaluation while producing models with up to 72% better accuracy than state-of-the-art techniques.

Notes

  1. The pre-trained weights for the ACAS Xu benchmarks can be found in the following repository: https://github.com/stanleybak/vnncomp2021.

References

  1. Abate, A., et al.: Sound and automated synthesis of digital stabilizing controllers for continuous plants. In: Frehse, G., Mitra, S. (eds.) Proceedings of the 20th International Conference on Hybrid Systems: Computation and Control, HSCC 2017, Pittsburgh, PA, USA, 18–20 April 2017, pp. 197–206. ACM (2017). https://doi.org/10.1145/3049797.3049802

  2. Abiodun, O.I., Jantan, A., Omolara, A.E., Dada, K.V., Mohamed, N.A., Arshad, H.: State-of-the-art in artificial neural network applications: a survey. Heliyon. 4(11), e00938 (2018). https://doi.org/10.1016/j.heliyon.2018.e00938, https://www.sciencedirect.com/science/article/pii/S2405844018332067

  3. Albarghouthi, A.: Introduction to neural network verification. arXiv:2109.10317 (2021)

  4. Bai, J., Lu, F., Zhang, K., et al.: ONNX: Open neural network exchange (2019). https://github.com/onnx/onnx

  5. Bak, S., Liu, C., Johnson, T.: The second international verification of neural networks competition (VNN-COMP 2021): Summary and results (2021)

  6. Bojarski, M., et al.: End to end learning for self-driving cars. arXiv:1604.07316 (2016)

  7. Kleine Büning, M., Kern, P., Sinz, C.: Verifying equivalence properties of neural networks with ReLU activation functions. In: Simonis, H. (ed.) CP 2020. LNCS, vol. 12333, pp. 868–884. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58475-7_50

  8. Charytanowicz, M., Niewczas, J., Kulczycki, P., Kowalski, P.A., Łukasik, S., Żak, S.: Complete gradient clustering algorithm for features analysis of x-ray images. In: Piȩtka, E., Kawa, J. (eds) Information Technologies in Biomedicine. AINSC, vol. 69, pp. 15–24. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13105-9_2

  9. Cheng, Y., Wang, D., Zhou, P., Zhang, T.: A survey of model compression and acceleration for deep neural networks. arXiv:1710.09282 (2017)

  10. Eleftheriadis, C., Kekatos, N., Katsaros, P., Tripakis, S.: On neural network equivalence checking using SMT solvers. arXiv:2203.11629 (2022)

  11. Esser, S.K., Appuswamy, R., Merolla, P., Arthur, J.V., Modha, D.S.: Backpropagation for energy-efficient neuromorphic computing. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28. Curran Associates, Inc. (2015). https://proceedings.neurips.cc/paper/2015/file/10a5ab2db37feedfdeaab192ead4ac0e-Paper.pdf

  12. Farabet, C., et al.: Large-scale FPGA-based convolutional networks (2011)

  13. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188 (1936)

  14. Gadelha, M.R., Menezes, R.S., Cordeiro, L.C.: ESBMC 6.1: automated test case generation using bounded model checking. Int. J. Softw. Tools Technol. Transf. 23(6), 857–861 (2020). https://doi.org/10.1007/s10009-020-00571-2

  15. Gadelha, M.R., Monteiro, F.R., Morse, J., Cordeiro, L.C., Fischer, B., Nicole, D.A.: ESBMC 5.0: an industrial-strength C model checker. In: 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 888–891 (2018). https://doi.org/10.1145/3238147.3240481

  16. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv:2103.13630 (2022)

  17. Han, S., Pool, J., Tran, J., Dally, W.J.: Learning both weights and connections for efficient neural network. arXiv:1506.02626 (2015)

  18. Hooker, S., Courville, A.C., Dauphin, Y., Frome, A.: Selective brain damage: measuring the disparate impact of model pruning. arXiv:1911.05248 (2019)

  19. Huang, X., et al.: Safety and trustworthiness of deep neural networks: a survey. arXiv:1812.08342 (2018)

  20. Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., Bengio, Y.: Quantized neural networks: training neural networks with low precision weights and activations. arXiv:1609.07061 (2017)

  21. IEEE: IEEE standard for floating-point arithmetic. IEEE Std. 754–2019 (Revision of IEEE 754–2008), pp. 1–84 (2019). https://doi.org/10.1109/IEEESTD.2019.8766229

  22. Jacob, B., et al.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. arXiv:1712.05877 (2017)

  23. Julian, K.D., Lopez, J., Brush, J.S., Owen, M.P., Kochenderfer, M.J.: Policy compression for aircraft collision avoidance systems. In: 2016 IEEE/AIAA 35th Digital Avionics Systems Conference (DASC), pp. 1–10 (2016). https://doi.org/10.1109/DASC.2016.7778091

  24. Kirchhoffer, H., et al.: Overview of the neural network compression and representation (NNR) standard. IEEE Trans. Circ. Syst. Video Technol. 1 (2021). https://doi.org/10.1109/TCSVT.2021.3095970

  25. Krishnamoorthi, R.: Quantizing deep convolutional networks for efficient inference: a whitepaper. arXiv:1806.08342 (2018)

  26. LeCun, Y., Cortes, C.: The MNIST database of handwritten digits (2005)

  27. Liu, C., Arnon, T., Lazarus, C., Barrett, C.W., Kochenderfer, M.J.: Algorithms for verifying deep neural networks. Found. Trends Optim. 4, 244–404 (2021)

  28. Narodytska, N., Kasiviswanathan, S.P., Ryzhyk, L., Sagiv, S., Walsh, T.: Verifying properties of binarized deep neural networks. In: AAAI (2018)

  29. Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019). http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf

  30. Teuber, S., Kleine Büning, M., Kern, P., Sinz, C.: Geometric path enumeration for equivalence verification of neural networks. In: 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI), November 2021. https://doi.org/10.1109/ICTAI52525.2021.00035

  31. Zhang, J., Zhou, Y., Saab, R.: Post-training quantization for neural networks with provable guarantees. arXiv:2201.11113 (2022)

Acknowledgment

The work is partially funded by EPSRC grant EP/T026995/1 entitled “EnnCore: End-to-End Conceptual Guarding of Neural Architectures” under Security for all in an AI-enabled society.

Author information

Corresponding author

Correspondence to João Batista P. Matos Jr.

A Appendices

A.1 Implementation of NNs in Python

The NNs were built and trained using the PyTorch library [29]. The weights of the trained models were then exported to the ONNX [4] format, which PyTorch can interpret and use to run predictions without any compromise in the NNs' performance.

A.2 Implementation of NN Abstract Models in C

In the present work, we use the C language to implement the abstract representation of the NNs. This allows us to explicitly model the NN operations in their original and quantized forms and to apply existing software verification tools (e.g., ESBMC [14]). The operational C-abstraction models perform double-precision arithmetic. However, we note that the original and quantized models diverge only in the precision of the weight and bias vectors embedded in the abstraction code.
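As a rough, hypothetical sketch of what such a C abstraction can look like (the function name, dimensions, and row-major weight layout below are illustrative, not taken from the CEG4N artifact), a fully connected layer with ReLU activation can be modelled as follows; the quantized model would reuse the same code with lower-precision weight and bias values embedded.

    #include <stddef.h>

    /* Minimal sketch: one fully connected layer with ReLU, evaluated in
     * double precision as described above. */
    void fc_relu_layer(const double *x, const double *W, const double *b,
                       double *y, size_t in_dim, size_t out_dim) {
      for (size_t i = 0; i < out_dim; ++i) {
        double acc = b[i];                    /* bias term */
        for (size_t j = 0; j < in_dim; ++j)
          acc += W[i * in_dim + j] * x[j];    /* weighted sum, row-major weights */
        y[i] = acc > 0.0 ? acc : 0.0;         /* ReLU activation */
      }
    }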

A.3 Encoding of Equivalence Properties

Suppose a NN \(F\), for which \(x \in \mathcal {H}\) is a safe input and \(y \in \mathcal {G}\) is the expected output for that input. We now show how one can specify the equivalence properties. For this example, consider that the function \(f\) produces the outputs of \(F\) in floating-point arithmetic, while \(f_q\) produces the outputs of \(F\) in fixed-point arithmetic (i.e., after quantization). First, the concrete NN input \(x\) is replaced by a non-deterministic one, which is achieved using the nondet_float command from ESBMC.

(Figures d–f: C code listings encoding the equivalence property; not reproduced here.)
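As a rough illustration only, the sketch below shows how such an equivalence check could be encoded for ESBMC. It assumes a single-input, single-output network and an absolute-tolerance equivalence criterion; the functions f and f_q, the input domain, and the tolerance are hypothetical and not taken from the paper's artifact (CEG4N may use a different criterion, e.g., top-1 classification equivalence).

    /* Hypothetical ESBMC harness sketching the equivalence check between the
     * original network f and its quantized counterpart f_q. */
    #include <assert.h>
    #include <math.h>

    float nondet_float(void);        /* non-deterministic value, provided by ESBMC */
    void __ESBMC_assume(_Bool cond); /* constrains the search space, provided by ESBMC */

    float f(float x);    /* C abstraction of the original (floating-point) network */
    float f_q(float x);  /* C abstraction of the quantized network */

    int main(void) {
      float x = nondet_float();                /* replace the concrete input by a symbolic one */
      __ESBMC_assume(x >= 0.0f && x <= 1.0f);  /* restrict x to the safe input domain H (assumed) */

      float y  = f(x);                         /* output of the original network */
      float yq = f_q(x);                       /* output of the quantized network */

      assert(fabsf(y - yq) <= 1e-3f);          /* equivalence property checked by the verifier */
      return 0;
    }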

A.4 Genetic Algorithm Parameter Definition

In Table 3, we report a summary of the experiments conducted to tune the parameters of the Genetic Algorithm (GA), more precisely, the number of generations. For example, a NN with 2 layers would require a brute-force algorithm to search \(52^2\) combinations of bit widths for the quantization. Similarly, a NN with 7 layers would require a brute-force search over \(52^7\) combinations of bit widths. We conducted a set of experiments in which we ran the GA one hundred times with the number of generations ranging from 50 to 1000. In addition, we fixed the population size to 5. From our findings, the GA needs about 100 to 110 generations per layer to find the optimal bit-width solution for each run.
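For intuition about these search-space sizes, the snippet below simply tabulates \(52^n\) for networks of 2 to 7 layers; it only illustrates the combinatorial growth that makes brute force infeasible and is not part of CEG4N.

    #include <stdio.h>
    #include <math.h>

    /* Illustration only: brute-force search-space size over per-layer bit
     * widths, assuming 52 candidate widths per layer as stated in A.4. */
    int main(void) {
      const double candidates = 52.0;
      for (int layers = 2; layers <= 7; ++layers)
        printf("%d layers: %.3g combinations\n", layers, pow(candidates, layers));
      return 0;
    }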

Table 3. Summary of experiments for tuning Genetic Algorithm Parameters.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Matos, J.B.P., Bessa, I., Manino, E., Song, X., Cordeiro, L.C. (2022). CEG4N: Counter-Example Guided Neural Network Quantization Refinement. In: Isac, O., Ivanov, R., Katz, G., Narodytska, N., Nenzi, L. (eds) Software Verification and Formal Methods for ML-Enabled Autonomous Systems. NSV 2022, FoMLAS 2022. Lecture Notes in Computer Science, vol 13466. Springer, Cham. https://doi.org/10.1007/978-3-031-21222-2_3

  • DOI: https://doi.org/10.1007/978-3-031-21222-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21221-5

  • Online ISBN: 978-3-031-21222-2
