Memory-Efficient High-Speed Implementation of Kyber on Cortex-M4

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11627)

Abstract

This paper presents an optimized software implementation of the module-lattice-based key-encapsulation mechanism Kyber for the ARM Cortex-M4 microcontroller. Kyber is one of the round-2 candidates in the NIST post-quantum project. At the center of our work are novel optimization techniques for the number-theoretic transform (NTT) inside Kyber, which make very efficient use of the computational power offered by the “vector” DSP instructions of the target architecture. We also present results for the recently updated parameter sets of Kyber, which equally benefit from our optimizations.
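To make the arithmetic inside an NTT concrete, the following is a minimal portable C sketch of one Cooley-Tukey butterfly built on signed Montgomery reduction. It is not the Cortex-M4 assembly described in this paper, which relies on the DSP instructions of the target; it only illustrates the modular operations such code has to perform. The modulus q = 3329 (the round-2 Kyber value) is an assumption here, and the names `KYBER_Q`, `QINV`, `montgomery_reduce`, and `ct_butterfly` are illustrative rather than taken from the authors' software.

```c
#include <stdint.h>

#define KYBER_Q 3329      /* assumed modulus (round-2 Kyber parameter sets) */
#define QINV    62209u    /* q^-1 mod 2^16, since 3329 * 62209 = 1 (mod 2^16) */

/* Signed Montgomery reduction: returns r with r = a * 2^-16 (mod q) and
 * -q < r < q, for |a| < q * 2^15. The narrowing cast to int16_t assumes the
 * usual two's-complement wraparound (as implemented by gcc and clang). */
static int16_t montgomery_reduce(int32_t a) {
    int16_t u = (int16_t)((uint32_t)a * QINV);   /* low 16 bits of a * q^-1 */
    int32_t t = a - (int32_t)u * KYBER_Q;        /* exact multiple of 2^16 */
    return (int16_t)(t / 65536);                 /* exact division, no rounding */
}

/* One Cooley-Tukey butterfly: (a, b) -> (a + zeta*b, a - zeta*b) mod q.
 * zeta_mont is the twiddle factor in Montgomery form (zeta * 2^16 mod q).
 * Outputs are not fully reduced; callers must bound coefficient growth. */
static void ct_butterfly(int16_t *a, int16_t *b, int16_t zeta_mont) {
    int16_t t = montgomery_reduce((int32_t)zeta_mont * *b);
    *b = (int16_t)(*a - t);
    *a = (int16_t)(*a + t);
}
```

A full transform applies this butterfly across log2(n) layers with precomputed twiddle factors. The sketch processes one coefficient pair per call and leaves out any packing of several 16-bit coefficients into one 32-bit register.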

As a result of our efforts, we present software that is 18% faster than an earlier implementation of Kyber optimized for the Cortex-M4 by the Kyber submitters. Our NTT is more than twice as fast as the NTT in that software. Our software runs at about the same speed as the latest speed-optimized implementation of Saber, the other module-lattice-based round-2 NIST PQC candidate. However, our Kyber software achieves this performance with a much smaller RAM footprint: Kyber needs less than half the RAM used by the considerably slower RAM-optimized version of Saber. Our software does not make use of any secret-dependent branches or memory accesses and thus offers state-of-the-art protection against timing attacks.
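As an illustration of what avoiding secret-dependent branches and memory accesses looks like in practice, here is a minimal C sketch of two common constant-time building blocks: a comparison without early exit and a masked conditional move. The function names `ct_differ` and `ct_cmov` are hypothetical and are not claimed to match the authors' code. Compilers can in principle reintroduce branches, so real implementations usually inspect the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 if the two buffers are equal and 1 otherwise.
 * Every byte is always processed, so timing does not reveal the
 * position of the first difference. */
static uint8_t ct_differ(const uint8_t *a, const uint8_t *b, size_t len) {
    uint8_t acc = 0;
    for (size_t i = 0; i < len; i++) {
        acc |= (uint8_t)(a[i] ^ b[i]);
    }
    /* Negating a nonzero value in 32-bit unsigned arithmetic sets the top bit. */
    return (uint8_t)((0u - (uint32_t)acc) >> 31);
}

/* Copies src into dst if move == 1 and leaves dst unchanged if move == 0.
 * The same loads and stores happen in both cases; move must be 0 or 1. */
static void ct_cmov(uint8_t *dst, const uint8_t *src, size_t len, uint8_t move) {
    uint8_t mask = (uint8_t)(0u - (uint32_t)move);   /* 0x00 or 0xFF */
    for (size_t i = 0; i < len; i++) {
        dst[i] ^= (uint8_t)(mask & (dst[i] ^ src[i]));
    }
}
```

A typical use in a CCA-secure KEM is to compare a re-encrypted ciphertext with the received one using `ct_differ` and then feed the result into `ct_cmov` to select between the real key material and a rejection value, so that neither step branches on secret data.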

This work has been supported by the European Commission through the ERC Starting Grant 805031 (EPOQUE). Date: May 10, 2019.

Notes

  1. See https://www.safecrypto.eu/pqclounge/round-2-candidates/.

  2. The second-round merger of NTRU-HRSS-KEM [19] and NTRUEncrypt [33].

  3. We also benchmarked our code using the February 2019 release of arm-none-eabi-gcc (8.3.0), which produced the same results.

  4. \(2n-1\) coefficients of 2 bytes each.

References

  1. Alagic, G., et al.: Status report on the first round of the NIST post-quantum cryptography standardization process. National Institute of Standards and Technology Internal Report 8240 (2019). https://doi.org/10.6028/NIST.IR.8240

  2. Alkim, E., et al.: NewHope: algorithm specification and supporting documentation. Submission to the NIST Post-Quantum Cryptography Standardization Project (2017). https://cryptojedi.org/papers/#newhopenist

  3. Alkim, E., Ducas, L., Pöppelmann, T., Schwabe, P.: Post-quantum key exchange – a new hope. In: Holz, T., Savage, S. (eds.) Proceedings of the 25th USENIX Security Symposium. USENIX Association (2016). https://eprint.iacr.org/2015/1092

  4. Alkim, E., Jakubeit, P., Schwabe, P.: NewHope on ARM Cortex-M. In: Carlet, C., Hasan, M.A., Saraswat, V. (eds.) SPACE 2016. LNCS, vol. 10076, pp. 332–349. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49445-6_19. http://cryptojedi.org/papers/#newhopearm

  5. Avanzi, R., et al.: ARM Cortex-M4 optimized implementation of Kyber. https://github.com/pq-crystals/kyber/tree/cm4/cm4. Accessed 07 Mar 2019

  6. Avanzi, R., et al.: CRYSTALS-Kyber: algorithm specification and supporting documentation. Submission to the NIST Post-Quantum Cryptography Standardization Project (2017). https://pq-crystals.org/kyber

  7. Avanzi, R., et al.: CRYSTALS-Kyber: algorithm specification and supporting documentation (version 2.0). Submission to the NIST Post-Quantum Cryptography Standardization Project (2019). https://pq-crystals.org/kyber

  8. Bertoni, G., Daemen, J., Peeters, M., Assche, G.V.: The Keccak reference. Submission to the NIST SHA-3 competition (round 3) (2011). https://keccak.team/files/Keccak-reference-3.0.pdf

  9. Bos, J.W., et al.: CRYSTALS – Kyber: a CCA-secure module-lattice-based KEM. In: 2018 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 353–367. IEEE (2018). https://eprint.iacr.org/2017/634

  10. Bos, J.W., Friedberger, S., Martinoli, M., Oswald, E., Stam, M.: Fly, you fool! Faster Frodo for the ARM Cortex-M4. Cryptology ePrint Archive, Report 2018/1116 (2018). https://eprint.iacr.org/2018/1116

  11. de Clercq, R., Roy, S.S., Vercauteren, F., Verbauwhede, I.: Efficient software implementation of ring-LWE encryption. In: Design, Automation & Test in Europe Conference & Exhibition, DATE 2015, pp. 339–344. EDA Consortium (2015). http://eprint.iacr.org/2014/725

  12. Cook, S.: On the Minimum Computation Time of Functions. Ph.D. thesis, Harvard University (1966)

  13. Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19(90), 297–301 (1965). https://www.jstor.org/stable/2003354

  14. Daemen, J., Hoffert, S., Peeters, M., Assche, G.V., Keer, R.V.: eXtended Keccak Code Package. https://github.com/XKCP/XKCP. Accessed 07 Mar 2019

  15. Fujisaki, E., Okamoto, T.: Secure integration of asymmetric and symmetric encryption schemes. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 537–554. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48405-1_34

  16. Goldstine, H.H.: A History of Numerical Analysis from the 16th through the 19th Century. Springer, New York (1977). https://doi.org/10.1007/978-1-4684-9472-3

  17. Güneysu, T., Oder, T., Pöppelmann, T., Schwabe, P.: Software speed records for lattice-based signatures. In: Gaborit, P. (ed.) PQCrypto 2013. LNCS, vol. 7932, pp. 67–82. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38616-9_5. Document ID: d67aa537a6de60813845a45505c313, http://cryptojedi.org/papers/#lattisigns

  18. Heideman, M.T., Johnson, D.H., Burrus, C.S.: Gauss and the history of the fast Fourier transform. IEEE ASSP Mag. 1(4) (1984). http://www.cis.rit.edu/class/simg716/Gauss_History_FFT.pdf

  19. Hülsing, A., Rijneveld, J., Schanck, J.M., Schwabe, P.: NTRU-KEM-HRSS: algorithm specification and supporting documentation. Submission to the NIST Post-Quantum Cryptography Standardization Project (2017). https://ntru-hrss.org

  20. Kannwischer, M.J., Rijneveld, J., Schwabe, P.: Faster multiplication in \(\mathbb{Z}_{2^m}[x]\) on Cortex-M4 to speed up NIST PQC candidates (2018). https://eprint.iacr.org/2018/1018

  21. Kannwischer, M.J., Rijneveld, J., Schwabe, P., Stoffelen, K.: PQM4: post-quantum crypto library for the ARM Cortex-M4. https://github.com/mupq/pqm4. Accessed 07 Mar 2019

  22. Karatsuba, A., Ofman, Y.: Multiplication of multidigit numbers on automata. Sov. Phys. Dokl. 7, 595–596 (1963). Translated from Doklady Akademii Nauk SSSR, vol. 145, no. 2, pp. 293–294, July 1962. Scanned version on http://cr.yp.to/bib/1963/karatsuba.html

  23. Karmakar, A., Mera, J.M.B., Roy, S.S., Verbauwhede, I.: Saber on ARM: CCA-secure module lattice-based key encapsulation on ARM. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2018(3), 243–266 (2018). https://eprint.iacr.org/2018/682

  24. Lyubashevsky, V., Seiler, G.: NTTRU: Truly fast NTRU using NTT. Cryptology ePrint Archive, Report 2019/040 (2019). https://eprint.iacr.org/2019/040

  25. Montgomery, P.L.: Modular multiplication without trial division. Math. Comput. 44(170), 519–521 (1985). http://www.ams.org/journals/mcom/1985-44-170/S0025-5718-1985-0777282-X/S0025-5718-1985-0777282-X.pdf

  26. National Institute of Standards and Technology: Submission requirements and evaluation criteria for the post-quantum cryptography standardization process (2017). https://csrc.nist.gov/csrc/media/projects/post-quantum-cryptography/documents/call-for-proposals-final-dec-2016.pdf

  27. Oder, T., Pöppelmann, T., Güneysu, T.: Beyond ECDSA and RSA: lattice-based digital signatures on constrained devices. In: 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC), pp. 1–6. ACM (2014). https://www.sha.rub.de/media/attachments/files/2014/06/bliss_arm.pdf

  28. Pöppelmann, T., Oder, T., Güneysu, T.: High-performance ideal lattice-based cryptography on 8-bit ATxmega microcontrollers. In: Lauter, K., Rodríguez-Henríquez, F. (eds.) LATINCRYPT 2015. LNCS, vol. 9230, pp. 346–365. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22174-8_19. Extended version, https://eprint.iacr.org/2015/382

  29. Saarinen, M.J.O., Bhattacharya, S., Garcia-Morchon, O., Rietman, R., Tolhuizen, L., Zhang, Z.: Shorter messages and faster post-quantum encryption with Round5 on Cortex M. Cryptology ePrint Archive, Report 2018/723 (2018). https://eprint.iacr.org/2018/723

  30. Seiler, G.: Faster AVX2 optimized NTT multiplication for Ring-LWE lattice cryptography. Cryptology ePrint Archive, Report 2018/039 (2018). https://eprint.iacr.org/2018/039

  31. Reference manual for STM32F405/415, STM32F407/417, STM32F427/437, and STM32F429/439 advanced ARM-based 32-bit MCUs (2019). https://www.st.com/resource/en/reference_manual/dm00031020.pdf

  32. Toom, A.L.: The complexity of a scheme of functional elements realizing the multiplication of integers. Sov. Math. Dokl. 3, 714–716 (1963). www.de.ufpe.br/~toom/my-articles/engmat/MULT-E.PDF

  33. Zhang, Z., Chen, C., Hoffstein, J., Whyte, W.: NTRUEncrypt: algorithm specification and supporting documentation. Submission to the NIST Post-Quantum Cryptography Standardization Project (2017). https://csrc.nist.gov/projects/post-quantum-cryptography/round-1-submissions

Acknowledgments

The authors would like to thank Pedro Massolino, Joost Rijneveld, and Ko Stoffelen for their help with obtaining reasonable cycle counts on the ARM Cortex-M4.

Author information

Corresponding authors

Correspondence to Leon Botros, Matthias J. Kannwischer, or Peter Schwabe.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Botros, L., Kannwischer, M.J., Schwabe, P. (2019). Memory-Efficient High-Speed Implementation of Kyber on Cortex-M4. In: Buchmann, J., Nitaj, A., Rachidi, T. (eds) Progress in Cryptology – AFRICACRYPT 2019. AFRICACRYPT 2019. Lecture Notes in Computer Science, vol 11627. Springer, Cham. https://doi.org/10.1007/978-3-030-23696-0_11

  • DOI: https://doi.org/10.1007/978-3-030-23696-0_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23695-3

  • Online ISBN: 978-3-030-23696-0

  • eBook Packages: Computer Science, Computer Science (R0)