Optimized RISC Architecture for Multiple-Precision Modular Arithmetic

  • Johann Großschädl
  • Guy-Armand Kamendje
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2802)


Public-key cryptosystems normally spend most of their execution time in a small fraction of the program code, typically in an inner loop. The performance of these critical code sections can be significantly improved by customizing the processor’s instruction set and microarchitecture. This paper shows the advantages of instruction set extensions to accelerate the processing of cryptographic workloads such as long integer modular arithmetic. We define two custom instructions for performing multiply-and-add operations on unsigned integers (single-precision words). Both instructions can be efficiently executed by a (32 × 32 + 32 + 32)-bit multiply/accumulate (MAC) unit. Thus, the proposed extensions are simple to integrate into standard 32-bit RISC cores like the MIPS32 4Km. We present an optimized assembly routine for fast multiple-precision multiplication with “finely” integrated Montgomery reduction (FIOS method). Simulation results demonstrate that the custom instructions double the processor’s arithmetic performance compared to a standard MIPS32 core.
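To make the inner loop concrete, the following is a minimal Python sketch of multiple-precision Montgomery multiplication in the FIOS style, built on a software model of a (32 × 32 + 32 + 32)-bit multiply/accumulate step. It is an illustration under stated assumptions, not the authors' assembly routine: the helper names (`mac`, `to_words`, `mont_mul_fios`) are hypothetical, and the use of two separate carry words, one per MAC chain, is a simplification chosen here so that every intermediate value fits in 64 bits.

```python
MASK = 0xFFFFFFFF  # one 32-bit single-precision word
W = 32             # word size in bits

def mac(a, b, c, d):
    """Software model of a (32 x 32 + 32 + 32)-bit multiply/accumulate step.

    For 32-bit inputs, a*b + c + d is at most 2**64 - 1, so the result
    always fits into a (high, low) pair of 32-bit words.
    """
    r = a * b + c + d
    return r >> W, r & MASK

def to_words(x, s):
    """Split integer x into s 32-bit words, least significant first."""
    return [(x >> (W * i)) & MASK for i in range(s)]

def mont_mul_fios(a, b, n, s):
    """Return a * b * R^-1 mod n with R = 2**(32*s).

    Assumptions (hypothetical interface): n is odd, a < n, b < n,
    and all three operands fit into s words.
    """
    aw, bw, nw = to_words(a, s), to_words(b, s), to_words(n, s)
    n0_inv = (-pow(nw[0], -1, 1 << W)) & MASK  # -n^-1 mod 2^32 (Python 3.8+)
    t = [0] * (s + 1)
    for i in range(s):
        # Word 0: accumulate a[0]*b[i], then pick m so the low word cancels.
        hi1, lo = mac(aw[0], bw[i], t[0], 0)
        m = (lo * n0_inv) & MASK
        hi2, _ = mac(m, nw[0], lo, 0)          # low word is zero by choice of m
        # Remaining words: multiplication and reduction interleaved per word.
        for j in range(1, s):
            hi1, lo = mac(aw[j], bw[i], t[j], hi1)
            hi2, lo = mac(m, nw[j], lo, hi2)
            t[j - 1] = lo                      # store shifted down by one word
        top = t[s] + hi1 + hi2
        t[s - 1] = top & MASK
        t[s] = top >> W
    result = sum(word << (W * k) for k, word in enumerate(t))
    return result - n if result >= n else result  # final conditional subtraction
```

The result can be checked against Python's built-in big-integer arithmetic, for example by comparing `mont_mul_fios(a, b, n, s)` with `(a * b * pow(1 << (32 * s), -1, n)) % n` for an odd modulus `n`. Each pass of the inner loop issues exactly two `mac` calls per word, which is the kind of operation a dedicated multiply-and-add instruction is meant to cover.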


Keywords: RSA Algorithm, Montgomery Multiplication, Finely Integrated Operand Scanning (FIOS), Multi-Application Smart Cards





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Johann Großschädl (1)
  • Guy-Armand Kamendje (1)
  1. Institute for Applied Information Processing and Communications, Graz University of Technology, Graz, Austria
