RNS-Based Embedded Processor Design

  • Pedro Miguens Matutino
  • Ricardo Chaves
  • Leonel Sousa


In this chapter, a unified architecture providing generic, programmable, and scalable Residue Number System (RNS) computation based on \(\{2^{n} \pm k_{i}\}\) moduli channels is described. This architecture allows the design of RNS with an arbitrarily long moduli set of the form \(\{2^{n} \pm k_{0}, \cdots, 2^{n} \pm k_{j}\}\), with \(j \in \mathbb{N}_{0}^{+}\). The considered moduli set allows the number of RNS channels to be increased arbitrarily, thereby either increasing the Dynamic Range (DR) or reducing the width of the channels. Narrower channels reduce the delay and area cost of the arithmetic operations and allow the RNS parallelism to be exploited further. The proposed RNS architecture provides not only a programmable processor capable of supporting a wide range of algorithms using RNS, but also a tool for researchers to evaluate new algorithms, moduli sets, and conversion approaches.
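As an illustration of the channel-wise arithmetic that such a moduli set permits, the following is a minimal software sketch in Python. It is not the chapter's hardware architecture: the function names (`make_moduli`, `to_rns`, `rns_add`, `rns_mul`, `from_rns`), the choice of n = 8 with offsets ±1 and ±3, and the use of the textbook Chinese Remainder Theorem for the reverse conversion are all assumptions made for this example only.

```python
from math import gcd, prod

def make_moduli(n, ks):
    """Build moduli 2**n + k for each signed offset k (hypothetical helper).

    RNS requires pairwise coprime moduli, so this is checked explicitly.
    """
    moduli = [(1 << n) + k for k in ks]
    for i, a in enumerate(moduli):
        for b in moduli[i + 1:]:
            if gcd(a, b) != 1:
                raise ValueError(f"moduli {a} and {b} are not coprime")
    return moduli

def to_rns(x, moduli):
    """Binary-to-RNS conversion: one residue per channel."""
    return [x % m for m in moduli]

def rns_add(a, b, moduli):
    """Channel-wise modular addition; channels are independent."""
    return [(x + y) % m for x, y, m in zip(a, b, moduli)]

def rns_mul(a, b, moduli):
    """Channel-wise modular multiplication; channels are independent."""
    return [(x * y) % m for x, y, m in zip(a, b, moduli)]

def from_rns(residues, moduli):
    """RNS-to-binary conversion using the textbook CRT (assumed method)."""
    M = prod(moduli)  # dynamic range of the moduli set
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

# Example moduli set {2^8 - 3, 2^8 - 1, 2^8 + 1, 2^8 + 3}
moduli = make_moduli(8, [-3, -1, 1, 3])
print(moduli)  # [253, 255, 257, 259]

x, y = 123456, 7890
a, b = to_rns(x, moduli), to_rns(y, moduli)
# Results are exact as long as they stay below the dynamic range.
print(from_rns(rns_add(a, b, moduli), moduli) == x + y)
print(from_rns(rns_mul(a, b, moduli), moduli) == x * y)
```

Adding further offsets to the list passed to `make_moduli` adds channels and so enlarges the dynamic range; alternatively, a smaller `n` with more channels can cover the same range using narrower residues.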


Computer arithmetic; Residue Number System (RNS); Generic modular arithmetic; Parallel computation; Binary-to-RNS conversion; RNS-to-binary conversion; Processor



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Pedro Miguens Matutino (1)
  • Ricardo Chaves (2)
  • Leonel Sousa (2)
  1. ISEL - IPL, INESC-ID, Lisboa, Portugal
  2. IST - Universidade de Lisboa, INESC-ID, Lisboa, Portugal
