RNS-Based Embedded Processor Design
Abstract
This chapter describes a unified architecture that provides generic, programmable, and scalable RNS computation based on moduli channels of the form \(\{2^{n} \pm k_{i}\}\). The architecture supports RNS designs with an arbitrarily long moduli set of the form \(\{2^{n} \pm k_{0}, \cdots, 2^{n} \pm k_{j}\}\), with \(j \in \mathbb{N}_{0}^{+}\). With this moduli set the number of RNS channels can be increased arbitrarily, either to enlarge the Dynamic Range (DR) or to narrow the channels for a given DR. Narrower channels reduce the delay and area cost of the arithmetic operations and allow the RNS parallelism to be exploited further. The proposed RNS architecture provides not only a programmable processor capable of supporting a wide range of algorithms using RNS, but also a tool for researchers to evaluate new algorithms, moduli sets, and conversion approaches.
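The relation between the number of channels, the channel width, and the DR can be illustrated with a small software model. The following is a minimal sketch, not the chapter's hardware architecture: the function names (`make_moduli`, `to_rns`, `rns_mul`, `from_rns`) and the choice \(n = 8\), \(k \in \{1, 3\}\) are assumptions made for illustration, and the reverse conversion uses the textbook Chinese Remainder Theorem rather than any converter proposed by the authors.

```python
from math import gcd, prod

def make_moduli(n, ks):
    """Build the moduli {2^n - k, 2^n + k} for each k in ks (illustrative helper)."""
    moduli = []
    for k in ks:
        moduli += [2**n - k, 2**n + k]
    # A valid RNS requires pairwise coprime moduli.
    for i, a in enumerate(moduli):
        for b in moduli[i + 1:]:
            if gcd(a, b) != 1:
                raise ValueError(f"moduli {a} and {b} are not coprime")
    return moduli

def to_rns(x, moduli):
    """Binary-to-RNS conversion: one residue per channel."""
    return [x % m for m in moduli]

def rns_mul(a, b, moduli):
    """Channel-wise multiplication; each channel is independent of the others."""
    return [(x * y) % m for x, y, m in zip(a, b, moduli)]

def from_rns(residues, moduli):
    """RNS-to-binary conversion using the Chinese Remainder Theorem."""
    M = prod(moduli)  # the dynamic range
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

# Four channels of at most 9 bits each: {255, 257, 253, 259}
moduli = make_moduli(8, [1, 3])
a, b = 1234, 5678
product = from_rns(rns_mul(to_rns(a, moduli), to_rns(b, moduli), moduli), moduli)
assert product == a * b  # holds because a * b is smaller than the dynamic range
```

In this sketch, four channels of at most 9 bits give a dynamic range just under \(2^{32}\). Passing more values of \(k\) to `make_moduli` adds channels, provided the resulting moduli remain pairwise coprime.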
Keywords
Computer arithmetic; Residue Number System (RNS); Generic modular arithmetic; Parallel computation; Binary-to-RNS conversion; RNS-to-binary conversion; Processor