RSA Public Key Acceleration on CUDA GPU
Cryptography applies number-theoretic mathematics to the key generation, encryption, and decryption of confidential information. It has many real-world applications, such as digital rights management, e-commerce, secret broadcasting, and financial cryptography. In this paper, we focus on speeding up the RSA public-key cryptosystem. We propose high-performance parallel RSA algorithms for parallel hardware such as Graphics Processing Units (GPUs). We use an NVIDIA Quadro FX 3800 GPU to exploit many-core parallelism and implement a highly parallel, efficient RSA algorithm on the Compute Unified Device Architecture (CUDA). Experiments conducted on many-core GPUs show that the proposed parallel RSA algorithms achieve a speedup over a single-CPU RSA implementation.
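The abstract does not show the proposed kernels, so the following is only a minimal sketch of the data-parallel structure that makes RSA a natural fit for many-core hardware: each message block is transformed by an independent modular exponentiation, so one block can be assigned to one worker. The function names (`mod_exp`, `rsa_blocks`), the thread pool standing in for GPU threads, and the toy key (n = 3233, e = 17, d = 2753) are illustrative assumptions, not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def mod_exp(base, exponent, modulus):
    """Right-to-left square-and-multiply: returns base**exponent mod modulus."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                     # current exponent bit is set
            result = (result * base) % modulus
        base = (base * base) % modulus       # square for the next bit
        exponent >>= 1
    return result


def rsa_blocks(blocks, exponent, modulus, workers=4):
    """Apply the RSA operation to every block independently.

    Hypothetical helper: the thread pool only illustrates that no block
    depends on another, which is the property a GPU mapping would exploit.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda m: mod_exp(m, exponent, modulus), blocks))


# Toy key for illustration only (p = 61, q = 53); real keys are far larger.
N, E, D = 3233, 17, 2753
plaintext = [65, 1234, 7, 3000]              # every block must be smaller than N
ciphertext = rsa_blocks(plaintext, E, N)     # encrypt: c = m^e mod n
recovered = rsa_blocks(ciphertext, D, N)     # decrypt: m = c^d mod n
```

In a CUDA setting the same per-block independence would typically be expressed by letting each thread handle one block with multi-precision arithmetic; that mapping is an assumption here and is not taken from the paper.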
Keywords: CUDA, GPU, RSA, Parallel algorithm
This work was financially supported by the TEQIP-II scheme (2011–2012) for technical quality improvement of the Ministry of Human Resource Development (MHRD), Government of India.