Abstract
Asymmetric cryptographic algorithms (e.g., RSA and ECC) have been implemented on Graphics Processing Units (GPUs) for several years. These implementations mainly exploit the highly parallel GPU architecture by porting integer-based algorithms designed for common CPUs to GPUs, and they offer high performance. However, the great potential cryptographic computing power of GPUs, especially that of their more powerful floating-point instructions, has not been comprehensively investigated. In this paper, we try to fully exploit the floating-point computing power of GPUs for RSA through several designs: a floating-point-based Montgomery multiplication algorithm, optimizations of the fundamental operations, and use of shuffle, the latest instruction for sharing data between threads. In our experiments on an NVIDIA GTX Titan, 2048-bit RSA decryption reaches a throughput of 38,975 operations per second, which is 2.21 times the performance of the fastest existing integer-based work and outperforms the previous floating-point-based implementation by a large margin.
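To illustrate what a floating-point-based Montgomery multiplication can look like, the following Python sketch stores every limb of a large integer in a double-precision float and performs a word-by-word Montgomery multiplication on those limbs. It is a sketch under stated assumptions, not the design evaluated in this paper: the 24-bit limb width, the CIOS-style loop structure, and all function names (`to_limbs`, `from_limbs`, `mont_mul`, `mont_mul_int`) are choices made here for illustration only. The limb width is picked so that every intermediate value stays below 2^53 and is therefore represented exactly in a double.

```python
import math

# Assumption of this sketch: 24-bit limbs, so a limb product is below 2**48
# and every accumulated sum stays below 2**53 (exact in a double).
W = 24
BASE = float(1 << W)

def to_limbs(x, s):
    """Split a non-negative int into s little-endian limbs stored as floats."""
    mask = (1 << W) - 1
    return [float((x >> (W * i)) & mask) for i in range(s)]

def from_limbs(limbs):
    """Recombine float limbs into a Python int."""
    return sum(int(v) << (W * i) for i, v in enumerate(limbs))

def mont_mul(a, b, n, n_prime, s):
    """Word-by-word Montgomery product on float limbs.

    Returns s + 1 limbs holding a value congruent to a*b*R^-1 mod n,
    with R = 2**(W*s); the value is below 2n when a, b < n.
    n_prime is -n^-1 mod 2**W, passed as a float.
    """
    t = [0.0] * (s + 2)
    for i in range(s):
        # Multiplication step: t += a[i] * b
        carry = 0.0
        for j in range(s):
            carry, t[j] = divmod(t[j] + a[i] * b[j] + carry, BASE)
        t[s + 1], t[s] = divmod(t[s] + carry, BASE)
        # Quotient digit that makes the lowest limb of t + q*n vanish
        q = math.fmod(t[0] * n_prime, BASE)
        # Reduction step: t = (t + q*n) / BASE, shifting limbs down by one
        carry, _ = divmod(t[0] + q * n[0], BASE)
        for j in range(1, s):
            carry, t[j - 1] = divmod(t[j] + q * n[j] + carry, BASE)
        carry, t[s - 1] = divmod(t[s] + carry, BASE)
        t[s] = t[s + 1] + carry
    return t[:s + 1]

def mont_mul_int(A, B, N):
    """Hypothetical helper: Montgomery product of ints A, B < N for odd N."""
    s = -(-N.bit_length() // W)                      # number of limbs
    n_prime = float((-pow(N, -1, 1 << W)) % (1 << W))
    t = mont_mul(to_limbs(A, s), to_limbs(B, s), to_limbs(N, s), n_prime, s)
    r = from_limbs(t)
    return r - N if r >= N else r                    # final conditional subtraction
```

Calling `mont_mul_int(A, B, N)` with an odd modulus `N` and operands below `N` returns `A*B*R^-1 mod N`, where `R = 2**(24*s)` and `s` is the number of limbs. On a GPU, the inner loops would be distributed across threads and the final integer conversion would be replaced by limb-level arithmetic; both are outside the scope of this sketch.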
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zheng, F., Pan, W., Lin, J., Jing, J., Zhao, Y. (2014). Exploiting the Floating-Point Computing Power of GPUs for RSA. In: Chow, S.S.M., Camenisch, J., Hui, L.C.K., Yiu, S.M. (eds) Information Security. ISC 2014. Lecture Notes in Computer Science, vol 8783. Springer, Cham. https://doi.org/10.1007/978-3-319-13257-0_12
DOI: https://doi.org/10.1007/978-3-319-13257-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13256-3
Online ISBN: 978-3-319-13257-0
eBook Packages: Computer Science, Computer Science (R0)