Speeding up the Montgomery Exponentiation with CMM-SDR Over GPU with Maxwell and Pascal Architecture

Wong, Xian-Fu; Goi, Bok-Min; Lee, Wai-Kong; Phan, Raphael C.-W.

doi:10.1007/978-981-10-5281-1_9

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 425)

Included in the following conference series:

International Conference on Mobile and Wireless Technology

Abstract

RSA is widely used to protect the key exchange between two parties in secure mobile and wireless communication. Modular exponentiation is the main operation in RSA, and it is very time consuming when the operand size is large, typically 1024 to 4096 bits. RSA speed becomes a concern when a server must handle thousands or millions of authentication requests at a time from a massive number of connected mobile and wireless devices. The performance of RSA can be improved by using a parallel computing architecture or by enhancing the existing modular exponentiation algorithm. In this paper, we exploit the massively parallel architecture of the GPU to perform RSA computations. We propose various optimization techniques to achieve higher RSA throughput on two GPU platforms, and we incorporate signed-digit recoding to further improve performance. To allow a fair comparison with existing implementation techniques, we propose evaluating speed in the best case (least ‘0’ in exponent bits), the average case (random exponent bits) and the worst case (all ‘1’ in exponent bits). The overall throughput achieved by our implementation is about 12% higher for random exponent bits and 50% higher for all-‘1’ exponent bits than the implementation without signed-digit recoding. Our implementation achieves 17713 and 89043 1024-bit modular exponentiations per second on random exponent bits on the GTX 960M and GTX 1080, respectively, which represent two state-of-the-art GPU architectures.
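To illustrate why signed-digit recoding helps Montgomery exponentiation, here is a minimal Python sketch. It is not the authors' CMM-SDR GPU kernel. It assumes the non-adjacent form (NAF) as the recoding, an odd modulus n, and a base that is invertible modulo n so that a digit of -1 can be handled by multiplying with the inverse. All function names are hypothetical, and `pow(x, -1, n)` requires Python 3.8 or later.

```python
def montgomery_setup(n, k):
    """Precompute constants for Montgomery arithmetic modulo odd n, with R = 2**k > n."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime ≡ -1 (mod R)
    r2 = (r * r) % n                 # maps ordinary values into Montgomery form
    return r, n_prime, r2


def mont_mul(a, b, n, k, n_prime):
    """Montgomery product a * b * R^(-1) mod n, for 0 <= a, b < n."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k             # exact division by R
    return u - n if u >= n else u


def naf(e):
    """Non-adjacent form of e >= 0: digits in {-1, 0, 1}, least significant first."""
    digits = []
    while e > 0:
        if e & 1:
            d = 2 - (e & 3)          # 1 if e % 4 == 1, -1 if e % 4 == 3
            e -= d
        else:
            d = 0
        digits.append(d)
        e >>= 1
    return digits


def mont_exp_sdr(base, e, n):
    """Return (base**e mod n, number of non-squaring multiplications).

    Assumes n is odd and gcd(base, n) == 1 (illustrative assumptions).
    """
    k = n.bit_length()
    r, n_prime, r2 = montgomery_setup(n, k)
    x = mont_mul(base % n, r2, n, k, n_prime)            # base in Montgomery form
    x_inv = mont_mul(pow(base, -1, n), r2, n, k, n_prime)  # inverse in Montgomery form
    acc = r % n                                          # 1 in Montgomery form
    mults = 0
    for d in reversed(naf(e)):                           # most significant digit first
        acc = mont_mul(acc, acc, n, k, n_prime)          # square for every digit
        if d == 1:
            acc = mont_mul(acc, x, n, k, n_prime)
            mults += 1
        elif d == -1:
            acc = mont_mul(acc, x_inv, n, k, n_prime)
            mults += 1
    return mont_mul(acc, 1, n, k, n_prime), mults        # leave Montgomery form


result, mults = mont_exp_sdr(65, (1 << 16) - 1, 3233)
```

For a 16-bit all-ones exponent, the NAF has only two nonzero digits, so this sketch performs 2 non-squaring multiplications where plain binary square-and-multiply would need 16. That is in line with the larger gain the abstract reports for all-‘1’ exponents, although the sketch ignores the cost of computing the inverse.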

Acknowledgements

This research is partially supported by the Malaysia Ministry of Science, Technology & Innovation (MOSTI) eScience fund 01-02-11-SF0201 and 01-02-11-SF0202.

Author information

Authors and Affiliations

Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Sungai Long, Malaysia
Xian-Fu Wong & Bok-Min Goi
Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Kampar, Malaysia
Wai-Kong Lee
Faculty of Engineering, Multimedia University, Cyberjaya, Malaysia
Raphael C.-W. Phan

Corresponding author

Correspondence to Bok-Min Goi.

Editor information

Editors and Affiliations

B-dong Intellige 2, Korea Industry Security Forum, Seongnam-si, Kyonggi-do, Korea (Republic of)
Kuinam J. Kim
modelizeIT Inc., CEO and NYU, Stony Brook, New York, USA
Nikolai Joukov

About this paper

Cite this paper

Wong, XF., Goi, BM., Lee, WK., Phan, R.CW. (2018). Speeding up the Montgomery Exponentiation with CMM-SDR Over GPU with Maxwell and Pascal Architecture. In: Kim, K., Joukov, N. (eds) Mobile and Wireless Technologies 2017. ICMWT 2017. Lecture Notes in Electrical Engineering, vol 425. Springer, Singapore. https://doi.org/10.1007/978-981-10-5281-1_9

DOI: https://doi.org/10.1007/978-981-10-5281-1_9
Published: 17 June 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5280-4
Online ISBN: 978-981-10-5281-1
eBook Packages: Engineering, Engineering (R0)
