Abstract
The ease of programming offered by the CUDA programming model attracted a lot of programmers to try the platform for acceleration of many non-graphics applications. Cryptography, being no exception, also found its share of exploration efforts, especially block ciphers. In this contribution we present a detailed walk-through of effective mapping of HC-128 and HC-256 stream ciphers on GPUs. Due to inherent inter-S-Box dependencies, intra-S-Box dependencies and a high number of memory accesses per keystream word generation, parallelization of HC series of stream ciphers remains challenging. For the first time, we present various optimization strategies for HC-128 and HC-256 speedup in tune with CUDA device architecture. The peak performance achieved with a single data-stream for HC-128 and HC-256 is 0.95 Gbps and 0.41 Gbps respectively. Although these throughput figures do not beat the CPU performance (10.9 Gbps for HC-128 and 7.5 Gbps for HC-256), our multiple parallel data-stream implementation is benchmarked to reach approximately 31 Gbps for HC-128 and 14 Gbps for HC-256 (with 32768 parallel data-streams). To the best of our knowledge, this is the first reported effort of mapping HC-Series of stream ciphers on GPUs.
This work was done in part while the second author was a summer intern and the third author was an Alexander von Humboldt Fellow at RWTH Aachen, Germany.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Manavski, S.A.: CUDA compatible GPU as an efficient hardware accelerator for AES cryptography. In: International Signal Processing and Communications (ICSPC), pp. 65–68. IEEE (2007)
Biagio, A., Barenghi, A., Agosta, G., Pelosi, G.: Design of a parallel AES for graphics hardware using the CUDA framework. In: International Symposium on Parallel & Distributed Processing (IPDPS), pp. 1–8. IEEE (2009)
Iwai, K., Nishikawa, N., Kurokawa, T.: Acceleration of AES encryption on CUDA GPU. International Journal of Networking and Computing 2(1), 131–145 (2012)
Liu, G., An, H., Han, W., Xu, G., Yao, P., Xu, M., Hao, X., Wang, Y.: A Program Behavior Study of Block Cryptography Algorithms on GPGPU. In: Fourth International Conference on Frontier of Computer Science and Technology 2009, FCST 2009, pp. 33–39. IEEE (2009)
Nishikawa, N., Iwai, K., Kurokawa, T.: High-Performance Symmetric Block Ciphers on Multicore CPU and GPUs. International Journal of Networking and Computing 2(2), 251–268 (2012)
Cannire, C.D.: eSTREAM testing framework, http://www.ecrypt.eu.org/stream/perf
Stefan, D.: Analysis and Implementation of eSTREAM and SHA-3 Cryptographic Algorithms (2011), http://hgpu.org/?p=5972
Bauer, M., Cook, H., Khailany, B.: CudaDMA: Optimizing GPU memory bandwidth via warp specialization. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, New York (2011); Article 12
http://stanford-cs193g-sp2010.googlecode.com/svn/trunk/lectures/lecture_4/cuda_memories.pdf
Bernstein, D.: Cache-timing attacks on AES (2005), http://cr.yp.to/papers.html#cachetiming
Boneh, D., DeMillo, R.A., Lipton, R.J.: On the importance of checking cryptographic protocols for faults. In: Fumy, W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233, pp. 37–51. Springer, Heidelberg (1997)
eSTREAM: the ECRYPT Stream Cipher Project, http://www.ecrypt.eu.org/stream
Kircanski, A., Youssef, A.M.: Differential Fault Analysis of HC-128. In: Bernstein, D.J., Lange, T. (eds.) AFRICACRYPT 2010. LNCS, vol. 6055, pp. 261–278. Springer, Heidelberg (2010)
Liu, Y., Qin, T.: The key and IV setup of the stream ciphers HC-256 and HC-128. In: International Conference on Networks Security, Wireless Communications and Trusted Computing, pp. 430–433. IEEE (2009)
Maitra, S., Paul, G., Raizada, S., Sen, S., Sengupta, R.: Some observations on HC-128. Designs, Codes and Cryptography 59(1-3), 231–245 (2011)
Zenner, E.: A Cache Timing Analysis of HC-256. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS, vol. 5381, pp. 199–213. Springer, Heidelberg (2009)
Osvik, D.A., Shamir, A., Tromer, E.: Cache Attacks and Countermeasures: The Case of AES. In: Pointcheval, D. (ed.) CT-RSA 2006. LNCS, vol. 3860, pp. 1–20. Springer, Heidelberg (2006)
Paul, G., Maitra, S., Raizada, S.: A Theoretical Analysis of the Structure of HC-128. In: Iwata, T., Nishigaki, M. (eds.) IWSEC 2011. LNCS, vol. 7038, pp. 161–177. Springer, Heidelberg (2011)
Sekar, G., Preneel, B.: Improved Distinguishing Attacks on HC-256. In: Takagi, T., Mambo, M. (eds.) IWSEC 2009. LNCS, vol. 5824, pp. 38–52. Springer, Heidelberg (2009)
Stankovski, P., Ruj, S., Hell, M., Johansson, T.: Improved distinguishers for HC-128. Designs, Codes and Cryptography 63(2), 225–240 (2012)
Wu, H.: The Stream Cipher HC-128, http://www.ecrypt.eu.org/stream/hcp3.html
Wu, H.: A New Stream Cipher HC-256. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 226–244. Springer, Heidelberg (2004), http://eprint.iacr.org/2004/092.pdf
Chattopadhyay, A., Khalid, A., Maitra, S., Raizada, S.: Designing High-Throughput Hardware Accelerator for Stream Cipher HC-128. In: International Symposium on Circuits and systems (ISCAS), pp. 1448–1451. IEEE (2012)
NVIDIA CUDA, http://developer.NVidia.com/object/CUDA.html
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Khalid, A., Bagchi, D., Paul, G., Chattopadhyay, A. (2013). Optimized GPU Implementation and Performance Analysis of HC Series of Stream Ciphers. In: Kwon, T., Lee, MK., Kwon, D. (eds) Information Security and Cryptology – ICISC 2012. ICISC 2012. Lecture Notes in Computer Science, vol 7839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37682-5_21
Download citation
DOI: https://doi.org/10.1007/978-3-642-37682-5_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37681-8
Online ISBN: 978-3-642-37682-5
eBook Packages: Computer ScienceComputer Science (R0)