Performance Prediction Model for Block Ciphers on GPU Architectures

  • Naoki Nishikawa
  • Keisuke Iwai
  • Hidema Tanaka
  • Takakazu Kurokawa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7873)

Abstract

This paper proposes a performance prediction model for block ciphers on GPU architectures. The model comprises three phases: micro-benchmarks, code analysis, and performance equations. The micro-benchmarks are developed in OpenCL with scalability across GPU architectures of all kinds in mind. The performance equations are derived from selected features of GPU architectures. The overall latencies of AES, Camellia, and SC2000, which together cover all types of block ciphers, fall within the range of latencies estimated by the model. Moreover, assuming that out-of-order scheduling on Nvidia GPUs works well, the model predicted overall encryption latencies with best-case errors of 2.0 % and 8.8 % on the Nvidia GeForce GTX 580 and GTX 280, respectively. The model also supports algebraic and bitslice implementations, although this paper evaluates it only on table-based implementations.
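
To make the three-phase structure concrete, the following is a minimal illustrative sketch and not the authors' actual model. It assumes that micro-benchmarks yield a per-operation cost in cycles, that code analysis yields per-class operation counts, and that the "performance equations" reduce to a simple range: a best case in which the operation classes overlap perfectly, and a worst case in which they are fully serialized. All names (`estimate_latency`, `relative_error`, the operation classes) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LatencyRange:
    best: float   # cycles, assuming operation classes overlap perfectly
    worst: float  # cycles, assuming operation classes are fully serialized


def estimate_latency(counts, cycles_per_op):
    """Combine operation counts (code analysis) with per-operation
    costs (micro-benchmarks) into a best/worst latency range.

    Assumption for illustration only: the best case is bounded by the
    most expensive operation class, the worst case by the sum of all.
    """
    per_class = {op: counts[op] * cycles_per_op[op] for op in counts}
    return LatencyRange(best=max(per_class.values()),
                        worst=sum(per_class.values()))


def relative_error(predicted, measured):
    """Prediction error as a percentage of the measured value."""
    return abs(predicted - measured) / measured * 100.0


# Hypothetical micro-benchmark results (cycles per operation).
cycles_per_op = {"alu": 2.0, "shared_load": 30.0, "global_load": 400.0}
# Hypothetical operation counts for one encryption kernel.
counts = {"alu": 600, "shared_load": 160, "global_load": 8}

latency = estimate_latency(counts, cycles_per_op)
print(latency)
```

A measured latency would then be compared against this range, and `relative_error` would quantify how far the best-case estimate is from the measurement.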

Keywords

Performance prediction, GPU, OpenCL, AES, Camellia, SC2000, Micro-benchmark

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Naoki Nishikawa (1)
  • Keisuke Iwai (1)
  • Hidema Tanaka (1)
  • Takakazu Kurokawa (1)
  1. Department of Computer Science and Engineering, National Defense Academy of Japan, Yokosuka-shi, Japan
