Hardware and Software Support for Transposition of Bit Matrices in High-Speed Encryption

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10394)

Abstract

Cryptographic applications such as symmetric encryption algorithms can be implemented in either bit-slice or word-parallel fashion. Converting between the two data representations corresponds to transposing a bit matrix whose row vectors are the variables. In previous work we demonstrated that combining the best of both variants, i.e. executing part of the code in bit-slice and part in word-parallel manner, can improve performance considerably, but most of the advantage is consumed by the conversion. Here, we examine the conversion routine more closely and derive different levels of hardware and software support that can accelerate it, ranging from existing but seldom-used instructions to completely new instructions that might be implemented in future systems. We quantify the acceleration achieved by each level of support and provide preliminary experimental results.
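
To make the correspondence between representation conversion and bit-matrix transposition concrete, the following is a minimal, unoptimized sketch in Python. It is not the conversion routine studied in the paper; the function name `transpose_bits` and its parameters are chosen here for illustration only. Each input integer is treated as one row of the matrix (one variable in word-parallel form), and each output integer collects the same bit position from every variable (one slice in bit-slice form).

```python
def transpose_bits(rows, width):
    """Transpose a bit matrix stored as a list of integers.

    rows  : list of non-negative integers, one per matrix row
    width : number of bit columns considered in each row

    Returns a list of `width` integers; bit i of result[j]
    equals bit j of rows[i].
    """
    result = [0] * width
    for i, row in enumerate(rows):
        for j in range(width):
            bit = (row >> j) & 1      # extract column j of row i
            result[j] |= bit << i     # store it at column i of row j
    return result


# Three 4-bit variables in word-parallel form (illustrative values).
words = [0b1010, 0b0110, 0b0001]

# Bit-slice form: one integer per bit position.
slices = transpose_bits(words, 4)

# Transposing again, with the roles of rows and columns swapped,
# restores the word-parallel form.
restored = transpose_bits(slices, len(words))
```

This naive version touches every bit individually, so its cost grows with the product of the number of variables and the word width. That per-bit cost is the kind of overhead that dedicated shuffle or transposition instructions would aim to reduce.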

Keywords

Bit matrix transposition, Bit shuffle instructions, High-speed encryption

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Faculty of Mathematics and Computer Science, FernUniversität in Hagen, Hagen, Germany
  2. Faculty of Science and Engineering, Åbo Akademi University, Turku, Finland