Hardware and Software Support for Transposition of Bit Matrices in High-Speed Encryption

Part of the Lecture Notes in Computer Science book series (LNCS, volume 10394)


Cryptographic applications such as symmetric encryption algorithms can be implemented in either bit-slice or word-parallel fashion. Converting between the two data representations corresponds to transposing a bit matrix whose row vectors are the variables. In previous work we demonstrated that combining the best of both variants, i.e. executing part of the code in bit-slice and part in word-parallel manner, can improve performance considerably, but that most of the advantage is consumed by the conversion. Here, we examine the conversion routine more closely and derive different levels of hardware and software support that can accelerate it, ranging from existing but seldom-used instructions to completely new instructions that might be implemented in future systems. We quantify the acceleration achieved by each level of support and provide preliminary experimental results.
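To make the conversion concrete, the following is a minimal sketch of a software-only bit-matrix transposition. It is not the routine evaluated in the paper: it uses the well-known mask-and-shift (delta swap) technique, and it assumes an 8x8 bit matrix packed into one 64-bit integer, with row r stored in byte r and column c in bit c of that byte. The function names `transpose8x8`, `transpose8x8_naive` and `to_bitslice` are hypothetical and chosen for illustration only.

```python
def transpose8x8(x: int) -> int:
    """Transpose an 8x8 bit matrix packed into a 64-bit integer.

    Assumed layout: row r is byte r (row 0 = least significant byte),
    column c is bit c within that byte, so element (r, c) is bit 8*r + c.
    """
    # Swap the off-diagonal bits inside every 2x2 block (distance 7).
    t = (x ^ (x >> 7)) & 0x00AA00AA00AA00AA
    x ^= t ^ (t << 7)
    # Swap the off-diagonal 2x2 blocks inside every 4x4 block (distance 14).
    t = (x ^ (x >> 14)) & 0x0000CCCC0000CCCC
    x ^= t ^ (t << 14)
    # Swap the two off-diagonal 4x4 blocks of the 8x8 matrix (distance 28).
    t = (x ^ (x >> 28)) & 0x00000000F0F0F0F0
    x ^= t ^ (t << 28)
    return x


def transpose8x8_naive(x: int) -> int:
    """Reference version: move bit (r, c) to position (c, r) one at a time."""
    y = 0
    for r in range(8):
        for c in range(8):
            if (x >> (8 * r + c)) & 1:
                y |= 1 << (8 * c + r)
    return y


def to_bitslice(words):
    """Convert 8 byte-sized variables (word-parallel) to 8 bit slices.

    Slice i collects bit i of every input variable; variable j ends up
    in bit j of each slice.
    """
    packed = 0
    for r, w in enumerate(words):
        packed |= (w & 0xFF) << (8 * r)
    t = transpose8x8(packed)
    return [(t >> (8 * i)) & 0xFF for i in range(8)]


# Variable 0 has all eight bits set, so every slice has bit 0 set.
print(to_bitslice([0xFF, 0, 0, 0, 0, 0, 0, 0]))  # [1, 1, 1, 1, 1, 1, 1, 1]
```

The fast version needs three rounds of shift, XOR and mask operations instead of 64 single-bit moves. Because transposition is its own inverse, the same function also converts from bit-slice back to word-parallel form.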


  • Bit matrix transposition
  • Bit shuffle instructions
  • High-speed encryption

  • DOI: 10.1007/978-3-319-64701-2_12
  • Chapter length: 9 pages

Author information

Corresponding author

Correspondence to Jörg Keller.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Eitschberger, P., Keller, J., Holmbacka, S. (2017). Hardware and Software Support for Transposition of Bit Matrices in High-Speed Encryption. In: Yan, Z., Molva, R., Mazurczyk, W., Kantola, R. (eds) Network and System Security. NSS 2017. Lecture Notes in Computer Science, vol 10394. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64700-5

  • Online ISBN: 978-3-319-64701-2

  • eBook Packages: Computer Science, Computer Science (R0)