Efficient Software Implementation of Modular Multiplication in Prime Fields on TI’s DSP TMS320C6678

  • Conference paper
Information Security Applications (WISA 2017)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10763)

Abstract

This work studies fast modular multiplication on a state-of-the-art digital signal processor (DSP). More specifically, Montgomery multiplication over a prime field with an arbitrary 256-bit prime p is implemented on the TMS320C6678 DSP by Texas Instruments. Two implementations, optimized for latency and for throughput, are designed. Both are based on the k-bit divided Montgomery modular multiplication algorithm by Kornerup. The algorithm is extended to run two independent Montgomery multiplications in parallel, so that it runs efficiently on the target DSP by exploiting its symmetric functional units. The proposed implementations outperform the previous implementation by Itoh et al. in both latency and throughput. The latency of the proposed implementation, 0.496 μs, is only 17% of the 2.86 μs reported for the implementation by Itoh et al. Moreover, the throughput of \(4.03 \times 10^6\) Montgomery multiplications per second (MM/s) is more than 10 times the \(0.37 \times 10^6\) MM/s of the previous work.
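For readers unfamiliar with the operation being accelerated, the following is a minimal Python sketch of a generic word-serial (radix-2^k) Montgomery multiplication. It is not Kornerup's k-bit divided variant and not the authors' DSP code; the function names, the digit width k = 32, and the choice of the NIST P-256 prime as the example modulus are all assumptions made for illustration. The second function only illustrates that two multiplications have no data dependence on each other and can therefore be stepped in the same loop; Python itself gives no parallel speed-up.

```python
def montgomery_multiply(a, b, p, k=32, n_words=8):
    """Return a * b * R^(-1) mod p for R = 2^(k * n_words).

    Assumes p is odd and 0 <= a, b < p < R (hypothetical helper).
    """
    mask = (1 << k) - 1
    # -p^(-1) mod 2^k, precomputed once per modulus (Python 3.8+ pow).
    p_inv_neg = (-pow(p, -1, 1 << k)) & mask
    t = 0
    for i in range(n_words):
        a_i = (a >> (k * i)) & mask          # i-th k-bit digit of a
        t += a_i * b                         # accumulate partial product
        q = ((t & mask) * p_inv_neg) & mask  # digit that clears the low k bits
        t = (t + q * p) >> k                 # exact division by 2^k
    return t - p if t >= p else t            # single conditional subtraction


def montgomery_multiply_pair(ops0, ops1, p, k=32, n_words=8):
    """Step two independent multiplications in lockstep (illustration only)."""
    (a0, b0), (a1, b1) = ops0, ops1
    mask = (1 << k) - 1
    p_inv_neg = (-pow(p, -1, 1 << k)) & mask
    t0 = t1 = 0
    for i in range(n_words):
        shift = k * i
        # The two accumulators never read each other's values.
        t0 += ((a0 >> shift) & mask) * b0
        t1 += ((a1 >> shift) & mask) * b1
        q0 = ((t0 & mask) * p_inv_neg) & mask
        q1 = ((t1 & mask) * p_inv_neg) & mask
        t0 = (t0 + q0 * p) >> k
        t1 = (t1 + q1 * p) >> k
    return (t0 - p if t0 >= p else t0, t1 - p if t1 >= p else t1)


# Example modulus (assumption): the NIST P-256 prime.
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1
R = 1 << 256                                  # R = 2^(32 * 8)

x, y = 3, 5
x_m, y_m = x * R % P256, y * R % P256         # convert to Montgomery form
z_m = montgomery_multiply(x_m, y_m, P256)     # product, still in Montgomery form
z = montgomery_multiply(z_m, 1, P256)         # multiply by 1 to leave Montgomery form
assert z == (x * y) % P256
```

Operands are kept in Montgomery form (x·R mod p) across a chain of multiplications, so the conversion cost at the start and end is paid only once.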

This work was supported by SECOM Science and Technology Foundation.

Notes

  1. Many algorithms can support the case where M is odd.

References

  1. Costello, C.: Pairings for beginners. http://www.craigcostello.com.au/pairings/PairingsForBeginners.pdf

  2. Gura, N., Patel, A., Wander, A., Eberle, H., Shantz, S.C.: Comparing elliptic curve cryptography and RSA on 8-bit CPUs. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 119–132. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28632-5_9

  3. Kargl, A., Pyka, S., Seuschek, H.: Fast arithmetic on ATmega128 for elliptic curve cryptography. Cryptology ePrint Archive, Report 2008/442 (2008). http://eprint.iacr.org/2008/442

  4. Szerwinski, R., Güneysu, T.: Exploiting the power of GPUs for asymmetric cryptography. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 79–99. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85053-3_6

  5. Gouvêa, C.P.L., Oliveira, L.B., López, J.: Efficient software implementation of public-key cryptography on sensor networks using the MSP430x microcontroller. J. Cryptograph. Eng. 2(1), 19–29 (2012)

  6. Itoh, K., Takenaka, M., Torii, N., Temma, S., Kurihara, Y.: Fast implementation of public-key cryptography on a DSP TMS320C6201. In: Koç, Ç.K., Paar, C. (eds.) CHES 1999. LNCS, vol. 1717, pp. 61–72. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48059-5_7

  7. Texas Instruments: C66x CPU and instruction set reference guide. http://www.ti.com/lit/ug/sprugh7/sprugh7.pdf

  8. Texas Instruments: AM572x Sitara processors. http://www.tij.co.jp/jp/lit/ds/symlink/am5728.pdf

  9. Kornerup, P.: High-radix modular multiplication for cryptosystems. In: Proceedings of the 11th Symposium on Computer Arithmetic 1993, pp. 277–283. IEEE (1993)

  10. Texas Instruments: C66x DSP cache user’s guide. http://www.ti.com/lit/ug/sprugy8/sprugy8.pdf

  11. Montgomery, P.L.: Modular multiplication without trial division. Math. Comput. 44(170), 519–521 (1985)

  12. Kaihara, M.E., Takagi, N.: Bipartite modular multiplication. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 201–210. Springer, Heidelberg (2005). https://doi.org/10.1007/11545262_15

  13. Mentens, N., Sakiyama, K., Preneel, B., Verbauwhede, I.: Efficient pipelining for modular multiplication architectures in prime fields. In: Proceedings of the 17th ACM Great Lakes symposium on VLSI, pp. 534–539. ACM (2007)

  14. Fan, J., Sakiyama, K., Verbauwhede, I.: Montgomery modular multiplication algorithm on multi-core systems. In: 2007 IEEE Workshop on Signal Processing Systems, pp. 261–266. IEEE (2007)

  15. Knežević, M., Vercauteren, F., Verbauwhede, I.: Speeding up bipartite modular multiplication. In: Hasan, M.A., Helleseth, T. (eds.) WAIFI 2010. LNCS, vol. 6087, pp. 166–179. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13797-6_12

  16. Koç, Ç.K., Acar, T., Kaliski, B.S.: Analyzing and comparing Montgomery multiplication algorithms. IEEE Micro 16(3), 26–33 (1996)

  17. Recommended elliptic curves for federal government use. http://csrc.nist.gov/groups/ST/toolkit/documents/dss/NISTReCur.pdf

  18. Tenca, A.F., Koç, Ç.K.: A scalable architecture for modular multiplication based on Montgomery’s algorithm. IEEE Trans. Comput. 52(9), 1215–1221 (2003)

  19. Brown, M., Hankerson, D., López, J., Menezes, A.: Software implementation of the NIST elliptic curves over prime fields. In: Naccache, D. (ed.) CT-RSA 2001. LNCS, vol. 2020, pp. 250–265. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45353-9_19

Author information

Corresponding authors

Correspondence to Eito Miyamoto, Takeshi Sugawara or Kazuo Sakiyama.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Miyamoto, E., Sugawara, T., Sakiyama, K. (2018). Efficient Software Implementation of Modular Multiplication in Prime Fields on TI’s DSP TMS320C6678. In: Kang, B., Kim, T. (eds) Information Security Applications. WISA 2017. Lecture Notes in Computer Science, vol 10763. Springer, Cham. https://doi.org/10.1007/978-3-319-93563-8_22

  • DOI: https://doi.org/10.1007/978-3-319-93563-8_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93562-1

  • Online ISBN: 978-3-319-93563-8

  • eBook Packages: Computer Science, Computer Science (R0)
