FPGA Implementation of 128-Bit Fused Multiply Add Unit for Crypto Processors

Kakde, Sandeep; Mahindra, Mithilesh; Khobragade, Atish; Shah, Nikit

doi:10.1007/978-3-319-22915-7_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 536)

Included in the following conference series:

International Symposium on Security in Computing and Communication

Abstract

The Fused Multiply Add (FMA) block is an important module in high-speed math co-processors and crypto processors. The main contribution of this paper is a reduction in latency. The vital components of an FMA unit with multi-mode operation are the Alignment Shifter, the Normalization Shifter, the Multiplier, and a Dual Adder built from Carry Look Ahead Adders. The major technical challenges in existing FMA architectures are latency and the need for higher precision. To reduce latency, the Multiplier is designed as a reduced complexity Wallace Multiplier, which reduces the latency of the overall architecture by up to 15–25%. The total delay of the multiplier designed this way is found to be 37.673 ns. To obtain higher precision, we explicitly design the Alignment Shifter and Normalization Shifter in the FMA unit as Barrel Shifters, since these blocks would otherwise have lower precision. Replacing them with Barrel Shifters also reduces the latency by a further 25–35%, and the total delay of the Alignment Shifter and Normalization Shifter using the Barrel Shifter is found to be 5.845 ns.
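To illustrate the role a barrel shifter plays in the alignment and normalization steps, the following is a minimal Python sketch and not the authors' HDL design. The abstract does not describe the internal structure of the shifter, so the logarithmic (one conditional stage per shift-amount bit) arrangement is an assumption, and the names `barrel_shift_right`, `barrel_shift_left`, and `normalize` are hypothetical. With this arrangement, any shift of a 128-bit word from 0 to 127 positions passes through seven stages.

```python
WIDTH = 128  # assumed datapath width, matching the 128-bit operands in the title


def barrel_shift_right(value: int, amount: int, width: int = WIDTH) -> int:
    """Logical right shift built from log2(width) conditional stages.

    Stage k shifts by 2**k only if bit k of `amount` is set, which mirrors
    one row of 2:1 multiplexers in hardware. `amount` is expected to lie in
    the range 0 .. width - 1; higher bits of `amount` are ignored.
    """
    mask = (1 << width) - 1
    value &= mask
    stages = (width - 1).bit_length()  # 7 stages when width == 128
    for k in range(stages):
        if (amount >> k) & 1:
            value >>= 1 << k
    return value


def barrel_shift_left(value: int, amount: int, width: int = WIDTH) -> int:
    """Logical left shift with the same staged structure, truncated to `width` bits."""
    mask = (1 << width) - 1
    value &= mask
    stages = (width - 1).bit_length()
    for k in range(stages):
        if (amount >> k) & 1:
            value = (value << (1 << k)) & mask
    return value


def normalize(value: int, width: int = WIDTH) -> tuple[int, int]:
    """Shift left until the most significant bit is set.

    Returns the normalized word and the shift count, which a floating-point
    unit would subtract from the exponent. Zero is returned unchanged.
    """
    value &= (1 << width) - 1
    if value == 0:
        return 0, 0
    shift = width - value.bit_length()  # number of leading zeros
    return barrel_shift_left(value, shift, width), shift
```

In an FMA datapath, a right shift of this kind would align the addend with the product before addition, and `normalize` corresponds to the step applied to the sum afterwards. The leading-zero count is taken from Python's `int.bit_length` here for brevity; a hardware design would use a dedicated leading-zero detector or anticipator.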

Author information

Authors and Affiliations

Nagpur, 440015, India
Sandeep Kakde, Mithilesh Mahindra, Atish Khobragade & Nikit Shah

Corresponding author

Correspondence to Sandeep Kakde.

Editor information

Editors and Affiliations

Deakin University, Geelong, Victoria, Australia
Jemal H. Abawajy
IBM Research-India, New Delhi, India
Sougata Mukherjea
Indian Institute of Information Technology and Management, Kerala, India
Sabu M. Thampi
University of Murcia, Espinardo, Murcia, Spain
Antonio Ruiz-Martínez

About this paper

Cite this paper

Kakde, S., Mahindra, M., Khobragade, A., Shah, N. (2015). FPGA Implementation of 128-Bit Fused Multiply Add Unit for Crypto Processors. In: Abawajy, J., Mukherjea, S., Thampi, S., Ruiz-Martínez, A. (eds) Security in Computing and Communications. SSCC 2015. Communications in Computer and Information Science, vol 536. Springer, Cham. https://doi.org/10.1007/978-3-319-22915-7_8

DOI: https://doi.org/10.1007/978-3-319-22915-7_8
Published: 08 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22914-0
Online ISBN: 978-3-319-22915-7
eBook Packages: Computer Science, Computer Science (R0)
