Abstract
Dot-products are an essential, recurrent building block in scientific computing and often take up a large proportion of the acceleration circuitry. Dot-product acceleration is well suited to Field Programmable Gate Arrays (FPGAs), since these devices can be configured to exploit wide parallelism, deep pipelining and highly efficient datapaths. In this paper we present a dot-product implementation that uses a hybrid floating-point and fixed-point number system. The design receives floating-point inputs and generates a floating-point output; internally it uses a fixed-point number system of configurable word length, which can be tuned to match the desired accuracy. Results using a high-end Xilinx FPGA and an order-150 dot-product demonstrate that, for equivalent accuracy metrics, it is possible to use 3.8 times fewer resources, operate at a 1.62 times faster clock frequency, and achieve a significant reduction in latency compared to a dot-product built directly from floating-point cores. Combining these results and using the spare resources to instantiate more units in parallel, it is possible to achieve an overall speed-up of at least 5 times.
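The general idea of floating-point inputs, fixed-point internal accumulation and a floating-point output can be illustrated in software. The following is a minimal sketch under stated assumptions, not the hardware design described in the paper: the function name `hybrid_dot`, the parameter `frac_bits`, and the choice to truncate each product toward minus infinity are all hypothetical and chosen only for illustration.

```python
import math
from fractions import Fraction


def hybrid_dot(xs, ys, frac_bits=32):
    """Illustrative dot product: float in, fixed-point accumulate, float out.

    frac_bits (hypothetical parameter) sets how many fractional bits the
    internal fixed-point accumulator keeps; more bits means higher accuracy.
    """
    if len(xs) != len(ys):
        raise ValueError("vectors must have equal length")
    scale = 1 << frac_bits
    acc = 0  # a Python int stands in for a wide fixed-point register
    for x, y in zip(xs, ys):
        # Form the exact product of the two floats, then truncate it
        # (round toward minus infinity) to frac_bits fractional bits.
        prod = Fraction(x) * Fraction(y)
        acc += math.floor(prod * scale)
    # A single conversion back to floating point at the very end.
    return acc / scale


if __name__ == "__main__":
    xs = [0.1] * 150
    ys = [0.3] * 150
    for bits in (8, 16, 32, 48):
        print(bits, hybrid_dot(xs, ys, frac_bits=bits))
```

In this sketch each product loses less than `2**-frac_bits` to truncation, so the accumulated error grows at most linearly with the vector length, and widening the internal word trades resources for accuracy. How the actual design sizes and rounds its internal representation is not stated in the abstract.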
The authors would like to acknowledge the support of the EPSRC (Grants EP/C549481/1 and EP/E00024X/1) and the contributions of Dr. Eric Kerrigan.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Roldao Lopes, A., Constantinides, G.A. (2010). A Fused Hybrid Floating-Point and Fixed-Point Dot-Product for FPGAs. In: Sirisuk, P., Morgan, F., El-Ghazawi, T., Amano, H. (eds) Reconfigurable Computing: Architectures, Tools and Applications. ARC 2010. Lecture Notes in Computer Science, vol 5992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12133-3_16
DOI: https://doi.org/10.1007/978-3-642-12133-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12132-6
Online ISBN: 978-3-642-12133-3
eBook Packages: Computer Science (R0)