Hardware Implementation of Floating-Point Arithmetic

Muller, Jean-Michel; Brunie, Nicolas; de Dinechin, Florent; Jeannerod, Claude-Pierre; Joldes, Mioara; Lefèvre, Vincent; Melquiond, Guillaume; Revol, Nathalie; Torres, Serge

doi:10.1007/978-3-319-76526-6_8

Chapter
First Online: 03 May 2018

Abstract

Chapter 7 has shown that operations on floating-point numbers are naturally expressed in terms of integer or fixed-point operations on the significand and the exponent. For instance, to obtain the product of two floating-point numbers, one basically multiplies the significands and adds the exponents. However, obtaining the correct rounding of the result may require considerable design effort and the use of nonarithmetic primitives such as leading-zero counters and shifters. This chapter details the implementation of these algorithms in hardware, using digital logic.
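
The multiplication example above can be sketched in software using only integer operations. The following Python sketch is an illustration, not the chapter's hardware design: it assumes the binary32 format, round-to-nearest-even, normal inputs, and a normal result (no zeros, subnormals, infinities, NaNs, overflow, or underflow). The helper names `f32_bits`, `bits_f32`, and `mul_f32_normal` are hypothetical and chosen for this example only.

```python
import struct

def f32_bits(x: float) -> int:
    """Round x to binary32 and return its 32-bit pattern as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b: int) -> float:
    """Interpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def mul_f32_normal(a_bits: int, b_bits: int) -> int:
    """Multiply two binary32 bit patterns with integer operations only.

    Assumption: both inputs and the result are normal numbers.
    """
    sign = ((a_bits ^ b_bits) >> 31) & 1
    ea = (a_bits >> 23) & 0xFF
    eb = (b_bits >> 23) & 0xFF
    # Restore the implicit leading 1 to get 24-bit integer significands.
    ma = (a_bits & 0x7FFFFF) | 0x800000
    mb = (b_bits & 0x7FFFFF) | 0x800000

    exp = ea + eb - 127          # add biased exponents, remove one bias
    prod = ma * mb               # exact product, 47 or 48 bits wide

    # Normalize: the significand product lies in [1, 4), so at most
    # a one-position shift is needed here.
    if prod >> 47:
        shift = 24
        exp += 1
    else:
        shift = 23

    kept = prod >> shift                  # 24 most significant bits
    rem = prod & ((1 << shift) - 1)       # discarded low-order bits
    half = 1 << (shift - 1)

    # Round to nearest, ties to even.
    if rem > half or (rem == half and (kept & 1)):
        kept += 1
        if kept >> 24:                    # rounding overflowed to 2^24
            kept >>= 1
            exp += 1

    return (sign << 31) | (exp << 23) | (kept & 0x7FFFFF)

if __name__ == "__main__":
    r = mul_f32_normal(f32_bits(1.5), f32_bits(2.5))
    print(bits_f32(r))
```

In this restricted case the normalization shift is at most one position. With subnormal operands, or in an adder where cancellation can occur, the shift distance is not known in advance, which is where the leading-zero counters and shifters mentioned above come in.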

References

E. Abu-Shama and M. Bayoumi. A new cell for low power adders. In International Symposium on Circuits and Systems (ISCAS), pages 1014–1017, 1996.
Google Scholar
L. Aksoy, E. Costa, P. Flores, and J. Monteiro. Optimization of area in digital FIR filters using gate-level metrics. In Design Automation Conference, pages 420–423, 2007.
Google Scholar
Altera Corporation. FFT/IFFT Block Floating Point Scaling, 2005. Application note 404-1.0.
Google Scholar
A. Avizienis. Signed-digit number representations for fast parallel arithmetic. IRE Transactions on Electronic Computers, 10:389–400, 1961. Reprinted in [584].
Article MathSciNet Google Scholar
S. Banescu, F. de Dinechin, B. Pasca, and R. Tudoran. Multipliers for floating-point double precision and beyond on FPGAs. ACM SIGARCH Computer Architecture News, 38:73–79, 2010.
Article Google Scholar
C. Berg. Formal Verification of an IEEE Floating-Point Adder. Master’s thesis, Universität des Saarlandes, Germany, 2001.
Google Scholar
A. D. Booth. A signed binary multiplication technique. Quarterly Journal of Mechanics and Applied Mathematics, 4(2):236–240, 1951. Reprinted in [583].
Google Scholar
N. Boullis and A. Tisserand. Some optimizations of hardware multiplication by constant matrices. IEEE Transactions on Computers, 54(10):1271–1282, 2005.
Article Google Scholar
N. Brisebarre, F. de Dinechin, and J.-M. Muller. Integer and floating-point constant multipliers for FPGAs. In Application-specific Systems, Architectures and Processors, pages 239–244, 2008.
Google Scholar
N. Brisebarre and J.-M. Muller. Correctly rounded multiplication by arbitrary precision constants. IEEE Transactions on Computers, 57(2):165–174, 2008.
Article MathSciNet Google Scholar
N. Brisebarre, J.-M. Muller, and S.-K. Raina. Accelerating correctly rounded floating-point division when the divisor is known in advance. IEEE Transactions on Computers, 53(8):1069–1072, 2004.
Article Google Scholar
J. D. Bruguera and T. Lang. Leading-one prediction with concurrent position correction. IEEE Transactions on Computers, 48(10):1083–1097, 1999.
Article Google Scholar
J. D. Bruguera and T. Lang. Floating-point fused multiply-add: Reduced latency for floating-point addition. In 17th IEEE Symposium on Computer Arithmetic (ARITH-17), Cape Cod, MA, USA, June 2005.
Google Scholar
N. Brunie. Modified FMA for exact low precision product accumulation. In 24th IEEE Symposium on Computer Arithmetic (ARITH-24), pages 106–113, July 2017.
Google Scholar
H. T. Bui, Y. Wang, and Y. Jiang. Design and analysis of low-power 10-transistor full adders using novel XORXNOR gates. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 49(1), 2003.
Google Scholar
F. Y. Busaba, C. A. Krygowski, W. H. Li, E. M. Schwarz, and S. R. Carlough. The IBM z900 decimal arithmetic unit. In 35th Asilomar Conference on Signals, Systems, and Computers, volume 2, pages 1335–1339, November 2001.
Google Scholar
P. R. Cappello and K. Steiglitz. A VLSI layout for a pipelined Dadda multiplier. ACM Transactions on Computer Systems, 1(2):157–174, 1983. Reprinted in [584].
Article Google Scholar
A. Cauchy. Sur les moyens d’éviter les erreurs dans les calculs numériques. Comptes Rendus de l’Académie des Sciences, Paris, 11:789–798, 1840. Republished in: Augustin Cauchy, œuvres complètes, 1ère série, Tome V, pages 431–442.
Google Scholar
K. D. Chapman. Fast integer multipliers fit in FPGAs (EDN 1993 design idea winner). EDN Magazine, 1994.
Google Scholar
M. Cornea, C. Anderson, J. Harrison, P. T. P. Tang, E. Schneider, and C. Tsen. A software implementation of the IEEE 754R decimal floating-point arithmetic using the binary encoding format. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 29–37, June 2007.
Google Scholar
M. Cornea, J. Harrison, C. Anderson, P. T. P. Tang, E. Schneider, and E. Gvozdev. A software implementation of the IEEE 754R decimal floating-point arithmetic using the binary encoding format. IEEE Transactions on Computers, 58(2):148–162, 2009.
Article MathSciNet Google Scholar
M. F. Cowlishaw. Decimal floating-point: algorism for computers. In 16th IEEE Symposium on Computer Arithmetic (ARITH-16), pages 104–111, June 2003.
Google Scholar
M. F. Cowlishaw, E. M. Schwarz, R. M. Smith, and C. F. Webb. A decimal floating-point specification. In 15th IEEE Symposium on Computer Arithmetic (ARITH-15), pages 147–154, June 2001.
Google Scholar
O. Creţ, F. de Dinechin, I. Trestian, R. Tudoran, L. Creţ, and L. Vǎcariu. FPGA-based acceleration of the computations involved in transcranial magnetic stimulation. In Southern Programmable Logic Conference, pages 43–48, 2008.
Google Scholar
L. Dadda. Some schemes for parallel multipliers. Alta Frequenza, 34:349–356, 1965. Reprinted in [583].
Google Scholar
L. Dadda. On parallel digital multipliers. Alta Frequenza, 45:574–580, 1976. Reprinted in [583].
Google Scholar
D. Das Sarma and D. W. Matula. Measuring the accuracy of ROM reciprocal tables. IEEE Transactions on Computers, 43(8):932–940, 1994.
Article Google Scholar
D. Das Sarma and D. W. Matula. Faithful bipartite ROM reciprocal tables. In 12th IEEE Symposium on Computer Arithmetic (ARITH-12), pages 17–28, June 1995.
Google Scholar
D. Das Sarma and D. W. Matula. Faithful interpolation in reciprocal tables. In 13th IEEE Symposium on Computer Arithmetic (ARITH-13), pages 82–91, July 1997.
Google Scholar
F. de Dinechin. The price of routing in FPGAs. Journal of Universal Computer Science, 6(2):227–239, 2000.
Google Scholar
F. de Dinechin. Multiplication by rational constants. IEEE Transactions on Circuits and Systems, II, 52(2):98–102, 2012.
Article Google Scholar
F. de Dinechin and L.-S. Didier. Table-based division by small integer constants. In Applied Reconfigurable Computing, pages 53–63, March 2012.
Google Scholar
F. de Dinechin, P. Echeverría, M. López-Vallejo, and B. Pasca. Floating-point exponentiation units for reconfigurable computing. ACM Transactions on Reconfigurable Technology and Systems, 6(1), 2013.
Google Scholar
F. de Dinechin and M. Istoan. Hardware implementations of fixed-point Atan2. In 22nd IEEE Symposium of Computer Arithmetic (ARITH-22), pages 34–41, June 2015.
Google Scholar
F. de Dinechin, M. Istoan, and G. Sergent. Fixed-point trigonometric functions on FPGAs. SIGARCH Computer Architecture News, 41(5):83–88, 2013.
Article Google Scholar
F. de Dinechin, M. Joldeş, and B. Pasca. Automatic generation of polynomial-based hardware architectures for function evaluation. In Application-specific Systems, Architectures and Processors (ASAP), 2010.
Google Scholar
F. de Dinechin, M. Joldeş, B. Pasca, and G. Revy. Multiplicative square root algorithms for FPGAs. In Field-Programmable Logic and Applications, pages 574–577, 2010.
Google Scholar
F. de Dinechin, C. Q. Lauter, and J.-M. Muller. Fast and correctly rounded logarithms in double-precision. Theoretical Informatics and Applications, 41:85–102, 2007.
Article MathSciNet Google Scholar
F. de Dinechin and V. Lefèvre. Constant multipliers for FPGAs. In Parallel and Distributed Processing Techniques and Applications, pages 167–173, 2000.
Google Scholar
F. de Dinechin and B. Pasca. Large multipliers with fewer DSP blocks. In Field Programmable Logic and Applications, pages 250–255, August 2009.
Google Scholar
F. de Dinechin and B. Pasca. Floating-point exponential functions for DSP-enabled FPGAs. In Field Programmable Technologies, pages 110–117, December 2010. Best paper candidate.
Google Scholar
F. de Dinechin, B. Pasca, O. Creţ, and R. Tudoran. An FPGA-specific approach to floating-point accumulation and sum-of-products. In Field-Programmable Technologies, 2008.
Google Scholar
F. de Dinechin and A. Tisserand. Multipartite table methods. IEEE Transactions on Computers, 54(3):319–330, 2005.
Article Google Scholar
A. DeHon and N. Kapre. Optimistic parallelization of floating-point accumulation. In 18th Symposium on Computer Arithmetic (ARITH-18), pages 205–213, June 2007.
Google Scholar
M. deLorimier and A. DeHon. Floating-point sparse matrix-vector multiply for FPGAs. In Field-Programmable Gate Arrays, pages 75–85, 2005.
Google Scholar
J. Demmel and H. D. Nguyen. Parallel reproducible summation. IEEE Transactions on Computers, 64(7):2060–2070, 2015.
Article MathSciNet Google Scholar
A. G. Dempster and M. D. Macleod. Constant integer multiplication using minimum adders. Circuits, Devices and Systems, IEE Proceedings, 141(5):407–413, 1994.
Article Google Scholar
J. Detrey and F. de Dinechin. Table-based polynomials for fast hardware function evaluation. In Application-Specific Systems, Architectures and Processors, pages 328–333, 2005.
Google Scholar
J. Detrey and F. de Dinechin. Floating-point trigonometric functions for FPGAs. In Field-Programmable Logic and Applications, pages 29–34, August 2007.
Google Scholar
J. Detrey and F. de Dinechin. Parameterized floating-point logarithm and exponential functions for FPGAs. Microprocessors and Microsystems, Special Issue on FPGA-based Reconfigurable Computing, 31(8):537–545, 2007.
Article Google Scholar
J. Detrey and F. de Dinechin. A tool for unbiased comparison between logarithmic and floating-point arithmetic. Journal of VLSI Signal Processing, 49(1):161–175, 2007.
Article Google Scholar
J. Detrey, F. de Dinechin, and X. Pujol. Return of the hardware floating-point elementary function. In 18th Symposium on Computer Arithmetic (ARITH-18), pages 161–168, June 2007.
Google Scholar
W. R. Dieter, A. Kaveti, and H. G. Dietz. Low-cost microarchitectural support for improved floating-point accuracy. IEEE Computer Architecture Letters, 6(1):13–16, 2007.
Article Google Scholar
V. Dimitrov, L. Imbert, and A. Zakaluzny. Multiplication by a constant is sublinear. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 261–268, June 2007.
Google Scholar
C. Doss and R. L. Riley, Jr. FPGA-based implementation of a robust IEEE-754 exponential unit. In Field-Programmable Custom Computing Machines, pages 229–238, 2004.
Google Scholar
Y. Dou, S. Vassiliadis, G. K. Kuzmanov, and G. N. Gaydadjiev. 64-bit floating-point FPGA matrix multiplication. In Field-Programmable Gate Arrays, pages 86–95, 2005.
Google Scholar
T. Drane, W.-C. Cheung, and G. Constantinides. Correctly rounded constant integer division via multiply-add. In IEEE International Symposium on Circuits and Systems (ISCAS), pages 1243–1246, Seoul, South Korea, May 2012.
Google Scholar
P. Echeverría and M. López-Vallejo. An FPGA implementation of the powering function with single precision floating-point arithmetic. In 8th Conference on Real Numbers and Computers (RNC-8), pages 17–26, 2008.
Google Scholar
L. Eisen, J. W. Ward, H. W. Tast, N. Mäding, J. Leenstra, S. M. Mueller, C. Jacobi, J. Preiss, E. M. Schwarz, and S. R. Carlough. IBM POWER6 accelerators: VMX and DFU. IBM Journal of Research and Development, 51(6):1–21, 2007.
Article Google Scholar
M. D. Ercegovac and T. Lang. Division and Square Root: Digit-Recurrence Algorithms and Implementations. Kluwer Academic Publishers, Boston, MA, 1994.
MATH Google Scholar
M. D. Ercegovac and T. Lang. Digital Arithmetic. Morgan Kaufmann Publishers, San Francisco, CA, 2004.
Google Scholar
M. D. Ercegovac and J.-M. Muller. Complex division with prescaling of the operands. In 14th IEEE Conference on Application-Specific Systems, Architectures and Processors (ASAP’2003), pages 304–314, June 2003.
Google Scholar
M. A. Erle and M. J. Schulte. Decimal multiplication via carry-save addition. In Application-specific Systems, Architectures and Processors, pages 348–355, 2003.
Google Scholar
M. A. Erle, M. J. Schulte, and B. J. Hickmann. Decimal floating-point multiplication via carry-save addition. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 46–55, June 2007.
Google Scholar
M. A. Erle, M. J. Schulte, and J. M. Linebarger. Potential speedup using decimal floating-point hardware. In 36th Asilomar Conference on Signals, Systems, and Computers, volume 2, pages 1073–1077, November 2002.
Google Scholar
M. A. Erle, E. M. Schwarz, and M. J. Schulte. Decimal multiplication with efficient partial product generation. In 17th IEEE Symposium on Computer Arithmetic (ARITH-17), 2005.
Google Scholar
G. Even and W. J. Paul. On the design of IEEE compliant floating-point units. IEEE Transactions on Computers, 49(5):398–413, 2000.
Article MathSciNet Google Scholar
G. Even and P.-M. Seidel. A comparison of three rounding algorithms for IEEE floating-point multiplication. IEEE Transactions on Computers, 49(7):638–650, 2000.
Article Google Scholar
H. A. H. Fahmy, A. A. Liddicoat, and M. J. Flynn. Improving the effectiveness of floating point arithmetic. In 35th Asilomar Conference on Signals, Systems, and Computers, volume 1, pages 875–879, November 2001.
Google Scholar
G. Gerwig, H. Wetter, E. M. Schwarz, J. Haess, C. A. Krygowski, B. M. Fleischer, and M. Kroener. The IBM eServer z990 floating-point unit. IBM Journal of Research and Development, 48(3.4):311–322, 2004.
Article Google Scholar
A. Guntoro and M. Glesner. High-performance FPGA-based floating-point adder with three inputs. In Field Programmable Logic and Applications, pages 627–630, 2008.
Google Scholar
O. Gustafsson, A. G. Dempster, K. Johansson, and M. D. Macleod. Simplified design of constant coefficient multipliers. Circuits, Systems, and Signal Processing, 25(2):225–251, 2006.
Article MathSciNet Google Scholar
C. He, G. Qin, M. Lu, and W. Zhao. Group-alignment based accurate floating-point summation on FPGAs. In Engineering of Reconfigurable Systems and Algorithms, pages 136–142, 2006.
Google Scholar
E. Hokenek, R. K. Montoye, and P. W. Cook. Second-generation RISC floating point with multiply-add fused. IEEE Journal of Solid-State Circuits, 25(5):1207–1213, 1990.
Article Google Scholar
M. S. Hrishikesh, D. Burger, N. P. Jouppi, S. W. Keckler, K. I. Farkas, and P. Shivakumar. The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays. In 29th Annual International Symposium on Computer Architecture (ISCA), pages 14–24, 2002.
Google Scholar
S.-F. Hsiao, P.-H. Wu, C.-S. Wen, and P. K. Meher. Table size reduction methods for faithfully rounded lookup-table-based multiplierless function evaluation. Transactions on Circuits and Systems II, 62(5):466–470, 2015.
Article Google Scholar
G. Inoue. Leading one anticipator and floating point addition/subtraction apparatus, August 30 1994. US Patent 5,343,413.
Google Scholar
K. Johansson, O. Gustafsson, and L. Wanhammar. A detailed complexity model for multiple constant multiplication and an algorithm to minimize the complexity. In Circuit Theory and Design, pages 465–468, 2005.
Google Scholar
E. Kadric, P. Gurniak, and A. DeHon. Accurate parallel floating-point accumulation. In 21th IEEE Symposium on Computer Arithmetic (ARITH-21), pages 153–162, April 2013.
Google Scholar
A. Knöfel. Fast hardware units for the computation of accurate dot products. In 10th IEEE Symposium on Computer Arithmetic (ARITH-10), pages 70–74, June 1991.
Google Scholar
S. Knowles. A family of adders. In 14th IEEE Symposium on Computer Arithmetic (ARITH-14), pages 30–34, April 1999.
Google Scholar
J. Koenig, D. Biancolin, J. Bachrach, and K. Asanovic. A hardware accelerator for computing an exact dot product. In 24th IEEE Symposium on Computer Arithmetic (ARITH-24), July 2017.
Google Scholar
P. M. Kogge and H. S. Stone. A parallel algorithm for the efficient solution of a general class of recurrence equations. IEEE Transactions on Computers, 100(8):786–793, 1973.
Article MathSciNet Google Scholar
I. Koren. Computer Arithmetic Algorithms. Prentice-Hall, Englewood Cliffs, NJ, 1993.
MATH Google Scholar
P. Kornerup and J.-M. Muller. Choosing starting values for certain Newton–Raphson iterations. Theoretical Computer Science, 351(1):101–110, 2006.
Article MathSciNet Google Scholar
U. W. Kulisch. Circuitry for generating scalar products and sums of floating-point numbers with maximum accuracy. United States Patent 4622650, 1986.
Google Scholar
U. W. Kulisch. Advanced Arithmetic for the Digital Computer: Design of Arithmetic Units. Springer-Verlag, Berlin, 2002.
Book Google Scholar
U. W. Kulisch. Computer Arithmetic and Validity: Theory, Implementation, and Applications. de Gruyter, Berlin, 2008.
Book Google Scholar
M. Kumm, O. Gustafsson, M. Garrido, and P. Zipf. Optimal single constant multiplication using ternary adders. IEEE Transactions on Circuits and Systems II: Express Briefs, 2016.
Google Scholar
M. Kumm and P. Zipf. Pipelined compressor tree optimization using integer linear programming. In Field Programmable Logic and Applications, 2014.
Google Scholar
T. Lang and J. D. Bruguera. Floating-point multiply-add-fused with reduced latency. IEEE Transactions on Computers, 53(8):988–1003, 2004.
Article Google Scholar
T. Lang and A. Nannarelli. A radix-10 combinational multiplier. In 40th Asilomar Conference on Signals, Systems, and Computers, pages 313–317, October/November 2006.
Google Scholar
M. Langhammer and B. Pasca. Faithful single-precision floating-point tangent for FPGAs. In ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pages 39–42, 2013.
Google Scholar
M. Langhammer and B. Pasca. Floating-point DSP block architecture for FPGAs. In ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pages 117–125, 2015.
Google Scholar
M. Langhammer and B. Pasca. Single precision logarithm and exponential architectures for hard floating-point enabled FPGAs. IEEE Transactions on Computers, 66(12):2031–2043, 2017.
Article MathSciNet Google Scholar
B. Lee and N. Burgess. Parameterisable floating-point operations on FPGA. In 36th Asilomar Conference on Signals, Systems, and Computers, volume 2, pages 1064–1068, November 2002.
Google Scholar
V. Lefèvre. Multiplication by an integer constant. Technical Report RR1999-06, Laboratoire de l’Informatique du Parallélisme, Lyon, France, 1999.
Google Scholar
Y. Li and W. Chu. Implementation of single precision floating-point square root on FPGAs. In FPGAs for Custom Computing Machines, pages 56–65, 1997.
Google Scholar
G. Lienhart, A. Kugel, and R. Männer. Using floating-point arithmetic on FPGAs to accelerate scientific N-body simulations. In FPGAs for Custom Computing Machines, 2002.
Google Scholar
W. B. Ligon, S. McMillan, G. Monn, K. Schoonover, F. Stivers, and K. D. Underwood. A re-evaluation of the practicality of floating-point operations on FPGAs. In FPGAs for Custom Computing Machines, 1998.
Google Scholar
J. Liu, M. Chang, and C.-K. Cheng. An iterative division algorithm for FPGAs. In Field-Programmable Gate Arrays, pages 83–89, 2006.
Google Scholar
A. R. Lopes and G. A. Constantinides. A fused hybrid floating-point and fixed-point dot-product for FPGAs. In 6th International Symposium on Reconfigurable Computing: Architectures, Tools and Applications (ARC), volume 5992 of Lecture Notes in Computer Science, pages 157–168, Bangkok, Thailand, March 2010.
Google Scholar
Z. Luo and M. Martonosi. Accelerated pipelined integer and floating-point accumulations in configurable hardware with delayed addition techniques. IEEE Transactions on Computers, 49(3):208–218, 2000.
Article Google Scholar
D. Lutz. Fused multiply-add microarchitecture comprising separate early-normalizing multiply and add pipelines. In 20th IEEE Symposium on Computer Arithmetic (ARITH-20), pages 123–128, 2011.
Google Scholar
M. V. Manoukian and G. A. Constantinides. Accurate floating point arithmetic through hardware error-free transformations. In Reconfigurable Computing: Architectures, Tools and Applications (ARC), pages 94–101, 2011.
Google Scholar
J. H. Min and E. E. Swartzlander. Fused floating-point two-term sum-of-squares unit. In Application-Specific Systems, Architectures and Processors (ASAP), 2013.
Google Scholar
R. K. Montoye, E. Hokonek, and S. L. Runyan. Design of the IBM RISC System/6000 floating-point execution unit. IBM Journal of Research and Development, 34(1):59–70, 1990.
Article Google Scholar
S. K. Moore. Intel makes a big jump in computer math. IEEE Spectrum, 2008.
Google Scholar
J.-M. Muller. A few results on table-based methods. Reliable Computing, 5(3):279–288, 1999.
Article MathSciNet Google Scholar
M. Müller, C. Rüb, and W. Rülling. Exact accumulation of floating-point numbers. In 10th IEEE Symposium on Computer Arithmetic (ARITH-10), pages 64–69, June 1991.
Google Scholar
A. Munk-Nielsen and J.-M. Muller. Borrow-save adders for real and complex number systems. In Real Numbers and Computers 2, April 1996.
Google Scholar
A. Naini, A. Dhablania, W. James, and D. Das Sarma. 1-GHz HAL SPARC64 dual floating-point unit with RAS features. In 15th IEEE Symposium on Computer Arithmetic (ARITH-15), pages 174–183, June 2001.
Google Scholar
R. Nathan, B. Anthonio, S.-L. Lu, H. Naeimi, D. J. Sorin, and X. Sun. Recycled error bits: Energy-efficient architectural support for floating point accuracy. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC ‘14), pages 117–127, 2014.
Google Scholar
H. Neto and M. Véstias. Decimal multiplier on FPGA using embedded binary multipliers. In Field Programmable Logic and Applications, pages 197–202, 2008.
Google Scholar
K. Ng. Method and apparatus for exact leading zero prediction for a floating-point adder, April 20 1993. US Patent 5,204,825.
Google Scholar
H. D. Nguyen, B. Pasca, and T. Preusser. FPGA-specific arithmetic optimizations of short-latency adders. In Field Programmable Logic and Applications, pages 232–237, 2011.
Google Scholar
K. R. Nichols, M. A. Moussa, and S. M. Areibi. Feasibility of floating-point arithmetic in FPGA based artificial neural networks. In Computer Applications in Industry and Engineering (CAINE), pages 8–13, 2002.
Google Scholar
S. F. Oberman, H. Al-Twaijry, and M. J. Flynn. The SNAP project: design of floating-point arithmetic units. In 13th Symposium on Computer Arithmetic (ARITH-13), 1997.
Google Scholar
V. G. Oklobdzija. An algorithmic and novel design of a leading zero detector circuit: Comparison with logic synthesis. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2, 1994.
Article Google Scholar
F. Ortiz, J. Humphrey, J. Durbano, and D. Prather. A study on the design of floating-point functions in FPGAs. In Field Programmable Logic and Applications, volume 2778 of Lecture Notes in Computer Science, pages 1131–1135, Lisbon, Portugal, September 2003.
Chapter Google Scholar
A. Paidimarri, A. Cevrero, P. Brisk, and P. Ienne. Fpga implementation of a single-precision floating-point multiply-accumulator with single-cycle accumulation. In 17th IEEE Symposium on Field Programmable Custom Computing Machines, 2009.
Google Scholar
B. Parhami. On the complexity of table lookup for iterative division. IEEE Transactions on Computers, C-36(10):1233–1236, 1987.
Article MathSciNet Google Scholar
B. Parhami. Computer Arithmetic: Algorithms and Hardware Designs. Oxford University Press, 2000.
Google Scholar
B. Pasca. Correctly rounded floating-point division for DSP-enabled FPGAs. In Field Programmable Logic and Applications, 2012.
Google Scholar
D. Patil, O. Azizi, M. Horowitz, R. Ho, and R. Ananthraman. Robust energy-efficient adder topologies. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 29–37, June 2007.
Google Scholar
R. V. K. Pillai, D. Al-Khalili, and A. J. Al-Khalili. A low power approach to floating point adder design. In International Conference on Computer Design, 1997.
Google Scholar
J. A. Pineiro and J. D. Bruguera. High-speed double-precision computation of reciprocal, division, square root, and inverse square root. IEEE Transactions on Computers, 51(12):1377–1388, 2002.
Article MathSciNet Google Scholar
E. Quinnell, E. E. Swartzlander, and C. Lemonds. Floating-point fused multiply-add architectures. In 41st Asilomar Conference on Signals, Systems, and Computers, pages 331–337, November 2007.
Google Scholar
J. Ramos and A. Bohorquez. Two operand binary adders with threshold logic. IEEE Transactions on Computers, 48(12):1324–1337, 1999.
Article MathSciNet Google Scholar
J. E. Robertson. A new class of digital division methods. IRE Transactions on Electronic Computers, EC-7:218–222, 1958. Reprinted in [583].
Article Google Scholar
E. Roesler and B. Nelson. Novel optimizations for hardware floating-point units in a modern FPGA architecture. In Field Programmable Logic and Applications, volume 2438 of Lecture Notes in Computer Science, pages 637–646, 2002.
MATH Google Scholar
D. M. Russinoff. A case study in formal verification of register-transfer logic with ACL2: The floating point adder of the AMD Athlon processor. Lecture Notes in Computer Science, 1954:3–36, 2000.
Google Scholar
H. H. Saleh and E. E. Swartzlander. A floating-point fused dot-product unit. In International Conference on Computer Design (ICCD), pages 426–431, 2008.
Google Scholar
M. M. Schmookler and K. J. Nowka. Leading zero anticipation and detection – a comparison of methods. In 15th IEEE Symposium on Computer Arithmetic (ARITH-15), pages 7–12, June 2001.
Google Scholar
E. M. Schwarz, M. Schmookler, and S. D. Trong. FPU implementations with denormalized numbers. IEEE Transactions on Computers, 54(7):825–836, 2005.
Article Google Scholar
P.-M. Seidel. Multiple path IEEE floating-point fused multiply-add. In 46th International Midwest Symposium on Circuits and Systems, pages 1359–1362, 2003.
Google Scholar
P.-M. Seidel and G. Even. How many logic levels does floating-point addition require. In International Conference on Computer Design, pages 142–149, 1998.
Google Scholar
P.-M. Seidel and G. Even. On the design of fast IEEE floating-point adders. In 15th IEEE Symposium on Computer Arithmetic (ARITH-15), pages 184–194, June 2001.
Google Scholar
A. M. Shams and M. A. Bayoumi. A novel high-performance CMOS 1-bit full-adder cell. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 47(5), 2000.
Article Google Scholar
N. Shirazi, A. Walters, and P. Athanas. Quantitative analysis of floating point arithmetic on FPGA based custom computing machine. In FPGAs for Custom Computing Machines, pages 155–162, 1995.
Google Scholar
D. P. Singh, B. Pasca, and T. S. Czajkowski. High-level design tools for floating point FPGAs. In ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA, February 22–24, 2015, pages 9–12, 2015.
Google Scholar
E. Sprangle and D. Carmean. Increasing processor performance by implementing deeper pipelines. In 29th Annual International Symposium on Computer Architecture (ISCA), pages 25–34, 2002.
Google Scholar
J. E. Stine and M. J. Schulte. The symmetric table addition method for accurate function approximation. Journal of VLSI Signal Processing, 21:167–177, 1999.
Article Google Scholar
D. A. Sunderland, R. A. Strauch, S. W. Wharfield, H. T. Peterson, and C. R. Cole. CMOS/SOS frequency synthesizer LSI circuit for spread spectrum communications. IEEE Journal of Solid State Circuits, SC-19(4):497–506, 1984.
Article Google Scholar
A. Svoboda. Adder with distributed control. IEEE Transactions on Computers, C-19(8), 1970. Reprinted in [583].
Google Scholar
N. Takagi. A hardware algorithm for computing the reciprocal square root. In 15th IEEE Symposium on Computer Arithmetic (ARITH-15), pages 94–100, June 2001.
Google Scholar
N. Takagi and S. Kuwahara. A VLSI algorithm for computing the Euclidean norm of a 3D vector. IEEE Transactions on Computers, 49(10):1074–1082, 2000.
Article MathSciNet Google Scholar
Y. Tao, G. Deyuan, R. Xianglong, H. Limin, F. Xiaoya, and Y. Lei. A novel floating-point function unit combining MAF and 3-input adder. In Signal Processing, Communication and Computing (ICSPCC), 2012.
Google Scholar
Y. Tao, G. Deyuan, and F. Xiaoya. A multi-path fused add-subtract unit for digital signal processing. In Computer Science and Automation Engineering (CSAE), 2012.
Google Scholar
Y. Tao, G. Deyuan, F. Xiaoya, and J. Nurmi. Correctly rounded architectures for floating-point multi-operand addition and dot-product computation. In Application-Specific Systems, Architectures and Processors (ASAP), 2013.
Google Scholar
Y. Tao, G. Deyuan, F. Xiaoya, and R. Xianglong. Three-operand floating-point adder. In 12th International Conference on Computer and Information Technology, pages 192–196, 2012.
Google Scholar
M. B. Taylor. Is dark silicon useful? Harnessing the four horsemen of the coming dark silicon apocalypse. In 49th Design Automation Conference, pages 1131–1136, 2012.
Google Scholar
A. F. Tenca. Multi-operand floating-point addition. In 19th IEEE Symposium on Computer Arithmetic (ARITH-19), pages 161–168, 2009.
Google Scholar
T. Teufel and M. Baesler. FPGA implementation of a decimal floating-point accurate scalar product unit with a parallel fixed-point multiplier. In Reconfigurable Computing and FPGAs, pages 6–11, 2009.
Google Scholar
D. B. Thomas. A general-purpose method for faithfully rounded floating-point function approximation in FPGAs. In 22nd IEEE Symposium on Computer Arithmetic (ARITH-22), pages 42–49, 2015.
Google Scholar
K. D. Tocher. Techniques of multiplication and division for automatic binary computers. Quarterly Journal of Mechanics and Applied Mathematics, 11(3):364–384, 1958.
Article MathSciNet Google Scholar
W. J. Townsend, E. E. Swartzlander, Jr., and J. A. Abraham. A comparison of Dadda and Wallace multiplier delays. In SPIE’s 48th Annual Meeting on Optical Science and Technology, pages 552–560, 2003.
Google Scholar
S. D. Trong, M. Schmookler, E. M. Schwarz, and M. Kroener. P6 binary floating-point unit. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 77–86, 2007.
Google Scholar
A. Tyagi. A reduced-area scheme for carry-select adders. IEEE Transactions on Computers, 42(10):1163–1170, 1993.
Article Google Scholar
H. F. Ugurdag, A. Bayram, V. E. Levent, and S. Gören. Efficient combinational circuits for division by small integer constants. In 23rd IEEE Symposium on Computer Arithmetic (ARITH-23), pages 1–7, July 2016.
Google Scholar
A. Vázquez. High-Performance Decimal Floating-Point Units. Ph.D. thesis, Universidade de Santiago de Compostela, 2009.
Google Scholar
A. Vázquez, E. Antelo, and P. Montuschi. A new family of high performance parallel decimal multipliers. In 18th IEEE Symposium on Computer Arithmetic (ARITH-18), pages 195–204, 2007.
Google Scholar
D. Villeger and V. G. Oklobdzija. Evaluation of Booth encoding techniques for parallel multiplier implementation. Electronics Letters, 29(23):2016–2017, 1993.
Y. Voronenko and M. Püschel. Multiplierless multiple constant multiplication. ACM Trans. Algorithms, 3(2), 2007.
C. S. Wallace. A suggestion for a fast multiplier. IEEE Transactions on Electronic Computers, pages 14–17, 1964. Reprinted in [583].
L.-K. Wang and M. J. Schulte. Decimal floating-point division using Newton–Raphson iteration. In Application-Specific Systems, Architectures and Processors, pages 84–95, 2004.
L.-K. Wang, M. J. Schulte, J. D. Thompson, and N. Jairam. Hardware designs for decimal floating-point addition and related operations. IEEE Transactions on Computers, 58(2):322–335, 2009.
X. Wang, S. Braganza, and M. Leeser. Advanced components in the variable precision floating-point library. In Field-Programmable Custom Computing Machines, pages 249–258, 2006.
M. J. Wirthlin. Constant coefficient multiplication using look-up tables. VLSI Signal Processing, 36(1):7–15, 2004.
S. Xing and W. Yu. FPGA adders: Performance evaluation and optimal design. IEEE Design & Test of Computers, 15:24–29, 1998.
X. Y. Yu, Y.-H. Chan, B. Curran, E. Schwarz, M. Kelly, and B. Fleischer. A 5GHz+ 128-bit binary floating-point adder for the POWER6 processor. In European Solid-State Circuits Conference, pages 166–169, 2006.
N. Zhuang and H. Wu. A new design of the CMOS full adder. IEEE Journal on Solid-State Circuits, 27:840–844, 1992.
L. Zhuo and V. K. Prasanna. Scalable and modular algorithms for floating-point matrix multiplication on FPGAs. In 18th International Parallel and Distributed Processing Symposium (IPDPS), April 2004.
L. Zhuo and V. K. Prasanna. High performance linear algebra operations on reconfigurable systems. In Supercomputing, 2005.
R. Zimmermann. Binary Adder Architectures for Cell-Based VLSI and Their Synthesis. Ph.D. thesis, Swiss Federal Institute of Technology, Zurich, 1997.
V. Zyuban, D. Brooks, V. Srinivasan, M. Gschwind, P. Bose, P. N. Strenski, and P. G. Emma. Integrated analysis of power and performance for pipelined microprocessors. IEEE Transactions on Computers, 53(8):1004–1016, 2004.

Author information

Authors and Affiliations

CNRS - LIP, Lyon, France
Jean-Michel Muller
Kalray, Grenoble, France
Nicolas Brunie
INSA-Lyon - CITI, Villeurbanne, France
Florent de Dinechin
Inria - LIP, Lyon, France
Claude-Pierre Jeannerod, Vincent Lefèvre & Nathalie Revol
CNRS - LAAS, Toulouse, France
Mioara Joldes
Inria - LRI, Orsay, France
Guillaume Melquiond
ENS-Lyon - LIP, Lyon, France
Serge Torres

About this chapter

Cite this chapter

Muller, JM. et al. (2018). Hardware Implementation of Floating-Point Arithmetic. In: Handbook of Floating-Point Arithmetic. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-76526-6_8

DOI: https://doi.org/10.1007/978-3-319-76526-6_8
Published: 03 May 2018
Publisher Name: Birkhäuser, Cham
Print ISBN: 978-3-319-76525-9
Online ISBN: 978-3-319-76526-6
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
