Hardware Implementation of Floating-Point Arithmetic

Abstract

Chapter 7 has shown that operations on floating-point numbers are naturally expressed in terms of integer or fixed-point operations on the significand and the exponent. For instance, to obtain the product of two floating-point numbers, one basically multiplies the significands and adds the exponents. However, obtaining the correct rounding of the result may require considerable design effort and the use of nonarithmetic primitives such as leading-zero counters and shifters. This chapter details the implementation of these algorithms in hardware, using digital logic.
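As an illustration of the multiply-significands, add-exponents idea, here is a minimal software sketch of a floating-point product with round-to-nearest-even. It is not the chapter's hardware design: the toy format (a triple `(sign, exponent, significand)` with a `P`-bit integer significand), the names `fp_mul` and `value`, and the choice `P = 8` are all assumptions made for this example, and zeros, subnormals, infinities, and exponent overflow are deliberately ignored. Python's `int.bit_length` stands in for the leading-zero counter and the `>>` operator for the shifter.

```python
from fractions import Fraction

P = 8  # hypothetical precision: significands are P-bit integers with the top bit set


def value(x):
    """Exact value of a toy float (sign, exponent, significand) as a Fraction."""
    sign, exp, sig = x
    return (-1) ** sign * sig * Fraction(2) ** exp


def fp_mul(a, b, p=P):
    """Multiply two toy floats and round the result to nearest, ties to even.

    Each operand is (sign, exponent, significand) with
    2**(p-1) <= significand < 2**p. Special values are not handled.
    """
    sa, ea, ma = a
    sb, eb, mb = b

    sign = sa ^ sb      # sign of the product
    exp = ea + eb       # exponents add
    prod = ma * mb      # exact integer product, 2p-1 or 2p bits wide

    # Normalization: find how many low-order bits must be dropped so that
    # p bits remain (the role of a leading-zero counter plus a shifter).
    shift = prod.bit_length() - p
    kept = prod >> shift
    rem = prod & ((1 << shift) - 1)   # discarded bits
    half = 1 << (shift - 1)           # weight of the first discarded bit

    # Round to nearest, ties to even.
    if rem > half or (rem == half and kept & 1):
        kept += 1
        if kept == 1 << p:            # rounding carried out of the significand
            kept >>= 1
            shift += 1

    return (sign, exp + shift, kept)


if __name__ == "__main__":
    x = (0, -3, 181)
    y = (0, 2, 181)
    z = fp_mul(x, y)
    print(z, float(value(x) * value(y)), float(value(z)))
```

Even in this small sketch, most of the lines deal with normalization and rounding rather than with the multiplication itself, which is the point the abstract makes about design effort.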

Notes

  1. http://www.opencores.org/.

  2. http://www.coe.neu.edu/Research/rcl/projects/floatingpoint/index.html.

  3. http://flopoco.gforge.inria.fr/.


    Google Scholar 

  163. D. Villeger and V. G. Oklobdzija. Evaluation of Booth encoding techniques for parallel multiplier implementation. Electronics Letters, 29(23):2016–2017, 1993.

    Article  Google Scholar 

  164. Y. Voronenko and M. Püschel. Multiplierless multiple constant multiplication. ACM Trans. Algorithms, 3(2), 2007.

    Article  MathSciNet  Google Scholar 

  165. C. S. Wallace. A suggestion for a fast multiplier. IEEE Transactions on Electronic Computers, pages 14–17, 1964. Reprinted in [583].

    Google Scholar 

  166. L.-K. Wang and M. J. Schulte. Decimal floating-point division using Newton–Raphson iteration. In Application-Specific Systems, Architectures and Processors, pages 84–95, 2004.

    Google Scholar 

  167. L.-K. Wang, M. J. Schulte, J. D. Thompson, and N. Jairam. Hardware designs for decimal floating-point addition and related operations. IEEE Transactions on Computers, 58(2):322–335, 2009.

    Article  MathSciNet  Google Scholar 

  168. X. Wang, S. Braganza, and M. Leeser. Advanced components in the variable precision floating-point library. In Field-Programmable Custom Computing Machines, pages 249–258, 2006.

    Google Scholar 

  169. M. J. Wirthlin. Constant coefficient multiplication using look-up tables. VLSI Signal Processing, 36(1):7–15, 2004.

    Article  Google Scholar 

  170. S. Xing and W. Yu. FPGA adders: Performance evaluation and optimal design. IEEE Design & Test of Computers, 15:24–29, 1998.

    Article  Google Scholar 

  171. X. Y. Yu, Y.-H. Chan, B. Curran, E. Schwarz, M. Kelly, and B. Fleischer. A 5GHz+ 128-bit binary floating-point adder for the POWER6 processor. In European Solid-State Circuits Conference, pages 166–169, 2006.

    Google Scholar 

  172. N. Zhuang and H. Wu. A new design of the CMOS full adder. IEEE Journal on Solid-State Circuits, 27:840–844, 1992.

    Article  Google Scholar 

  173. L. Zhuo and V. K. Prasanna. Scalable and modular algorithms for floating-point matrix multiplication on FPGAs. In 18th International Parallel and Distributed Processing Symposium (IPDPS), April 2004.

    Google Scholar 

  174. L. Zhuo and V. K. Prasanna. High performance linear algebra operations on reconfigurable systems. In Supercomputing, 2005.

    Google Scholar 

  175. R. Zimmermann. Binary Adder Architectures for Cell-Based VLSI and Their Synthesis. Ph.D. thesis, Swiss Federal Institute of Technology, Zurich, 1997.

    Google Scholar 

  176. V. Zyuban, D. Brooks, V. Srinivasan, M. Gschwind, P. Bose, P. N. Strenski, and P. G. Emma. Integrated analysis of power and performance for pipelined microprocessors. IEEE Transactions on Computers, 53(8):1004–1016, 2004.

    Article  Google Scholar 

Download references

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Muller, JM. et al. (2018). Hardware Implementation of Floating-Point Arithmetic. In: Handbook of Floating-Point Arithmetic. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-76526-6_8
