Abstract
Since 1990, many floating-point units have been designed around a fused multiply-add dataflow. This type of design has a large performance advantage over a separate multiplier and adder: because a multiply and its dependent add are issued as one compound operation, the unit effectively completes two dependent operations per cycle. Although the fused multiply-add dataflow is now common in today’s microprocessors, many of its details have never been discussed in published papers. This chapter shows how the different parts of the fused multiply-add dataflow are implemented, including the counter tree, suppression of sign extension encoding, leading zero anticipation, and end-around-carry adder design. It illustrates algorithms and implementation details used in today’s floating-point units that have been passed down from designer to designer, becoming the folklore of floating-point unit design.
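As a rough illustration of the counter-tree idea named in the abstract, the sketch below reduces the partial products of a multiplication, together with the addend, to two terms using 3:2 counters (carry-save adders) before a single carry-propagate addition. It is a simplified model under stated assumptions, not the chapter's design: it works on non-negative Python integers, uses unencoded partial products rather than Booth encoding, and the names `csa`, `reduce_with_counters`, and `multiply_add` are hypothetical.

```python
def csa(a, b, c):
    """3:2 counter applied to whole words: three addends in, (sum, carry) out."""
    s = a ^ b ^ c                                # per-bit sum, no carry propagation
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, weighted by 2
    return s, carry


def reduce_with_counters(addends):
    """Reduce any number of addends to two terms using levels of 3:2 counters."""
    terms = list(addends)
    while len(terms) > 2:
        full = len(terms) - len(terms) % 3       # terms that fit into whole groups of 3
        reduced = []
        for i in range(0, full, 3):
            reduced.extend(csa(terms[i], terms[i + 1], terms[i + 2]))
        reduced.extend(terms[full:])             # leftovers pass to the next level
        terms = reduced
    while len(terms) < 2:                        # pad so the caller always gets two terms
        terms.append(0)
    return terms[0], terms[1]


def multiply_add(a, b, c):
    """Compute a * b + c for non-negative integers with one carry-propagate add."""
    # Unencoded partial products: one shifted copy of a per set bit of b.
    partial_products = [a << i for i in range(b.bit_length()) if (b >> i) & 1]
    # The addend joins the same tree, which is what makes the operation fused.
    s, carry = reduce_with_counters(partial_products + [c])
    return s + carry                             # the single carry-propagate addition


if __name__ == "__main__":
    print(multiply_add(13, 11, 7))
```

The point of the sketch is that the addend enters the same reduction tree as the partial products, so only one carry-propagate addition is needed for the compound operation. Alignment of the addend, rounding, and signed operands are left out here.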
Copyright information
© 2006 Springer
About this chapter
Cite this chapter
Schwarz, E.M. (2006). Binary Floating-Point Unit Design. In: Oklobdzija, V.G., Krishnamurthy, R.K. (eds) High-Performance Energy-Efficient Microprocessor Design. Series on Integrated Circuits and Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-34047-0_8
DOI: https://doi.org/10.1007/978-0-387-34047-0_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-28594-8
Online ISBN: 978-0-387-34047-0
eBook Packages: Engineering, Engineering (R0)