Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Lecture 9 - Floating Point Fall 2004 Reading: HW Due Friday 10/1: 3.7,3.9, 3.10, 3.14 EXAM 1: Monday 10/4 Why did the Ariane 5 Explode? (image source: java.sun.com ) Portions of these slides are derived from: Textbook figures © 1998 Morgan Kaufmann Publishers all rights reserved Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers all rights reserved Dave Patterson’s CS 152 Slides - Fall 1997 © UCB Rob Rutenbar’s Slides - Fall 1999 CMU other sources as noted
ECE 313 Fall 2004Lecture 9 - Floating Point2 Outline - Floating Point Motivation and Key Ideas IEEE 754 Floating Point Format Range and precision Floating Point Arithmetic MIPS Floating Point Instructions Rounding & Errors Summary
ECE 313 Fall 2004Lecture 9 - Floating Point3 Floating Point - Motivation Review: n-bit integer representations Unsigned: 0 to 2 n -1 Signed Two’s Complement: - 2 n-1 to 2 n-1 -1 Biased (excess-b):-b to 2 n -b Problem: how do we represent: Very large numbers9,345,524,282,135,672, Very small numbers , Rational numbers2/3 Irrational numberssqrt(2) Transcendental numberse, π
ECE 313 Fall 2004Lecture 9 - Floating Point4 Fixed Point Representation Idea: fixed-point numbers with fractions Decimal point (binary point) marks start of fraction Decimal: = 1 X X X X Binary: = 1 X X X 2 -7 Problems Limited locations for “decimal point” (binary point”) Won’t work for very small or very larger numbers
ECE 313 Fall 2004Lecture 9 - Floating Point5 Another Approach: Scientific Notation Represent a number as a combination of Mantissa (significand): Normalized number AND Exponent (base 10) Example:6.02 X Significand (mantissa) Radix (base) Exponent
ECE 313 Fall 2004Lecture 9 - Floating Point6 Floating Point Key idea: adapt scientific notation to binary Fixed-width binary number for significand Fixed-width binary number for exponent (base 2) Idea: represent a number as 1.xxxxxxx two X 2 yyyy Significand (mantissa) Radix (2) Exponent Leading ‘1’ (Implicit) Important Points: This is a tradeoff between precision and range Arithmetic is approximate - error is inevitable!
ECE 313 Fall 2004Lecture 9 - Floating Point7 Outline - Floating Point Motivation and Key Ideas IEEE 754 Floating Point Format Range and precision Floating Point Arithmetic MIPS Floating Point Instructions Rounding & Errors Summary
ECE 313 Fall 2004Lecture 9 - Floating Point8 IEEE 754 Floating Point Single precision (C/C++/Java float type) Value N = (-1) S X 1.F X 2 E-127 Double precision (C/C++/Java double type) Value N = (-1) S X 1.F X 2 E-1023 Bias
ECE 313 Fall 2004Lecture 9 - Floating Point9 Floating Point Examples 8.75 ten = 1 X X X 2 -2 = X 2 3 Single Precision: Significand: …. (note leading 1 is implied) Exponent: = 130 = two Double Precision: Significand: … Exponent: = 1026 = two
ECE 313 Fall 2004Lecture 9 - Floating Point10 Floating Point Examples ten = 1 X X 2 -3 = 1. 1 X 2 -2 Single Precision: Significand: …. Exponent: = 125 = two Double Precision: Significand: … Exponent: = 1021 = two
ECE 313 Fall 2004Lecture 9 - Floating Point11 Floating Point Examples Q: What is the value of the following single- precision word? Significand = Exponent = = -119 Final Result = ( ) X = 2.36 X
ECE 313 Fall 2004Lecture 9 - Floating Point12 Special Values in IEEE Floating Point exponent - reserved for zero value (all bits zero) “Denormalized numbers” - drop the “1.” Used for “very small” numbers … “gradual underflow” Smallest denormalized number (single precision): X = exponent Infinity exponent, zero significand NaN (Not a Number) exponent, nonzero significand
ECE 313 Fall 2004Lecture 9 - Floating Point13 Outline - Floating Point Motivation and Key Ideas IEEE 754 Floating Point Format Range and precision Floating Point Arithmetic MIPS Floating Point Instructions Rounding & Errors Summary
ECE 313 Fall 2004Lecture 9 - Floating Point14 Floating Point Range and Precision The tradeoff: range in exchange for uniformity “Tiny” example: floating point with: 3 exponent bits 2 signficand bits –– –10– + DenormalizedNormalizedInfinity –1–0.8–0.6–0.4– DenormalizedNormalizedInfinity +0–0 Graphic and Example Source: R. Bryant and D. O’Halloran, Computer Systems: A Programmer’s Perspective, © Prentice Hall, 2002 s expS
ECE 313 Fall 2004Lecture 9 - Floating Point15 Visualizing Floating Point - “Small” FP Representation 8-bit Floating Point Representation the sign bit is in the most significant bit. the next four bits are the exponent, with a bias of 7. the last three bits are the frac Same General Form as IEEE Format normalized, denormalized representation of 0, NaN, infinity) s expsignificand Example Source: R. Bryant and D. O’Halloran, Computer Systems: A Programmer’s Perspective, © Prentice Hall, 2002
ECE 313 Fall 2004Lecture 9 - Floating Point16 Small FP - Values Related to Exponent ExpexpE2 E /64(denorms) / / / / / / n/a(inf, Nan).
ECE 313 Fall 2004Lecture 9 - Floating Point17 Small FP Example - Dynamic Range s exp frac EValue /8*1/64 = 1/ /8*1/64 = 2/512 … /8*1/64 = 6/ /8*1/64 = 7/ /8*1/64 = 8/ /8*1/64 = 9/512 … /8*1/2 = 14/ /8*1/2 = 15/ /8*1 = /8*1 = 9/ /8*1 = 10/8 … /8*128 = /8*128 = n/ainf closest to zero largest denorm smallest norm closest to 1 below closest to 1 above largest norm Denormalized numbers Normalized numbers
ECE 313 Fall 2004Lecture 9 - Floating Point18 Learning from Tiny & Small FP Non-uniform spacing of numbers very small spacing for large negative exponents very large spacing for large positive exponents Exact representation: sums of powers of 2 Approximate representation: everything else
ECE 313 Fall 2004Lecture 9 - Floating Point19 Summary: IEEE Floating Point Values Source: book p. 301
ECE 313 Fall 2004Lecture 9 - Floating Point20 IEEE Floating Point - Interesting Numbers Description expfrac Numeric Value Zero00…0000…000.0 Smallest Pos. Denorm.00…0000…012 – {23,52} X 2 – {126,1022} Single 1.4 X 10 –45 Double 4.9 X 10 –324 Largest Denormalized00…0011…11(1.0 – ) X 2 – {126,1022} Single 1.18 X 10 –38 Double 2.2 X 10 –308 Smallest Pos. Normalized00…0100…001.0 X 2 – {126,1022} Just larger than largest denormalized One01…1100…001.0 Largest Normalized11…1011…11(2.0 – ) X 2 {127,1023} Single 3.4 X Double 1.8 X
ECE 313 Fall 2004Lecture 9 - Floating Point21 Outline - Floating Point Motivation and Key Ideas IEEE 754 Floating Point Format Range and precision Floating Point Arithmetic MIPS Floating Point Instructions Rounding & Errors Summary
ECE 313 Fall 2004Lecture 9 - Floating Point22 Floating Point Addition (Fig. 3.16) 1.Align binary point to number with larger exponent 2.Add significands 3.Normalize result and adjust exponent 4. If overflow/underflow throw exception 5.Round result (go to 3 if normalization needed again) A 1.11 X X B X X X 2 0 (Normalize) 1.00 X Hardware - Fig. 3.17, p. 201
ECE 313 Fall 2004Lecture 9 - Floating Point23 Floating Point Multiplication (Fig. 3.18) 1.Add 2 exponents together to get new exponent (subtract 127 to get proper biased value) 2.Multiply significands 3.Normalize result if necessary (shift right) & adjust exponent 4. If overflow/underflow throw exception 5.Round result (go to 3 if normalization needed again) 6.Set sign of result using sign of X, Y
ECE 313 Fall 2004Lecture 9 - Floating Point24 Outline - Floating Point Motivation and Key Ideas IEEE 754 Floating Point Format Range and precision Floating Point Arithmetic MIPS Floating Point Instructions Rounding & Errors Summary
ECE 313 Fall 2004Lecture 9 - Floating Point25 MIPS Floating Point Instructions Organized as a coprocessor Separate registers $f0-$f31 Separate operations Separate data transfer (to same memory) Basic operations add.s - single add.d - double sub.s - single sub.d - double mul.s - single mul.d - double div.s - single div.d - double
ECE 313 Fall 2004Lecture 9 - Floating Point26 MIPS Floating Point Instructions (cont’d) Data transfer lwc1, swcl (l.s, s.s) - load/store float to fp reg l.d, s.d - load/store double to fp reg pair Testing / branching c.lt.s, c.lt.d, c.eq.s, c.eq.d, … compare and set condition bit if true bclt - branch if condition true bclf - branch if condition false
ECE 313 Fall 2004Lecture 9 - Floating Point27 Outline - Floating Point Motivation and Key Ideas IEEE 754 Floating Point Format Range and precision Floating Point Arithmetic MIPS Floating Point Instructions Rounding & Errors Summary
ECE 313 Fall 2004Lecture 9 - Floating Point28 Rounding Extra bits allow rounding after computation Guard Digit (may shift into number during normalization) Round digit - used to round when guard bit shifted during normalization Sticky bit - used when there are 1’s to the right of the round digit e.g., “ ” (round to nearest even) IEEE 754 supports four rounding modes Always round up Always round down Truncate Round to nearest even (most common)
ECE 313 Fall 2004Lecture 9 - Floating Point29 Limitations on Floating-Point Math Most numbers are approximate Roundoff error is inevitable Range (and accuracy) vary depending on exponent “Normal” math properties not guaranteed: Inverse (1/r)*r ≠ 1 Associative (A+B) + C ≠ A + (B+C) (A*B) * C ≠ A * (B*C) Distributive (A+B) * C ≠ A*B + B*C Scientific calculations require error management take a numerical analysis for more info
ECE 313 Fall 2004Lecture 9 - Floating Point30 IEEE Floating Point - Special Properties Floating Point 0 same as Integer 0 All bits = 0 Can (Almost) Use Unsigned Integer Comparison A > B if: A.EXP > B.EXP or A.EXP=B.EXP and A.SIG > B.SIG But, must first compare sign bits Must consider -0 == 0 NaNs problematic Will be greater than any other values What should comparison yield? This is equivalent to unsigned comparision!
ECE 313 Fall 2004Lecture 9 - Floating Point31 Addendum - Why Did the Ariane 5 Explode? In 1996 Ariane 5 Flight 501 exploded after launch. Estimated cost of accident: $500 million
ECE 313 Fall 2004Lecture 9 - Floating Point32 Addendum - Why Did the Ariane 5 Explode? The cause was traced to the Inertial reference system (SRI). Both the main and backup SRI failed. Both units failed due to an out-of-range conversion Input: double precision floating point Output: 16-bit integer for “horizontal bias” (BH) Careful analysis during software design had indicated that BH would “fit” in 16 bits So, why didn’t it fit?
ECE 313 Fall 2004Lecture 9 - Floating Point33 Addendum - Why did the Ariane 5 Explode? Careful analysis during software design had indicated that BH would “fit” in 16 bits BUT, all analysis had been done for the Ariane 4, the predecessor of Ariane 5 - software was reused Since Ariane 5 was a larger rocket, the values for BH were higher than anticipated AND, there was no handler to deal with the exception! For more information: Or, Google “Ariane 5”
ECE 313 Fall 2004Lecture 9 - Floating Point34 Summary - Chapter 3 Important Topics Signed & Unsigned Numbers (3.2) Addition and Subtraction (3.3) Carry Lookahead (B.6) Constructing an ALU (B.5) Multiplication and Division (3.4, 4.5) Floating Point (3.6) Coming Up: Performance (Chapter 4)