Presentation is loading. Please wait.

Presentation is loading. Please wait.

ECE 313 - Computer Organization Floating Point Feb 2005 Reading: 3.6-3.9 HW Due Monday 3/28: 3.7,3.9, 3.10, 3.14 3.30, 3.35, 3.37, 3.38, 3.40, 3.44 EXAM.

Similar presentations


Presentation on theme: "ECE 313 - Computer Organization Floating Point Feb 2005 Reading: 3.6-3.9 HW Due Monday 3/28: 3.7,3.9, 3.10, 3.14 3.30, 3.35, 3.37, 3.38, 3.40, 3.44 EXAM."— Presentation transcript:

1

2 ECE 313 - Computer Organization Floating Point Feb 2005 Reading: 3.6-3.9 HW Due Monday 3/28: 3.7,3.9, 3.10, 3.14 3.30, 3.35, 3.37, 3.38, 3.40, 3.44 EXAM 1: Wednesday 3/30 Why did the Ariane 5 Explode? (image source: java.sun.com ) Portions of these slides are derived from: Textbook figures © 1998 Morgan Kaufmann Publishers all rights reserved Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers all rights reserved Dave Patterson’s CS 152 Slides - Fall 1997 © UCB Rob Rutenbar’s 18-347 Slides - Fall 1999 CMU other sources as noted

3 Feb 2005Floating Point2 Outline - Floating Point  Motivation and Key Ideas   IEEE 754 Floating Point Format  Range and precision  Floating Point Arithmetic  MIPS Floating Point Instructions  Rounding & Errors  Summary

4 Feb 2005Floating Point3 Floating Point - Motivation  Review: n-bit integer representations  Unsigned: 0 to 2 n -1  Signed Two’s Complement: - 2 n-1 to 2 n-1 -1  Biased (excess-b):-b to 2 n -b  Problem: how do we represent:  Very large numbers9,345,524,282,135,672, 2 354  Very small numbers0.00000000000000005216, 2 -100  Rational numbers2/3  Irrational numberssqrt(2)  Transcendental numberse, π

5 Feb 2005Floating Point4 Fixed Point Representation  Idea: fixed-point numbers with fractions  Decimal point (binary point) marks start of fraction  Decimal: 1.2503 = 1 X 10 0 + 2 X 10 -1 + 5 X 10 -2 + 3 X 10 -4  Binary: 1.0100001 = 1 X 2 0 + 1 X 2 -2 + 1 X 2 -7  Problems  Limited locations for “decimal point” (binary point”)  Won’t work for very small or very larger numbers

6 Feb 2005Floating Point5 Another Approach: Scientific Notation  Represent a number as a combination of  Mantissa (significand): Normalized number AND  Exponent (base 10)  Example:6.02 X 10 23 Significand (mantissa) Radix (base) Exponent

7 Feb 2005Floating Point6  Key idea: adapt scientific notation to binary  Fixed-width binary number for significand  Fixed-width binary number for exponent (base 2)  Idea: represent a number as 1.xxxxxxx two X 2 yyyy Significand (mantissa) Radix (2) Exponent Leading ‘1’ (Implicit)  Important Points:  This is a tradeoff between precision and range  Arithmetic is approximate - error is inevitable!

8 Feb 2005Floating Point7 Outline - Floating Point  Motivation and Key Ideas  IEEE 754 Floating Point Format   Range and precision  Floating Point Arithmetic  MIPS Floating Point Instructions  Rounding & Errors  Summary

9 Feb 2005Floating Point8 IEEE 754 Floating Point  Single precision (C/C++/Java float type) Value N = (-1) S X 1.F X 2 E-127  Double precision (C/C++/Java double type) Value N = (-1) S X 1.F X 2 E-1023 Bias

10 Feb 2005Floating Point9 Floating Point Examples  8.75 ten = 1 X 2 3 + 1 X 2 -1 + 1 X 2 -2 = 1.00011 X 2 3  Single Precision: Significand: 1.00011000…. (note leading 1 is implied) Exponent: 3 + 127 = 130 = 10000010 two  Double Precision: Significand: 1.00011000… Exponent: 3 + 1023 = 1026 = 10000000010 two

11 Feb 2005Floating Point10 Floating Point Examples  -0.375 ten = 1 X 2 -2 + 1 X 2 -3 = 1. 1 X 2 -2  Single Precision: Significand: 1.1000…. Exponent: -2 + 127 = 125 = 01111101 two  Double Precision: Significand: 1.1000… Exponent: -2 + 1023 = 1021 = 01111111101 two

12 Feb 2005Floating Point11 Floating Point Examples  Q: What is the value of the following single- precision word?  Significand = 1 + 2 -1 + 2 -4 + 2 -8 + 2 -10 + 2 -12  Exponent = 8 - 127 = -119  Final Result = (1 + 2 -1 + 2 -4 + 2 -8 + 2 -10 + 2 -12 ) X 2 -119 = 2.36 X 10 -36

13 Feb 2005Floating Point12 Special Values in IEEE Floating Point  0000000 exponent - reserved for  zero value (all bits zero)  “Denormalized numbers” - drop the “1.” Used for “very small” numbers … “gradual underflow” Smallest denormalized number (single precision): 0.00000000000000000000001 X 2 -126 = 2 -149  1111111 exponent  Infinity - 111111 exponent, zero significand  NaN (Not a Number) - 1111111 exponent, nonzero significand

14 Feb 2005Floating Point13 Outline - Floating Point  Motivation and Key Ideas  IEEE 754 Floating Point Format  Range and precision   Floating Point Arithmetic  MIPS Floating Point Instructions  Rounding & Errors  Summary

15 Feb 2005Floating Point14 Floating Point Range and Precision  The tradeoff: range in exchange for uniformity  “Tiny” example: floating point with:  3 exponent bits  2 signficand bits –– –10–50+5+10 ++ DenormalizedNormalizedInfinity –1–0.8–0.6–0.4–0.20+0.2+0.4+0.6+0.8+1 DenormalizedNormalizedInfinity +0–0 Graphic and Example Source: R. Bryant and D. O’Halloran, Computer Systems: A Programmer’s Perspective, © Prentice Hall, 2002 s expS 0 1245

16 Feb 2005Floating Point15 Visualizing Floating Point - “Small” FP Representation  8-bit Floating Point Representation  the sign bit is in the most significant bit.  the next four bits are the exponent, with a bias of 7.  the last three bits are the frac  Same General Form as IEEE Format  normalized, denormalized  representation of 0, NaN, infinity) s expsignificand 0 2367 Example Source: R. Bryant and D. O’Halloran, Computer Systems: A Programmer’s Perspective, © Prentice Hall, 2002

17 Feb 2005Floating Point16 Small FP - Values Related to Exponent ExpexpE2 E 00000-6 1/64(denorms) 10001-61/64 20010-51/32 30011-41/16 40100-31/8 50101-21/4 60110-11/2 70111 01 81000+12 91001+24 101010+38 111011+416 121100+532 131101+664 141110+7128 151111n/a(inf, Nan).

18 Feb 2005Floating Point17 Small FP Example - Dynamic Range s exp frac EValue 0 0000 000-60 0 0000 001-61/8*1/64 = 1/512 0 0000 010-62/8*1/64 = 2/512 … 0 0000 110-66/8*1/64 = 6/512 0 0000 111-67/8*1/64 = 7/512 0 0001000-68/8*1/64 = 8/512 0 0001 001 -69/8*1/64 = 9/512 … 0 0110 110-114/8*1/2 = 14/16 0 0110 111-115/8*1/2 = 15/16 0 0111 000 08/8*1 = 1 0 0111 001 09/8*1 = 9/8 0 0111 010 010/8*1 = 10/8 … 0 1110110 714/8*128 = 224 0 1110 111 715/8*128 = 240 0 1111 000n/ainf closest to zero largest denorm smallest norm closest to 1 below closest to 1 above largest norm Denormalized numbers Normalized numbers

19 Feb 2005Floating Point18 Learning from Tiny & Small FP  Non-uniform spacing of numbers  very small spacing for large negative exponents  very large spacing for large positive exponents  Exact representation: sums of powers of 2  Approximate representation: everything else

20 Feb 2005Floating Point19 Summary: IEEE Floating Point Values Source: book p. 301

21 Feb 2005Floating Point20 IEEE Floating Point - Interesting Numbers Description expfrac Numeric Value Zero00…0000…000.0 Smallest Pos. Denorm.00…0000…012 – {23,52} X 2 – {126,1022}  Single  1.4 X 10 –45  Double  4.9 X 10 –324 Largest Denormalized00…0011…11(1.0 –  ) X 2 – {126,1022}  Single  1.18 X 10 –38  Double  2.2 X 10 –308 Smallest Pos. Normalized00…0100…001.0 X 2 – {126,1022}  Just larger than largest denormalized One01…1100…001.0 Largest Normalized11…1011…11(2.0 –  ) X 2 {127,1023}  Single  3.4 X 10 38  Double  1.8 X 10 308

22 Feb 2005Floating Point21 Outline - Floating Point  Motivation and Key Ideas  IEEE 754 Floating Point Format  Range and precision  Floating Point Arithmetic   MIPS Floating Point Instructions  Rounding & Errors  Summary

23 Feb 2005Floating Point22 Floating Point Addition (Fig. 3.16) 1.Align binary point to number with larger exponent 2.Add significands 3.Normalize result and adjust exponent 4. If overflow/underflow throw exception 5.Round result (go to 3 if normalization needed again) A 1.11 X 2 0 1.11 X 2 0 1.75 +B+ 1.00 X 2 -2 + 0.01 X 2 0 0.25 10.00 X 2 0 (Normalize) 1.00 X 2 1 2.00 Hardware - Fig. 3.17, p. 201

24 Feb 2005Floating Point23 Compare the exponent of the two numbers. Shift the smaller to the right until its exponent Would match the larger exponent Normalize the sum, either shifting right and Increment the exponent or shifting left and Decrementing the exponent Round the significant to the appropriate Number of bits Add the significants Overflow or underflow Still normalized?

25 Feb 2005Floating Point24 Hardware Fig. 3.17, p. 201

26 Feb 2005Floating Point25 Floating Point Multiplication (Fig. 3.18) 1.Add 2 exponents together to get new exponent (subtract 127 to get proper biased value) 2.Multiply significands 3.Normalize result if necessary (shift right) & adjust exponent 4. If overflow/underflow throw exception 5.Round result (go to 3 if normalization needed again) 6.Set sign of result using sign of X, Y

27 Feb 2005Floating Point26 Outline - Floating Point  Motivation and Key Ideas  IEEE 754 Floating Point Format  Range and precision  Floating Point Arithmetic  MIPS Floating Point Instructions   Rounding & Errors  Summary

28 Feb 2005Floating Point27 MIPS Floating Point Instructions  Organized as a coprocessor  Separate registers $f0-$f31  Separate operations  Separate data transfer (to same memory)  Basic operations  add.s - single add.d - double  sub.s - single sub.d - double  mul.s - single mul.d - double  div.s - single div.d - double

29 Feb 2005Floating Point28 MIPS Floating Point Instructions (cont’d)  Data transfer  lwc1, swcl (l.s, s.s) - load/store float to fp reg  l.d, s.d - load/store double to fp reg pair  Testing / branching  c.lt.s, c.lt.d, c.eq.s, c.eq.d, … compare and set condition bit if true  bclt - branch if condition true  bclf - branch if condition false

30 Feb 2005Floating Point29 Outline - Floating Point  Motivation and Key Ideas  IEEE 754 Floating Point Format  Range and precision  Floating Point Arithmetic  MIPS Floating Point Instructions  Rounding & Errors   Summary

31 Feb 2005Floating Point30 Rounding  Extra bits allow rounding after computation  Guard Digit (may shift into number during normalization)  Round digit - used to round when guard bit shifted during normalization  Sticky bit - used when there are 1’s to the right of the round digit e.g., “0.010000001” (round to nearest even)  IEEE 754 supports four rounding modes  Always round up  Always round down  Truncate  Round to nearest even (most common)

32 Feb 2005Floating Point31 Limitations on Floating-Point Math  Most numbers are approximate  Roundoff error is inevitable  Range (and accuracy) vary depending on exponent  “Normal” math properties not guaranteed:  Inverse (1/r)*r ≠ 1  Associative (A+B) + C ≠ A + (B+C) (A*B) * C ≠ A * (B*C)  Distributive (A+B) * C ≠ A*B + B*C  Scientific calculations require error management take a numerical analysis for more info

33 Feb 2005Floating Point32 IEEE Floating Point - Special Properties  Floating Point 0 same as Integer 0  All bits = 0  Can (Almost) Use Unsigned Integer Comparison  A > B if: A.EXP > B.EXP or A.EXP=B.EXP and A.SIG > B.SIG  But, must first compare sign bits  Must consider -0 == 0  NaNs problematic Will be greater than any other values What should comparison yield? This is equivalent to unsigned comparision!

34 Feb 2005Floating Point33 Addendum - Why Did the Ariane 5 Explode?  In 1996 Ariane 5 Flight 501 exploded after launch.  Estimated cost of accident: $500 million

35 Feb 2005Floating Point34 Addendum - Why Did the Ariane 5 Explode?  The cause was traced to the Inertial reference system (SRI).  Both the main and backup SRI failed.  Both units failed due to an out-of-range conversion  Input: double precision floating point  Output: 16-bit integer for “horizontal bias” (BH)  Careful analysis during software design had indicated that BH would “fit” in 16 bits  So, why didn’t it fit?

36 Feb 2005Floating Point35 Addendum - Why did the Ariane 5 Explode?  Careful analysis during software design had indicated that BH would “fit” in 16 bits  BUT, all analysis had been done for the Ariane 4, the predecessor of Ariane 5 - software was reused  Since Ariane 5 was a larger rocket, the values for BH were higher than anticipated  AND, there was no handler to deal with the exception!  For more information:  http://www.ima.umn.edu/~arnold/disasters/ariane.html  Or, Google “Ariane 5”

37 Feb 2005Floating Point36 Summary - Chapter 3  Important Topics  Signed & Unsigned Numbers (3.2)  Addition and Subtraction (3.3)  Carry Lookahead (B.6)  Constructing an ALU (B.5)  Multiplication and Division (3.4, 4.5)  Floating Point (3.6)  Coming Up:  Performance (Chapter 4)


Download ppt "ECE 313 - Computer Organization Floating Point Feb 2005 Reading: 3.6-3.9 HW Due Monday 3/28: 3.7,3.9, 3.10, 3.14 3.30, 3.35, 3.37, 3.38, 3.40, 3.44 EXAM."

Similar presentations


Ads by Google