
1 CS 232: Computer Architecture II Prof. Laxmikant (Sanjay) Kale Floating point arithmetic

2 Floating Point (a brief look)
We need a way to represent
– numbers with fractions, e.g., 3.1416
– very small numbers, e.g., 0.000000001
– very large numbers, e.g., 3.15576 × 10^9
Representation:
– sign, exponent, significand: (–1)^sign × significand × 2^exponent
– more bits for the significand give more accuracy
– more bits for the exponent increase the range
IEEE 754 floating point standard:
– single precision: 8-bit exponent, 23-bit significand
– double precision: 11-bit exponent, 52-bit significand

3 Floating point representation
The idea is to normalize all numbers, so the significand has exactly one nonzero digit to the left of the decimal point.
– 12345 = 1.2345 * 10^4
– 0.0000012345 = 1.2345 * 10^-6
– Do this in binary: 1.01110 x 2^(1011)
IEEE FP representation
– (+/-) 1.0101010101010101010101 * 2^(10101010)
– This is single precision: 32 bits in all
– Double precision: 64 bits in all. Where does one need accuracy of that level?
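The binary normalization step can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the helper name normalize_binary is made up here, and it leans on the standard library's math.frexp, which returns a significand in [0.5, 1) that then has to be rescaled to [1, 2).

```python
import math

def normalize_binary(x):
    """Return (significand, exponent) with 1 <= |significand| < 2
    and x == significand * 2**exponent. Zero is returned as (0.0, 0)."""
    if x == 0:
        return 0.0, 0
    m, e = math.frexp(x)      # x == m * 2**e with 0.5 <= |m| < 1
    return m * 2, e - 1       # rescale so there is exactly one 1 before the point

sig, exp = normalize_binary(12345.0)
print(sig, exp)               # 12345 = sig * 2**exp
```

Multiplying the two returned values back together (sig * 2**exp) recovers the original number exactly, because scaling by a power of two does not change the significand bits.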

4 Floating point numbers
Representation issues:
– sign bit, exponent, significand
– Question: how to represent each field?
– Question: in which order to lay them out in a word?
– Factor: it should be easy to do comparisons (for sorting). For arithmetic, we will have special hardware anyway.
– Choice: sign + magnitude representation
  – sign bit, followed by exponent, then significand (why?)
  – exponent: represented with a “bias”: add 127 (1023 for double precision)
  – significand: assume an implicit leading 1 (so a stored 00001 means 1.00001)
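One way to see why the exponent is biased and placed ahead of the significand: for non-negative numbers, the raw 32-bit patterns sort in the same order as the values they encode, so an integer comparison is enough. The sketch below checks this with Python's struct module; the helper name float_bits and the sample values are chosen here for illustration. Negative numbers are deliberately left out, since under sign + magnitude a larger bit pattern means a more negative value.

```python
import struct

def float_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

# Sample non-negative values (an arbitrary choice for this sketch).
values = [1000.0, 0.5, 2.0, 1e-3, 1.5, 1.0]

by_value = sorted(values)                  # ordinary numeric sort
by_bits = sorted(values, key=float_bits)   # sort by raw bit pattern, as integers

print(by_value == by_bits)
```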

5 Floating point representation
So:
– (+/-) (1 + significand) x 2^(exponent - bias) is the value of a floating point number
– Example: 0 00001000 01010000000000000000000
– Example: convert -0.41 to single precision form
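Both examples on this slide can be checked with a short Python sketch. The function names decode_single and encode_single are invented for this illustration. The decoder applies the value formula directly and is only valid for normalized numbers (stored exponent neither all 0s nor all 1s); the encoder simply asks the struct module for the bits rather than working them out by hand.

```python
import struct

def decode_single(bits):
    """Split a 32-bit pattern into fields and apply
    (-1)^sign * (1 + significand) * 2^(exponent - 127).
    Assumes a normalized number (exponent field not 0 or 255)."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    value = (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
    return sign, exponent, fraction, value

def encode_single(x):
    """Return the single-precision bit pattern of x (rounded to nearest)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

# First example: 0 00001000 01010000000000000000000
bits = int("0" "00001000" "01010000000000000000000", 2)
sign, exponent, fraction, value = decode_single(bits)
print(sign, exponent, value)      # stored exponent 8, so the true exponent is 8 - 127 = -119

# Second example: -0.41 in single precision
pattern = f"{encode_single(-0.41):032b}"
print(pattern[0], pattern[1:9], pattern[9:])
```

For the first example the stored significand is .0101 in binary, i.e. 0.3125, so the value is 1.3125 x 2^-119. For -0.41 the result is only the nearest representable value, since 0.41 has no finite binary expansion.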

6 IEEE 754 floating-point standard
Leading “1” bit of significand is implicit
Exponent is “biased” to make sorting easier
– all 0s is the smallest exponent, all 1s is the largest
– bias of 127 for single precision and 1023 for double precision
– summary: (–1)^sign × (1 + significand) × 2^(exponent – bias)
Example:
– decimal: -0.75 = -3/4 = -3/2^2
– binary: -0.11 = -1.1 x 2^-1
– floating point: exponent = -1 + 127 = 126 = 01111110
– IEEE single precision: 10111111010000000000000000000000
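The -0.75 example can be reproduced step by step. The following is a hand-rolled sketch, not a full IEEE 754 encoder: the name encode_single_by_hand is hypothetical, it assumes a nonzero input that lands in the normalized range, and it truncates extra significand bits instead of rounding.

```python
import math

def encode_single_by_hand(x):
    """Build the 32-bit pattern following the slide's steps.
    Assumes x is nonzero and normalized; truncates rather than rounds."""
    sign = 1 if x < 0 else 0
    m, e = math.frexp(abs(x))                   # abs(x) == m * 2**e, 0.5 <= m < 1
    significand, exponent = m * 2, e - 1        # normalize to 1.xxx * 2**exponent
    biased = exponent + 127                     # add the single-precision bias
    fraction = int((significand - 1) * 2**23)   # drop the implicit leading 1
    return f"{sign:01b}{biased:08b}{fraction:023b}"

encoded = encode_single_by_hand(-0.75)
print(encoded[0], encoded[1:9], encoded[9:])
```

For -0.75 the three printed fields are the sign bit 1, the biased exponent 01111110 (126), and a fraction field of 1 followed by 22 zeros.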

7 Floating point addition
The problem is: the exponents of the numbers being added may be different
– 2.0 * 10^1 + 3.0 * 10^(-1)
– 2.0 * 10^1 + 0.03 * 10^1: now we can add them
– 2.03 * 10^1
– But we are not necessarily done!
– E.g. 9.74 * 10^0 + 3.3 * 10^(-1)
– 10.07 * 10^0 is not in the correct form!
– Shift again to get the correct form: 1.007 * 10^1
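The align, add, normalize sequence can be sketched with Python's decimal module so that the base-10 digits from the slide come out exactly. The function name add_decimal_fp and the (significand, exponent) pair representation are assumptions made for this sketch. Unlike real hardware, it keeps every digit, so it shows only the shifting steps and not the rounding that a fixed-width significand would force.

```python
from decimal import Decimal

def add_decimal_fp(a_sig, a_exp, b_sig, b_exp):
    """Add a_sig * 10**a_exp and b_sig * 10**b_exp.
    Returns (significand, exponent) with 1 <= |significand| < 10."""
    # Step 1: align both operands to the larger exponent.
    exp = max(a_exp, b_exp)
    a = a_sig.scaleb(a_exp - exp)
    b = b_sig.scaleb(b_exp - exp)
    # Step 2: add the aligned significands.
    total = a + b
    # Step 3: normalize the result.
    while abs(total) >= 10:
        total = total.scaleb(-1)
        exp += 1
    while total != 0 and abs(total) < 1:
        total = total.scaleb(1)
        exp -= 1
    return total, exp

first = add_decimal_fp(Decimal("2.0"), 1, Decimal("3.0"), -1)
second = add_decimal_fp(Decimal("9.74"), 0, Decimal("3.3"), -1)
print(first, second)
```

The first call needs no renormalization (2.03 * 10^1). In the second, the raw sum 10.07 * 10^0 has two digits before the point, so the significand is shifted right once and the exponent is incremented, giving 1.007 * 10^1.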

8 You can get different results
A + B + C = A + (B+C) = (A+B) + C
– Right? Can you see a problem? When do you lose bits?
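A small Python experiment makes the problem concrete. The specific values are chosen here for illustration: one operand is so much larger than another that the smaller one is lost entirely when the two are added first. Python floats are IEEE 754 double precision on essentially all platforms, and the same effect occurs in single precision.

```python
a, b, c = 1e20, -1e20, 1.0

left = (a + b) + c    # a and b cancel exactly first, then c is added
right = a + (b + c)   # c is absorbed into b: 1.0 is far below the spacing of doubles near 1e20

print(left, right)    # 1.0 0.0
```

Bits are lost in the alignment step of addition: when exponents differ widely, the smaller operand's significand is shifted right until nothing of it remains.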

9 Floating point multiplication
Add exponents, but subtract the bias (both stored exponents include it, so it would otherwise be counted twice)
Then multiply significands
Then normalize
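The three steps can be sketched directly on the bit fields. This is a simplified illustration, not how any particular hardware does it: the names fields and multiply_single are hypothetical, the inputs are assumed to be normalized single-precision values, the result is truncated rather than rounded, and zero, overflow, underflow, infinities and NaNs are ignored.

```python
import struct

def fields(x):
    """Return (sign, biased exponent, fraction) of x as a single-precision float."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def multiply_single(x, y):
    """Multiply two normalized singles field by field (truncating sketch)."""
    sx, ex, fx = fields(x)
    sy, ey, fy = fields(y)
    sign = sx ^ sy
    # Step 1: add exponents, subtracting the bias that was counted twice.
    exponent = ex + ey - 127
    # Step 2: multiply 24-bit significands (implicit 1 restored); result is scaled by 2**46.
    product = ((1 << 23) | fx) * ((1 << 23) | fy)
    # Step 3: normalize. The product lies in [1, 4); if it is >= 2, shift right once.
    if product >> 47:
        product >>= 1
        exponent += 1
    fraction = (product >> 23) & 0x7FFFFF   # keep the top 23 fraction bits
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(multiply_single(1.5, -2.5))   # -3.75
print(multiply_single(3.0, 3.0))    # 9.0 (needs the normalization shift)
```

In the second call the significand product is 1.5 x 1.5 = 2.25, which is no longer of the form 1.xxx, so the normalization step shifts it to 1.125 and bumps the exponent by one.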

