Dr Damian Conway Room 132 Building 26 Real Number Representation (Lecture 25 of the Introduction to Computer Programming series)
Some Terminology All digits in a number following any leading zeros are significant digits: 12.345 -0.12345 0.00012345
Some Terminology The scientific notation for real numbers is: mantissa × base^exponent
Some Terminology The mantissa is always normalized so that 1 ≤ mantissa < base (i.e. exactly one significant figure before the point):
    Normalized               Unnormalized
    2.9979 × 10^8            2997.9 × 10^5
    B.139FC × 16^12          B1.39FC × 16^11
    1.0110110101 × 2^-3      0.010110110101 × 2^-1
Some Terminology The precision of a number is how many digits (or bits) we use to represent it. For example:
    3
    3.14
    3.1415926
    3.1415926535897932384626433832795028
Representing numbers A real number n is represented by a floating-point approximation n*. The computer uses 32 bits (or more) to store each approximation. It needs to store the mantissa, the sign of the mantissa, and the exponent (with its sign).
Representing numbers So it has to allocate some of its 32 bits to each task. The standard way to do this (specified by IEEE standard 754) is:
Representing numbers 23 bits for the mantissa; 1 bit for the mantissa's sign (i.e. the mantissa is signed magnitude); The remaining 8 bits for the exponent.
Representing the mantissa Since the mantissa has to be in the range 1 ≤ mantissa < base, if we use base 2 the digit before the binary point has to be a 1. So we don't have to store it! That way we get 24 bits of precision using only 23 bits.
Representing the mantissa Those 24 bits of precision are equivalent to a little over 7 decimal digits: 24 × log10(2) ≈ 7.22
Representing the mantissa Suppose we want to represent π:
    3.1415926535897932384626433832795...
That means that we can only represent it as:
    3.141592 (if we truncate)
    3.141593 (if we round)
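Representing the mantissa Even if the computer appears to give you more than seven significant digits, only about the first seven are meaningful. For example:
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* asin(1) is pi/2; the double result is rounded to float on assignment */
        float pi = 2 * asin(1);

        printf("%.35f\n", pi);
        return 0;
    }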
Representing the mantissa On my machine this prints out: 3.1415927419125732000000000000000000
Representing the exponent The exponent is represented as an excess-127 number. That is:
    Stored bits    Exponent
    00000000       –127
    00000001       –126
    ...
    01111111       0
    10000000       +1
    ...
    11111111       +128
Representing the exponent However, the IEEE standard restricts exponents to the range –126 ≤ exponent ≤ +127. The exponents –127 and +128 have special meanings (basically, zero and infinity respectively).
Floating point overflow Just like the integer representations in the previous lecture, floating point representations can overflow:
      9.999999 × 10^127
    + 1.111111 × 10^127
    = 1.1111110 × 10^128
The exponent +128 is outside the permitted range, so the result is ∞.
Floating point underflow But floating point numbers can also get too small:
      1.000000 × 10^-126
    ÷ 2.000000 × 10^0
    = 5.000000 × 10^-127
The exponent -127 is below the permitted range, so the result underflows to 0.
Floating point addition Five steps to add two floating point numbers:
    1. Express them with the same exponent (denormalize)
    2. Add the mantissas
    3. Adjust the mantissa to one digit/bit before the point (renormalize)
    4. Round or truncate to required precision
    5. Check for overflow/underflow
Floating point addition example
    x = 9.876 × 10^7
    y = 1.357 × 10^6
    1. Same exponents: x = 9.876 × 10^7, y = 0.1357 × 10^7
    2. Add mantissas: x+y = 10.0117 × 10^7
    3. Renormalize sum: x+y = 1.00117 × 10^8
    4. Truncate or round: x+y = 1.001 × 10^8
    5. Check overflow and underflow: x+y = 1.001 × 10^8
Floating point addition example 2
    x = 3.506 × 10^-5
    y = -3.497 × 10^-5
    1. Same exponents: x = 3.506 × 10^-5, y = -3.497 × 10^-5 (already equal)
    2. Add mantissas: x+y = 0.009 × 10^-5
    3. Renormalize sum: x+y = 9.000 × 10^-8
    4. Truncate or round: x+y = 9.000 × 10^-8 (no change)
    5. Check overflow and underflow: x+y = 9.000 × 10^-8
Floating point addition example 2 Question: should we believe these zeroes?
    x = 3.506 × 10^-5
    y = -3.497 × 10^-5
    x+y = 9.000 × 10^-8
Floating point multiplication Five steps to multiply two floating point numbers:
    1. Multiply mantissas
    2. Add exponents
    3. Renormalize mantissa
    4. Round or truncate to required precision
    5. Check for overflow/underflow
Floating point multiplication example
    x = 9.001 × 10^5
    y = 8.001 × 10^-3
    1&2. Multiply mantissas/add exponents: x × y = 72.017001 × 10^2
    3. Renormalize product: x × y = 7.2017001 × 10^3
    4. Truncate or round: x × y = 7.201 × 10^3 (truncated) or 7.202 × 10^3 (rounded)
    5. Check overflow and underflow: x × y = 7.202 × 10^3
Limitations Floating-point representations only approximate real numbers. The normal laws of arithmetic don't always hold (even less often than for integer representations). For example, associativity is not guaranteed:
Limitations
    x = 3.002 × 10^3
    y = -3.000 × 10^3
    z = 6.531 × 10^0
    x+y = 2.000 × 10^0
    (x+y)+z = 8.531 × 10^0
    y+z = -2.993 × 10^3 (truncated to four digits)
    x+(y+z) = 0.009 × 10^3 = 9.000 × 10^0
Limitations Consider the other laws of arithmetic:
    Commutativity (additive and multiplicative)
    Associativity
    Distributivity
    Identity (additive and multiplicative)
Spend some time working out which ones (if any!) always hold for floating-point numbers.
Reading (for the very keen) Goldberg, D., What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, Vol. 23, No. 1, March 1991.