Computer Arithmetic Integer and Fixed Point P & H: Chapter 3
Integer Formats
Signed vs. unsigned numbers:
  If unsigned, all bits represent the value.
  If signed, the leading bit (MSB) determines the sign.
Signed integer representations:
  Signed magnitude: a sign bit followed by the magnitude, just like writing a sign in base 10.
  1's complement: NOT every bit to negate.
  2's complement: to negate, subtract the number from the next larger power of 2, that is, from 2^n for an n bit number (or NOT every bit and add 1; or find the first 1 from the right and invert every bit to its left).
Converting From Base 10
Ex: assume 6 bit registers
+19_10 = 010011_2 = 13_HEX = 23_OCT (Euclid's method)
-19_10: start from +19_10 = 010011_2, therefore
  Signed magnitude: -19_10 = 110011_2
  1's complement:   -19_10 = 101100_2
  2's complement:   -19_10 = 101101_2
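The three negations above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function name encode_signed and its bits parameter are illustrative assumptions. For simplicity it rejects the one extra negative value (-2^(bits-1)) that only 2's complement can hold.

```python
def encode_signed(value, bits=6):
    """Return (signed magnitude, 1's complement, 2's complement) bit strings.

    Hypothetical helper for illustration only.
    """
    magnitude = abs(value)
    if magnitude >= 2 ** (bits - 1):
        raise ValueError("magnitude does not fit in the given width")
    mask = (1 << bits) - 1
    positive = format(magnitude, f"0{bits}b")
    if value >= 0:
        # Non-negative numbers look the same in all three formats.
        return positive, positive, positive
    sign_mag = format((1 << (bits - 1)) | magnitude, f"0{bits}b")  # set the MSB
    ones = format(~magnitude & mask, f"0{bits}b")                  # NOT every bit
    twos = format(-magnitude & mask, f"0{bits}b")                  # 2^bits - magnitude
    return sign_mag, ones, twos


print(encode_signed(-19))  # ('110011', '101100', '101101')
```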
Sign Extension
Ex: assume 6 bit registers increased to 8 bit registers
  Signed magnitude: -19_10 = 110011_2 -> 10010011_2
  1's complement:   -19_10 = 101100_2 -> 11101100_2
  2's complement:   -19_10 = 101101_2 -> 11101101_2
Sign extension is simplest in 1's and 2's complement: copy the sign bit into the new leading bits.
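As a sketch of the two extension rules (the helper names below are hypothetical, chosen for this illustration), bit strings can be widened like this:

```python
def extend_complement(bit_string, new_width):
    """Sign-extend a 1's or 2's complement value by replicating the sign bit."""
    return bit_string[0] * (new_width - len(bit_string)) + bit_string


def extend_signed_magnitude(bit_string, new_width):
    """Keep the sign bit in front and pad the magnitude with leading zeros."""
    padding = "0" * (new_width - len(bit_string))
    return bit_string[0] + padding + bit_string[1:]


print(extend_signed_magnitude("110011", 8))  # 10010011
print(extend_complement("101100", 8))        # 11101100
print(extend_complement("101101", 8))        # 11101101
```

Signed magnitude needs the sign bit moved to the new MSB position, which is why the complement formats are the simpler case.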
Ranges
Ex: assume 6 bit registers
Signed magnitude: 111111_2 <= value <= 011111_2
  Range is [-(2^5 - 1), 2^5 - 1]
  Two representations for 0: 100000_2 and 000000_2
1's complement: 100000_2 <= value <= 011111_2
  Range is [-(2^5 - 1), 2^5 - 1]
  Two representations for 0: 111111_2 and 000000_2
2's complement: 100000_2 <= value <= 011111_2
  Range is [-2^5, 2^5 - 1]
  One representation for 0: 000000_2
Problematic Examples
Ex: 2 + 3 = ? in 3 bit arithmetic
  Base 10   Base 2
     2        010
    +3       +011
     5        101
But 101_2 = -(011_2) = -3 in 2's complement! What happened?
Overflow occurred: 5 can't be represented as a signed number in only 3 bits.
Problematic Examples (2)
Ex: -1 - 4 = ? in 3 bit arithmetic
  Base 10   Base 2
    -1        111
    -4       +100
    -5        011   (the carry out of the MSB is discarded)
But 011_2 = 3.
Overflow occurred again: -5 can't be represented in 3 bits either.
Overflow
If the two operands are representable in n bits, when can overflow occur?
  positive + positive? Yes (see previous example)
  negative + negative? Yes (see previous example)
  positive + negative? No: if the two inputs are legal, the output must be legal too, because its magnitude is no larger than that of the larger operand
  negative + positive? No, for the same reason
Detecting Overflow
What is the "rule" for detecting overflow in R = A + B?
Overflow only occurs on addition of two operands of the same sign.
When it occurs, the result has the opposite sign from the operands.
[Figure: the MSB column of the addition, showing the carry in, sign bit A, sign bit B, sign bit R, and the carry out]
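Detecting Overflow (2)
"Truth Table" for Overflow (MSB column of R = A + B)
  Cin  Sign A  Sign B  Sign R  Cout  Overflow
   0     0       0       0      0     No
   0     0       1       1      0     No
   0     1       0       1      0     No
   0     1       1       0      1     Yes
   1     0       0       1      0     Yes
   1     0       1       0      1     No
   1     1       0       0      1     No
   1     1       1       1      1     No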
Detecting Overflow (3)
Overflow occurs exactly when the carry-in to the MSB is the opposite of the carry-out from the MSB:
Overflow = xor(cin, cout)
Hardware rep: [Figure: an XOR gate with inputs cin and cout and output overflow]
When overflow occurs, almost all architectures cause an "exception": an unscheduled procedure call (software routine) that handles the event.
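The xor(cin, cout) rule can be tried on the two problematic examples. This is a minimal sketch under the assumption of n bit 2's complement operands; the function name add_with_overflow is hypothetical.

```python
def add_with_overflow(a, b, bits=3):
    """Add two n bit 2's complement values; flag overflow as cin XOR cout."""
    mask = (1 << bits) - 1
    low_mask = mask >> 1            # every bit below the MSB
    a &= mask                       # reduce Python ints to n bit patterns
    b &= mask
    # Carry into the MSB: add only the lower bits and see what spills over.
    carry_in = ((a & low_mask) + (b & low_mask)) >> (bits - 1)
    total = a + b
    carry_out = total >> bits       # carry out of the MSB
    result = format(total & mask, f"0{bits}b")
    return result, bool(carry_in ^ carry_out)


print(add_with_overflow(2, 3))    # ('101', True)
print(add_with_overflow(-1, -4))  # ('011', True)
print(add_with_overflow(2, -3))   # ('111', False)
```

The third call mixes signs, so by the earlier argument it cannot overflow, and the flag agrees.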
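Fractional Parts
Ex: 3.8125_10
3_10 = 011_2
.8125_10 = .1101_2 (multiply by 2 repeatedly; the integer parts, read top to bottom, are the bits):
  .8125 x 2 = 1.6250 -> 1
  .6250 x 2 = 1.2500 -> 1
  .2500 x 2 = 0.5000 -> 0
  .5000 x 2 = 1.0000 -> 1
Hence: 3.8125_10 = 011.1101_2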
Fractional Parts (2)
Ex: 0.2_10
  .2 x 2 = 0.4 -> 0
  .4 x 2 = 0.8 -> 0
  .8 x 2 = 1.6 -> 1
  .6 x 2 = 1.2 -> 1
  .2 again: the pattern repeats
There is no finite binary representation for .2_10: .2_10 = .001100110011..._2
Not all values are representable using a finite number of bits.
Roundoff/representation errors are unavoidable.
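The repeated doubling procedure is easy to sketch in Python. The version below uses exact rational arithmetic (fractions.Fraction) so that the demonstration itself is free of roundoff; the function name and the max_bits cutoff are assumptions made for this illustration.

```python
from fractions import Fraction


def fraction_to_binary(value, max_bits=16):
    """Convert a fraction in [0, 1) to binary digits by repeated doubling.

    Returns the digit string and a flag telling whether it is exact.
    """
    frac = Fraction(value)
    digits = []
    while frac and len(digits) < max_bits:
        frac *= 2
        bit = int(frac)           # the integer part is the next binary digit
        digits.append(str(bit))
        frac -= bit               # keep only the fractional part
    return "." + "".join(digits), frac == 0


print(fraction_to_binary("0.8125"))            # ('.1101', True)
print(fraction_to_binary("0.2", max_bits=12))  # ('.001100110011', False)
```

The second result is cut off at max_bits and flagged as inexact, which is the representation error described above.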
Binary Addition
Ex: 2.625 + 6.75
  Base 10    Base 2
   2.625      00010.1010
  +6.750     +00110.1100
   9.375      01001.0110
Note: .0110_2 = 2^-2 + 2^-3 = .25 + .125 = .375
Binary Subtraction (2's Complement)
Ex: 2.625 - 6.75
  Base 10    Base 2
   2.625      00010.1010
  -6.750     +11001.0100   (this is just the 2's complement of +6.75 = 00110.1100)
  -4.125      11011.1110
Note: 11011.1110_2 = -(00100.0010_2) = -4.125
Subtraction is just the addition of 2's complement numbers.
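The fixed point addition and subtraction examples can be reproduced with plain integer arithmetic by scaling every value by 2^4. This is a sketch, assuming the 9 bit layout used on the slides (5 integer bits, 4 fraction bits); the names WIDTH, FRAC_BITS, to_fixed, from_fixed and show are hypothetical.

```python
FRAC_BITS = 4                 # bits to the right of the binary point
WIDTH = 9                     # 5 integer bits + 4 fraction bits
MASK = (1 << WIDTH) - 1


def to_fixed(x):
    """Encode x as a WIDTH bit 2's complement pattern (an unsigned int)."""
    return int(round(x * (1 << FRAC_BITS))) & MASK


def from_fixed(pattern):
    """Decode a WIDTH bit 2's complement pattern back to a number."""
    if pattern & (1 << (WIDTH - 1)):   # sign bit set: value is negative
        pattern -= 1 << WIDTH
    return pattern / (1 << FRAC_BITS)


def show(pattern):
    """Format a pattern with the binary point inserted."""
    s = format(pattern, f"0{WIDTH}b")
    return s[:WIDTH - FRAC_BITS] + "." + s[WIDTH - FRAC_BITS:]


total = (to_fixed(2.625) + to_fixed(6.75)) & MASK
diff = (to_fixed(2.625) + to_fixed(-6.75)) & MASK   # subtract by adding the negation
print(show(total), from_fixed(total))  # 01001.0110 9.375
print(show(diff), from_fixed(diff))    # 11011.1110 -4.125
```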
Computer Arithmetic IEEE Floating Point
IEEE 754 Floating Point
Floating point numbers require a separate representation.
IEEE 754 is a "common" or "standardized" format.
Ex: 3.8125_10 = 011.1101_2 (fixed point format)
"Normalize" this to +1.11101_2 * 2^1
  1.11101 is called the mantissa
  2 is the base
  1 is the exponent
Need to represent and store the sign, exponent, and mantissa.
The base is assumed to be 2, so it is not stored.
Single Precision IEEE 754
Single precision format uses 32 bits (machine independent)
  1 bit for the sign
  8 bits for the exponent
  23 bits for the mantissa (unsigned representation!)
s e e e e e e e e f f f f f f f … f f f
Since all normalized numbers (except 0) start with 1, don't store that bit! It is assumed.
The binary point is assumed to lie immediately to the left of the first stored f bit.
Representing Exponents
The exponent accounts for 8 bits in IEEE 754 single precision format.
How to store an exponent? One option: use 2's complement, which allows both positive and negative exponents.
  If we need exponent = 5, we could store 0000 0101
  If we need exponent = -5, we could store 1111 1011
Excess 127 Notation
Compute in 2's complement, but store 127 + value instead: 127_10 = 0111 1111_2
Treat the result as an unsigned number.
  If we need exponent = 5, we store 0111 1111 + 0000 0101 = 1000 0100 (this is 132)
  If we need exponent = -5, we store 0111 1111 + 1111 1011 = 0111 1010 (this is 122; the carry out of the 8 bits is discarded)
Why do this? Comparing two exponents for which is "larger" is easier in excess 127: scanning from the left, find the first position where the two differ; the one with the "1" there is larger.
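A short sketch of the biased encoding follows. The function names are hypothetical, and the range check assumes that the stored values 0 and 255 are set aside for special cases, as the later slides on exceptions describe.

```python
BIAS = 127


def to_excess127(exponent):
    """Return the 8 bit stored form of an exponent in excess 127 notation."""
    stored = exponent + BIAS
    if not 1 <= stored <= 254:     # assumption: 0 and 255 are reserved
        raise ValueError("exponent outside the normalized single precision range")
    return format(stored, "08b")


def from_excess127(bit_string):
    """Recover the true exponent from its stored 8 bit form."""
    return int(bit_string, 2) - BIAS


print(to_excess127(5))    # 10000100
print(to_excess127(-5))   # 01111010
# Equal length bit strings compare left to right, exactly like unsigned numbers.
print(to_excess127(-5) < to_excess127(5))  # True
```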
Single Precision IEEE 754 Example (1)
0 1 0 0 0 0 0 0 0 1 1 1 0 1 0 0 … 0
Sign is 0 (bit position 31, the MSB)
Exponent is 1 0 0 0 0 0 0 0 (bit positions 30 to 23): 0111 1111 + 0000 0001 = 1000 0000, so the exponent is 128 - 127 = 1
Mantissa is 1.1110100…0 (the leading 1 is assumed; it is not stored in the representation above)
That's +1.11101_2 * 2^(128-127) = 1.90625_10 * 2^1 = 3.8125_10
Single Precision IEEE 754 Example (2)
1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 … 0
In HEX: C084 0000 (Big Endian)
Sign is 1, so the value is negative.
The exponent is the "unsigned stored value" - 127 = 129 - 127 = 2
That's -1.00001_2 * 2^(129-127) = -1.03125 * 2^2 = -4.125
Single Precision IEEE 754 Example (3)
Ex: What is 07D0 0000 (Big Endian)?
0 0 0 0 0 1 1 1 1 1 0 1 0 0 0 0 … 0
The exponent is the "unsigned stored value" - 127 = 15 - 127 = -112
That's +1.101_2 * 2^(-112) = 1.625 * 2^(-112) = a very small number (roughly 3.1 * 10^-34)
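The three worked examples can be decoded field by field and cross-checked against Python's standard struct module. This is a sketch for normalized numbers only; the function name decode_single and the tuple it returns are choices made for this illustration.

```python
import struct


def decode_single(hex_word):
    """Decode a big endian single precision hex word (normalized values only).

    Returns (sign, true exponent, significand, value).
    """
    bits = int(hex_word.replace(" ", ""), 16)
    sign = bits >> 31
    stored_exp = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if stored_exp in (0, 255):
        raise ValueError("special case: not a normalized number")
    significand = 1 + fraction / 2 ** 23      # restore the assumed leading 1
    value = (-1) ** sign * significand * 2.0 ** (stored_exp - 127)
    return sign, stored_exp - 127, significand, value


for word in ("4074 0000", "C084 0000", "07D0 0000"):
    by_hand = decode_single(word)
    # Cross-check with the library's own interpretation of the same 4 bytes.
    (library,) = struct.unpack(">f", bytes.fromhex(word.replace(" ", "")))
    print(word, by_hand, by_hand[3] == library)
```

4074 0000 is the hex form of the bit pattern in Example (1); it is derived here by grouping those bits into nibbles and is not given on the slide.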
Double Precision IEEE 754
Double precision format uses 64 bits (machine independent)
  1 bit for the sign
  11 bits for the exponent
  52 bits for the mantissa (unsigned)
s e e e e e e e e e e e f f f f f f f … f f f
Use excess 1023 (2^10 - 1) for exponents: 1023_10 = 011 1111 1111_2
As in single precision, the binary point is assumed to lie immediately to the left of the first stored f bit.
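Python floats are double precision on common platforms, so the field layout can be inspected directly. The sketch below splits a float into its three fields with the struct module; split_double is a hypothetical helper name.

```python
import struct


def split_double(x):
    """Split a float into (sign, stored exponent, stored fraction) fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))  # reinterpret as 64 bit int
    sign = bits >> 63
    stored_exp = (bits >> 52) & 0x7FF          # 11 exponent bits
    fraction = bits & ((1 << 52) - 1)          # 52 mantissa bits
    return sign, stored_exp, fraction


sign, stored_exp, fraction = split_double(3.8125)
# 3.8125 = +1.11101_2 * 2^1, so the stored exponent is 1 + 1023 = 1024
print(sign, stored_exp - 1023, format(fraction, "052b")[:8])  # 0 1 11101000
```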
Exceptions to IEEE 754
The rules above apply when the stored exponent is neither 00 nor FF (= 255_10).
Special rules apply when Exponent = 00 or Exponent = FF. These special rules provide for
  NaN (Not a Number)
  +/- Inf (infinity)
  +/- 0 (yes, two zeros!)
  "unnormalized" numbers, which allow very, very small values, down to 2^-149, the smallest positive single precision number (far smaller than machine epsilon, 2^-23, the gap between 1 and the next representable number)
Exceptions to IEEE 754 (2)
Special cases for single precision (S is the sign bit, E is the stored exponent, F is the stored mantissa):
  If E=255 and F is nonzero, then Value=NaN ("Not a Number")
  If E=255 and F is zero and S is 1, then Value=-Infinity
  If E=255 and F is zero and S is 0, then Value=Infinity
  If E=0 and F is nonzero, then Value=(-1)^S * 2^(-126) * (0.F). These are "unnormalized" values.
  If E=0 and F is zero and S is 1, then Value=-0
  If E=0 and F is zero and S is 0, then Value=0
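The case analysis above translates almost line for line into code. This is a minimal sketch for single precision; the function name classify_single and the label strings it returns are assumptions made for this illustration.

```python
def classify_single(bits):
    """Classify a 32 bit pattern (given as an int) and return (kind, value)."""
    sign = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    signed_one = -1.0 if sign else 1.0
    if e == 255:
        if f != 0:
            return "NaN", float("nan")
        return "infinity", signed_one * float("inf")
    if e == 0:
        if f == 0:
            return "zero", signed_one * 0.0
        # Unnormalized: no assumed leading 1, exponent fixed at -126.
        return "unnormalized", signed_one * (f / 2 ** 23) * 2.0 ** -126
    # Ordinary normalized number: assumed leading 1, excess 127 exponent.
    return "normalized", signed_one * (1 + f / 2 ** 23) * 2.0 ** (e - 127)


print(classify_single(0x7F800000))   # ('infinity', inf)
print(classify_single(0x00000001))   # smallest positive value, 2^-149
print(classify_single(0xC0840000))   # ('normalized', -4.125)
```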
Test Yourself
Using Single Precision IEEE 754, what is FF28 0000 (Big Endian)?
Using Single Precision IEEE 754, what is 8038 0000 (Big Endian)?