Lecture 6: Floating Point Number Representation Information Representation: Floating Point Number Representation Lecture # 7.

Fractional Numbers

Examples:
(456.78)_10 = 4 x 10^2 + 5 x 10^1 + 6 x 10^0 + 7 x 10^-1 + 8 x 10^-2
(1011.11)_2 = 1 x 2^3 + 0 x 2^2 + 1 x 2^1 + 1 x 2^0 + 1 x 2^-1 + 1 x 2^-2
            = 8 + 0 + 2 + 1 + 1/2 + 1/4
            = 11 + 0.5 + 0.25
            = (11.75)_10

Conversion from the binary number system to the decimal system

Example:
(111.11)_2 = 1 x 2^2 + 1 x 2^1 + 1 x 2^0 + 1 x 2^-1 + 1 x 2^-2
           = 4 + 2 + 1 + 1/2 + 1/4
           = (7.75)_10

Example: (11.011)_2, using the place values of each bit position:

Position:   2     1     0     -1     -2     -3
Weight:     2^2   2^1   2^0   2^-1   2^-2   2^-3
Value:      4     2     1     1/2    1/4    1/8

(11.011)_2 = 2 + 1 + 1/4 + 1/8 = (3.375)_10
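The positional expansion above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture; the function name binary_to_decimal is hypothetical, and it assumes an unsigned binary string with at most one point.

```python
def binary_to_decimal(text: str) -> float:
    """Sum bit x weight for an unsigned binary string such as '1011.11'."""
    whole, _, frac = text.partition(".")
    value = 0.0
    # Integer part: the rightmost bit has weight 2^0, the next 2^1, and so on.
    for position, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** position
    # Fractional part: the first bit after the point has weight 2^-1.
    for position, bit in enumerate(frac, start=1):
        value += int(bit) * 2 ** -position
    return value


print(binary_to_decimal("1011.11"))  # 11.75
print(binary_to_decimal("111.11"))   # 7.75
```

All weights used here are powers of two, so the floating point sums in these examples are exact.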

Conversion from the decimal number system to the binary system

Example: (7.75)_10 = (?)_2

1. Conversion of the integer part: same as before, repeated division by 2
   7 / 2 = 3 (Q), 1 (R)
   3 / 2 = 1 (Q), 1 (R)
   1 / 2 = 0 (Q), 1 (R)
   (7)_10 = (111)_2

2. Conversion of the fractional part: repeatedly multiply by 2 and extract the integer part of each result
   0.75 x 2 = 1.50 -> extract 1
   0.5 x 2 = 1.0 -> extract 1
   0.0 -> stop
   (0.75)_10 = (0.11)_2

Combine the results from the integer and fractional parts: (7.75)_10 = (111.11)_2

Another example to try: (5.625)_10 = (?)_2. Write the bits in the same order as the place values:

Weight:   4   2   1   .   1/2     1/4      1/8
                          = 0.5   = 0.25   = 0.125
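The two procedures above (repeated division for the integer part, repeated multiplication for the fractional part) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the input is a non-negative decimal string, and exact rational arithmetic (fractions.Fraction) is used so the sketch itself does not introduce rounding.

```python
from fractions import Fraction


def int_to_binary(n: int) -> str:
    """Repeated division by 2; remainders are read from last to first."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)
        remainders.append(str(r))
    return "".join(reversed(remainders))


def frac_to_binary(frac: Fraction, max_bits: int = 16) -> str:
    """Repeated multiplication by 2; extracted bits are read first to last."""
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)      # integer part of the product: 0 or 1
        bits.append(str(bit))
        frac -= bit          # keep only the fractional part
    return "".join(bits)


def decimal_to_binary(text: str, max_bits: int = 16) -> str:
    """Convert a non-negative decimal string such as '7.75' to binary."""
    value = Fraction(text)
    whole = int(value)
    frac_bits = frac_to_binary(value - whole, max_bits)
    result = int_to_binary(whole)
    return f"{result}.{frac_bits}" if frac_bits else result


print(decimal_to_binary("7.75"))   # 111.11
print(decimal_to_binary("5.625"))  # 101.101
```

The max_bits limit is there because some fractions never terminate in binary, so the loop needs a cut-off.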

Fractional Numbers (cont.)

Exercise 1: Convert (0.625)_10 to its binary form
Solution:
0.625 x 2 = 1.25 -> extract 1
0.25 x 2 = 0.5 -> extract 0
0.5 x 2 = 1.0 -> extract 1
0.0 -> stop
(0.625)_10 = (0.101)_2

Exercise 2: Convert (0.6)_10 to its binary form
Solution:
0.6 x 2 = 1.2 -> extract 1
0.2 x 2 = 0.4 -> extract 0
0.4 x 2 = 0.8 -> extract 0
0.8 x 2 = 1.6 -> extract 1
0.6 x 2 = ... (the steps repeat)
(0.6)_10 = (0.1001 1001 1001 ...)_2

Fractional Numbers (cont.)

Exercise 3: Convert (0.8125)_10 to its binary form
Solution:
0.8125 x 2 = 1.625 -> extract 1
0.625 x 2 = 1.25 -> extract 1
0.25 x 2 = 0.5 -> extract 0
0.5 x 2 = 1.0 -> extract 1
0.0 -> stop
(0.8125)_10 = (0.1101)_2
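The exercises show two outcomes: the fraction reaches 0.0 and the expansion stops, or an earlier fraction reappears and the bits repeat forever. A minimal sketch that distinguishes the two cases is shown here. The function name fraction_bits is hypothetical, and the input is assumed to be a decimal string with a value between 0 and 1.

```python
from fractions import Fraction


def fraction_bits(text: str) -> tuple[str, str]:
    """Return (non-repeating bits, repeating block) for a fraction in [0, 1)."""
    frac = Fraction(text)
    first_seen = {}   # fraction value -> index of the bit it produced
    bits = []
    while frac != 0 and frac not in first_seen:
        first_seen[frac] = len(bits)
        frac *= 2
        bit = int(frac)          # extract the integer part (0 or 1)
        bits.append(str(bit))
        frac -= bit
    digits = "".join(bits)
    if frac == 0:
        return digits, ""        # terminating expansion
    start = first_seen[frac]     # the cycle begins where this value first occurred
    return digits[:start], digits[start:]


print(fraction_bits("0.625"))   # ('101', '')
print(fraction_bits("0.8125"))  # ('1101', '')
print(fraction_bits("0.6"))     # ('', '1001')
```

An empty second element means the expansion terminates; otherwise the second element is the block that repeats indefinitely.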

Fractional Numbers (cont.)

Errors

One source of error in computations is the back-and-forth conversion between decimal and binary formats.

Example: (0.6)_10 + (0.6)_10 = (1.2)_10

Since (0.6)_10 = (0.1001 1001 1001 ...)_2, let's assume an 8-bit representation: (0.6)_10 ≈ (0.1001 1001)_2. Therefore:

    0.6        0.10011001
  + 0.6      + 0.10011001
             ------------
               1.00110010

Let's convert back to the decimal system:
(1.00110010)_2 = 1 x 2^0 + 0 x 2^-1 + 0 x 2^-2 + 1 x 2^-3 + 1 x 2^-4 + 0 x 2^-5 + 0 x 2^-6 + 1 x 2^-7 + 0 x 2^-8
               = 1 + 1/8 + 1/16 + 1/128
               = 1.1953125

Error = 1.2 - 1.1953125 = 0.0046875
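The error in this example can be reproduced with exact rational arithmetic. The sketch below assumes that the 8-bit representation is obtained by truncation (dropping all bits after the eighth), which matches the bit pattern used above. The helper name truncate_to_bits is hypothetical.

```python
from fractions import Fraction


def truncate_to_bits(value: Fraction, bits: int) -> Fraction:
    """Keep only the first `bits` binary digits after the point (truncation)."""
    scale = 2 ** bits
    return Fraction(int(value * scale), scale)


exact = Fraction("0.6")
stored = truncate_to_bits(exact, 8)   # (0.10011001)_2
total = stored + stored               # (1.00110010)_2
error = 2 * exact - total

print(float(stored))  # 0.59765625
print(float(total))   # 1.1953125
print(float(error))   # 0.0046875
```

Each stored operand is already short by 0.00234375, and the addition doubles that shortfall.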

Floating Point Number Representation

If x is a real number, then its normal form representation is:

x = f x Base^E

where f is the mantissa and E is the exponent.

Examples:
(125.32)_10 = 0.12532 x 10^3    (mantissa 0.12532, exponent 3)
-(125.32)_10 = -0.12532 x 10^3
(0.0546)_10 = 0.546 x 10^-1

The mantissa is normalized, so the first digit after the fractional point is non-zero. If needed, the mantissa is shifted so that the first digit after the fractional point is non-zero, and the exponent is adjusted accordingly.

Examples:
(134.15)_10 = 0.13415 x 10^3
(0.0021)_10 = 0.21 x 10^-2
(101.11)_B = ?
(0.011)_B = ?
(AB.CD)_H = ?
(0.00AC)_H = ?
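For the decimal examples, the normal form x = f x 10^E with a non-zero first digit after the point can be sketched with Python's decimal module. The function name normalize_decimal is hypothetical, and the sketch covers base 10 only.

```python
from decimal import Decimal


def normalize_decimal(text: str) -> tuple[Decimal, int]:
    """Return (f, E) such that x = f x 10^E and the first digit of f after the point is non-zero."""
    x = Decimal(text)
    if x == 0:
        return Decimal(0), 0      # zero has no non-zero digit to normalize on
    # adjusted() is the power of ten of the most significant digit;
    # one more shift moves that digit just after the point.
    exponent = x.adjusted() + 1
    mantissa = x.scaleb(-exponent)
    return mantissa, exponent


print(normalize_decimal("125.32"))   # (Decimal('0.12532'), 3)
print(normalize_decimal("-125.32"))  # (Decimal('-0.12532'), 3)
print(normalize_decimal("0.0546"))   # (Decimal('0.546'), -1)
```

The sign stays with the mantissa, as in the -125.32 example.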

Assume we use a 16-bit binary pattern for the normalized binary form, based on the following convention (MSB to LSB):

Sign of mantissa (±) = leftmost bit (0: +; 1: -)
Mantissa (f) = next 11 bits
Sign of exponent (±) = next bit (0: +; 1: -)
Exponent (E) = next three bits

x = ± f x Base^(±E)

Bit layout, MSB to LSB:

| sign of mantissa | ?1 ?2 ?3 ?4 ?5 ?6 ?7 ?8 ?9 ?10 ?11 | sign of exponent | b1 b2 b3 |
|      1 bit       |              11 bits               |      1 bit       |  3 bits  |

f = 0.?1 ?2 ?3 ?4 ... ?11 ?12 ... ?15, of which the first 11 bits (?1 to ?11) are stored
E: converted to binary, b1 b2 b3

Floating Point Number Representation

Question: How does the computer express the 16-bit approximation of 1110.111010111111 in normalized binary form, using the following convention?

Sign of mantissa = leftmost bit (0: +; 1: -)
Mantissa = next 11 bits
Sign of exponent = next bit (0: +; 1: -)
Exponent = next three bits

Answer:

Step 1: Normalization
1110.111010111111 = +1.110111010111111 x 2^+3

Step 2: "Plant" the 16 bits
The 16-bit floating point representation is:

0       11101110101       0       011
sign    mantissa          sign    exponent
1 bit   11 bits           1 bit   3 bits
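The "planting" step is plain concatenation of the four fields in MSB-to-LSB order. The sketch below shows only that step and takes the field values from the worked answer above; it does not perform the normalization itself. The function name pack_fields is hypothetical.

```python
def pack_fields(mantissa_sign: int, mantissa_bits: str,
                exponent_sign: int, exponent: int) -> str:
    """Concatenate the four fields of the 16-bit pattern, MSB to LSB."""
    if mantissa_sign not in (0, 1) or exponent_sign not in (0, 1):
        raise ValueError("sign bits must be 0 (+) or 1 (-)")
    if len(mantissa_bits) != 11 or set(mantissa_bits) - {"0", "1"}:
        raise ValueError("mantissa must be exactly 11 binary digits")
    if not 0 <= exponent <= 7:
        raise ValueError("exponent magnitude must fit in 3 bits")
    # 1 sign bit + 11 mantissa bits + 1 sign bit + 3 exponent bits = 16 bits
    return f"{mantissa_sign}{mantissa_bits}{exponent_sign}{exponent:03b}"


pattern = pack_fields(0, "11101110101", 0, 3)
print(pattern)  # 0111011101010011
```

The mantissa field holds the first 11 significant bits of the number; the remaining bits (11111) are dropped, which is why the stored pattern is only an approximation.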
