1
**Introduction to Computer Systems**

Lecturer: Steve Maybank, Department of Computer Science and Information Systems, Birkbeck College, U. London. Autumn 2013, Week 4a: Floating Point Notation for Binary Fractions. 22 October 2013.

2
**Recap: Binary Numbers**

In the standard notation for binary numbers, a string of binary digits such as 1001 stands for a sum of powers of 2: $1\times 2^3 + 0\times 2^2 + 0\times 2^1 + 1\times 2^0$. Binary[1001] and Decimal[9] are different names for the same number.
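The sum of powers of 2 can be checked with a short sketch. The function name `binary_to_decimal` is a hypothetical helper chosen for illustration, not something from the lecture.

```python
def binary_to_decimal(bits: str) -> int:
    """Evaluate a string of binary digits as a sum of powers of 2."""
    total = 0
    # The rightmost digit multiplies 2**0, the next one 2**1, and so on.
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total


print(binary_to_decimal("1001"))  # 9
```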

3
**Recap: Binary Addition**

Columns are numbered from the right, starting at column 0. In the worked example there is a carry from column 0 to column 1 and from column 1 to column 2. In tests or examinations, always show the carries. Brookshear, Section 1.5
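A minimal sketch of column-by-column addition that records every carry. The operands 0011 and 0001 are an assumption chosen so that the carries match the ones described (column 0 to 1, then column 1 to 2); they are not necessarily the numbers used on the slide, and `add_binary` is a hypothetical helper.

```python
def add_binary(a: str, b: str):
    """Add two binary strings, returning the sum and the list of carries."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []   # result digits, least significant first
    carries = []  # (from_column, to_column) pairs, columns counted from the right
    for column in range(width):
        total = int(a[-1 - column]) + int(b[-1 - column]) + carry
        digits.append(str(total % 2))
        carry = total // 2
        if carry:
            carries.append((column, column + 1))
    if carry:
        digits.append("1")
    return "".join(reversed(digits)), carries


result, carries = add_binary("0011", "0001")
print(result)   # 0100
print(carries)  # [(0, 1), (1, 2)]
```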

4
**Recap: Two’s Complement**

Two’s complement representations can be added as if they were standard binary numbers. Brookshear, Section 1.6
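As an illustration, the following sketch adds two 4 bit two's complement patterns as if they were ordinary binary numbers and discards any carry out of the leftmost column. The 4 bit width, the operands −3 and 5, and the helper names are assumptions made for this example.

```python
def to_twos_complement(value: int, bits: int = 4) -> str:
    """Bit pattern of value in two's complement with the given width."""
    return format(value % (1 << bits), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Integer represented by a two's complement bit pattern."""
    raw = int(pattern, 2)
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


def add_patterns(a: str, b: str) -> str:
    """Add two patterns as standard binary numbers, dropping the final carry."""
    bits = len(a)
    total = (int(a, 2) + int(b, 2)) % (1 << bits)
    return format(total, f"0{bits}b")


x = to_twos_complement(-3)   # 1101
y = to_twos_complement(5)    # 0101
total = add_patterns(x, y)   # 0010
print(x, "+", y, "=", total, "->", from_twos_complement(total))
```

The sketch does not detect overflow, so it is only meaningful when the true sum fits in the chosen width.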

5
**Numbers in Computing**

The cells in the table cover all numbers, with two marks: * for numbers that can be stored in memory, and # for numbers that can be referred to in a program. Example: 0.1 cannot be stored exactly in memory in IEEE double precision floating point, but the following is a correct Java statement: `t = 0.1;`
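The same effect can be observed in Python, whose `float` type is an IEEE double on common platforms. This sketch converts the stored value of `0.1` to an exact fraction and compares it with 1/10.

```python
from decimal import Decimal
from fractions import Fraction

t = 0.1                      # a legal statement, just as in Java
stored = Fraction(t)         # the exact value actually held in memory

print(stored == Fraction(1, 10))  # False: the stored value is not 1/10
print(Decimal(t))                 # exact decimal expansion of the stored value
```

The denominator of `stored` is a power of 2, which is why it cannot equal 1/10.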

6
**Spacing Between Numbers**

Two’s complement: equally spaced numbers. Floating point: big gaps between big numbers, small gaps between small numbers.

7
**The Key: Exponents**

Powers of 2: $2^{-4}, 2^{-3}, 2^{-2}, 2^{-1}, 2^{0}, 2^{1}, 2^{2}, 2^{3}$

Values: 1/16, 1/8, ¼, ½, 1, 2, 4, 8

Big gaps between big numbers, small gaps between small numbers.
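To see the spacing concretely, here is a minimal sketch that lists the positive values of a toy format: a 4 bit mantissa beginning with 1, scaled by $2^r$ for r from −4 to 3. The format and the exponent range are assumptions made for illustration.

```python
from fractions import Fraction


def positive_values():
    """All positive values 2**r * 0.m1m2m3m4 with m1 = 1 and -4 <= r <= 3."""
    values = []
    for r in range(-4, 4):
        for m in range(8, 16):  # mantissa bits 1000 .. 1111
            values.append(Fraction(m, 16) * Fraction(2) ** r)
    return sorted(values)


values = positive_values()
print("smallest gap:", values[1] - values[0])    # 1/256
print("largest gap:", values[-1] - values[-2])   # 1/2
```

The gap doubles each time the exponent increases by one, which is the pattern described above.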

8
**Example of a Binary Fraction**

The binary fraction has three parts: the sign (here −), the position of the radix point, and the bit string. Brookshear, Section 1.7

9
**Reconstruction of a Binary Fraction**

The sign is +. The position of the radix point is just to the right of the second bit from the left. The bit string is […]. What is the binary fraction? Brookshear, Section 1.7

10
**Summary**

To represent a binary fraction three pieces of information are needed: the sign, the position of the radix point, and the bit string. Brookshear, Section 1.7

11
**Standard Form for a Binary Fraction**

Any non-zero binary fraction can be written in the form $\pm 2^r \times 0.t$, where t is a bit string beginning with 1. Examples: […] = $+2^2 \times$ […] and […] = $-2^{-1} \times$ […]. Brookshear, Section 1.7
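A minimal sketch of the standard form, using Python's `math.frexp`, which returns a fraction in [0.5, 1) and an exponent, exactly the 0.t and r of the form above. The inputs 2.625 and −0.375 are illustrative choices, and `standard_form` is a hypothetical helper.

```python
import math


def standard_form(x: float):
    """Return (sign, r, t) with x = sign 2**r * 0.t and t beginning with 1."""
    if x == 0:
        raise ValueError("zero has no standard form")
    fraction, r = math.frexp(abs(x))   # abs(x) == fraction * 2**r, 0.5 <= fraction < 1
    sign = "-" if x < 0 else "+"
    bits = []
    while fraction:                    # peel off the bits after the radix point
        fraction *= 2
        bit = int(fraction)
        bits.append(str(bit))
        fraction -= bit
    return sign, r, "".join(bits)


print(standard_form(2.625))    # ('+', 2, '10101'), i.e. 10.101 = +2^2 x 0.10101
print(standard_form(-0.375))   # ('-', -1, '11'),   i.e. -0.011 = -2^-1 x 0.11
```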

12
**Floating Point Representation**

Write a non-zero binary fraction in the form $\pm 2^r \times 0.t$. Then:

1. Record the sign as bit string s1.
2. Record r as bit string s2.
3. Record t as bit string s3.
4. Output s1||s2||s3.

Brookshear, Section 1.7

13
**Floating Point Notation**

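8 bit floating point: s e1 e2 e3 m1 m2 m3 m4

| Field | Bits | Width | Holds |
|---|---|---|---|
| sign | s | 1 bit | sign |
| exponent | e1 e2 e3 | 3 bits | r (radix point position) |
| mantissa | m1 m2 m3 m4 | 4 bits | bit string t |

The exponent is in 3 bit excess notation. Brookshear, Section 1.7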

14
**To Find the Floating Point Notation**

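Write the non-zero number as $\pm 2^r \times 0.t$.

1. If the sign is −, then s1 = 1, else s1 = 0.
2. s2 = 3 bit excess notation for r (the pattern for r + 4).
3. s3 = the leftmost four bits of t, padded with zeros on the right if t has fewer than four bits. Any further bits of t are dropped.

Brookshear, Section 1.7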
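The procedure can be sketched in a few lines. This is an illustration under stated assumptions rather than a reference implementation: the exponent uses excess 4 (so r must lie between −4 and 3), the mantissa keeps the first four bits of t and truncates the rest, and `encode` is a hypothetical name.

```python
import math


def encode(x: float) -> str:
    """8 bit floating point notation s1||s2||s3 for a non-zero number x."""
    if x == 0:
        raise ValueError("zero has no standard form")
    fraction, r = math.frexp(abs(x))      # abs(x) == 2**r * 0.t
    if not -4 <= r <= 3:
        raise ValueError("exponent does not fit in 3 bit excess notation")
    s1 = "1" if x < 0 else "0"
    s2 = format(r + 4, "03b")             # excess 4: store r + 4
    s3 = format(int(fraction * 16), "04b")  # first four bits of t, rest truncated
    return s1 + s2 + s3


print(encode(-0.375))  # 10111100
print(encode(2.5))     # 01101010
```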

15
**Example**

b = […], so s1 = 1. In standard form b = $-2^{-2} \times$ […], so the exponent is −2 and s2 = 010. Floating point notation: […].

16
**Second Example**

Floating point notation: 10111100. s1 = 1, therefore negative. s2 = 011, so the exponent is −1. s3 = 1100. Binary fraction = $-2^{-1} \times 0.1100$ = −0.011 = −3/8.
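Decoding follows the same three fields in reverse. The sketch below assumes the layout used in this example (1 sign bit, 3 bit excess 4 exponent, 4 bit mantissa); `decode` is a hypothetical helper.

```python
from fractions import Fraction


def decode(pattern: str) -> Fraction:
    """Value of an 8 bit floating point pattern s1||s2||s3."""
    sign = -1 if pattern[0] == "1" else 1
    r = int(pattern[1:4], 2) - 4                 # undo the excess 4 offset
    mantissa = Fraction(int(pattern[4:], 2), 16)  # 0.m1m2m3m4 as a fraction
    return sign * mantissa * Fraction(2) ** r


print(decode("10111100"))  # -3/8
```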

17
**Class Examples**

Find the floating point representation of the decimal number −1 1/8. Find the decimal number which has the floating point representation […].

18
**Round-Off Error**

In binary, 2 5/8 = 10.101 and 2 ½ = 10.100. The 8 bit floating point notations for 2 5/8 and 2 ½ are the same: 01101010, because only the first four bits of the mantissa are kept. The error in approximating 2 5/8 with 01101010 is round-off error or truncation error. Brookshear, Section 1.7

19
**Floating Point Addition**

Let [x] be the floating point number closest to the number x. Floating point addition is defined by […]. Each operation w ↦ [w] may introduce round-off.

20
**Examples of Floating Point Addition**

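Floating point notations: 2 ½: 01101010. 1/8: 00101000. ¼: 00111000. 2 ¾: 01101011.

The order of the additions changes the result. Adding the two small numbers first: 2 ½ + (1/8 + 1/8) = 2 ½ + ¼ = 2 ¾. Adding from the left: (2 ½ + 1/8) + 1/8 = 2 ½ + 1/8 = 2 ½, because 2 5/8 is stored as 2 ½.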
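The effect of adding 2 ½, 1/8 and 1/8 in two different orders can be reproduced with a small simulation. Two assumptions are made here: the addition rounds its operands and its result, and rounding is modelled as truncation to a 4 bit mantissa (the exponent range is ignored). The names `truncate` and `fp_add` are hypothetical.

```python
import math
from fractions import Fraction


def truncate(x: Fraction) -> Fraction:
    """Largest value 2**r * 0.m1m2m3m4 that does not exceed the positive number x."""
    r = 0
    while x >= Fraction(2) ** r:        # raise r until x < 2**r
        r += 1
    while x < Fraction(2) ** (r - 1):   # lower r until 2**(r-1) <= x
        r -= 1
    scale = Fraction(2) ** r
    return Fraction(math.floor(x / scale * 16), 16) * scale


def fp_add(x: Fraction, y: Fraction) -> Fraction:
    """Toy floating point addition: round the operands, add, round the sum."""
    return truncate(truncate(x) + truncate(y))


big, small = Fraction(5, 2), Fraction(1, 8)
print(fp_add(fp_add(big, small), small))   # 5/2
print(fp_add(big, fp_add(small, small)))   # 11/4
```

Floating point addition is therefore not associative in this toy format: the small terms are lost when each is added to the large one separately.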

21
**Round-Off in Decimal and Binary**

1/5 = 0.2 exactly in decimal notation. 1/5 = 0.001100110011… in binary notation, with the block 0011 repeating forever. 1/5 cannot be represented exactly in binary floating point no matter how many bits are used. Round-off is unavoidable but it is reduced by using more bits.
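The repeating expansion can be generated by repeated doubling. This sketch uses exact fractions so that no round-off enters the computation; `binary_digits` is a hypothetical helper.

```python
from fractions import Fraction


def binary_digits(x: Fraction, count: int) -> str:
    """First `count` binary digits after the radix point of x, for 0 <= x < 1."""
    digits = []
    for _ in range(count):
        x *= 2
        bit = 1 if x >= 1 else 0   # the integer part is the next binary digit
        digits.append(str(bit))
        x -= bit
    return "0." + "".join(digits)


print(binary_digits(Fraction(1, 5), 12))  # 0.001100110011
```

After four doublings the remainder is 1/5 again, so the digits 0011 repeat without end.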

22
**Size of Round-Off Error E(x)**

$E(x)/x \approx \alpha$, where $\alpha$ is a constant. If $x > 0$, $y > 0$ and $|x-y| < \alpha x$, then $x-y$ cannot be found accurately using floating point arithmetic.

23
**Examples**

4 ¼ = 100.01, fpoint = 01111000. 4 = 100, fpoint = 01111000. 4 ¼ − 4: the two notations are identical, so the floating point difference is 0 rather than ¼.

½ = 0.1, fpoint = 01001000. ¼ = 0.01, fpoint = 00111000. ½ − ¼ -> fpoint = 00111000, which is ¼ exactly.
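The same loss of a small difference happens in IEEE double precision. In this sketch `sys.float_info.epsilon` stands in for the constant α, which is an assumption made for illustration; the lecture does not give a value for α.

```python
import sys

alpha = sys.float_info.epsilon   # spacing of doubles just above 1.0
x = 1.0
y = 1.0 + alpha / 4              # the exact sum differs from x by less than alpha * x

print(y == x)    # True: y was rounded back to 1.0 when it was stored
print(y - x)     # 0.0, although the exact difference is alpha / 4
```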

24
**Floating Point Errors**

Overflow: the number is too large to be represented. Underflow: the number is not equal to 0 but is too small to be represented. Invalid operation: e.g. SquareRoot[-1]. See […].
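The three kinds of error can be provoked in Python, which uses IEEE doubles on common platforms. How each case is reported (a special value or an exception) is a property of Python and its `math` module, not of the lecture's definitions.

```python
import math

# Overflow: the product is too large for a double, so the result is infinity.
overflowed = 1e308 * 10
print(overflowed)        # inf

# Some math functions report overflow with an exception instead.
try:
    math.exp(1000)
except OverflowError as error:
    print("overflow:", error)

# Underflow: the exact product is non-zero but too small, so the result is 0.0.
underflowed = 1e-200 * 1e-200
print(underflowed)       # 0.0

# Invalid operation: the square root of -1 is not a real number.
try:
    math.sqrt(-1)
except ValueError as error:
    print("invalid operation:", error)
```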

25
**IEEE Standard for Floating Point Arithmetic**

Single precision, 32 bits.

| Field | Bits |
|---|---|
| Sign s | bit 0 |
| Exponent e | bits 1-8 |
| Mantissa m | bits 9-31 |

If $0 < e < 255$, then value = $(-1)^s \times 2^{e-127} \times 1.m$. If e = 0, s = 0, m = 0, then value = 0. If e = 0, s = 1, m = 0, then value = −0. For a general discussion of fp arithmetic see […].
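The fields of a single precision number can be inspected with Python's `struct` module. This is a sketch: the helper names are hypothetical, and only the case 0 < e < 255 is handled. Bit 0 in the slide's numbering is the leftmost (most significant) bit.

```python
import struct


def single_fields(x: float):
    """Return (s, e, m) for x stored as an IEEE single precision number."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # the 32 bits as an integer
    s = raw >> 31             # bit 0: sign
    e = (raw >> 23) & 0xFF    # bits 1-8: exponent
    m = raw & 0x7FFFFF        # bits 9-31: mantissa
    return s, e, m


def value_from_fields(s: int, e: int, m: int) -> float:
    """Evaluate (-1)**s * 2**(e - 127) * 1.m for 0 < e < 255."""
    if not 0 < e < 255:
        raise ValueError("only the case 0 < e < 255 is handled in this sketch")
    return (-1) ** s * 2.0 ** (e - 127) * (1 + m / 2 ** 23)


s, e, m = single_fields(-0.375)
print(s, e, format(m, "023b"))     # 1 125 10000000000000000000000
print(value_from_fields(s, e, m))  # -0.375
```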
