Real Number Representation (Lecture 25 of the Introduction to Computer Programming series) Dr Damian Conway, Room 132, Building 26

Some Terminology All digits in a number following any leading zeros are significant digits. For example, each of these numbers has five significant digits: 12.345, -0.12345, 0.00012345

Some Terminology The scientific notation for real numbers is: mantissa × base^exponent

Some Terminology The mantissa is always normalized between 1 and the base (i.e. exactly one significant figure before the point):

    Normalized                Unnormalized
    2.9979 × 10^8             2997.9 × 10^5
    B.139FC × 16^12           B1.39FC × 16^11
    1.0110110101 × 2^-3       0.010110110101 × 2^-1

Some Terminology The precision of a number is how many digits (or bits) we use to represent it. For example:
    3
    3.14
    3.1415926
    3.1415926535897932384626433832795028

Representing numbers A real number n is represented by a floating-point approximation n*. The computer uses 32 bits (or more) to store each approximation. It needs to store the mantissa, the sign of the mantissa, and the exponent (with its sign).

Representing numbers So it has to allocate some of its 32 bits to each task. The standard way to do this (specified by IEEE standard 754) is:

Representing numbers 23 bits for the mantissa; 1 bit for the mantissa's sign (i.e. the mantissa is signed magnitude); The remaining 8 bits for the exponent.

Representing the mantissa Since the mantissa has to be in the range 1 ≤ mantissa < base, if we use base 2 the digit before the point has to be a 1. So we don't have to worry about storing it! That way we get 24 bits of precision using only 23 bits.

Representing the mantissa Those 24 bits of precision are equivalent to a little over 7 decimal digits (24 × log10 2 ≈ 7.2):

Representing the mantissa Suppose we want to represent π: 3.1415926535897932384626433832795..... That means that we can only represent it as: 3.141592 (if we truncate) or 3.141593 (if we round)

Representing the mantissa Even if the computer appears to give you more than seven decimal places, only the first seven are meaningful. For example:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        float pi = 2 * asin(1);   /* asin(1) is pi/2 */
        printf("%.35f\n", pi);
        return 0;
    }

Representing the mantissa On my machine this prints out: 3.1415927419125732000000000000000000

Representing the mantissa On my machine this prints out: 3.1415927419125732000000000000000000

Representing the exponent The exponent is represented as an excess-127 number. That is:

    00000000  →  –127
    00000001  →  –126
    01111111  →     0
    10000000  →    +1
    11111111  →  +128

Representing the exponent However, the IEEE standard restricts exponents to the range: –126 ≤ exponent ≤ +127. The exponents –127 and +128 have special meanings (basically, zero and infinity respectively).
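
To make the layout concrete, here is a small sketch (my own illustration, not part of the lecture) that pulls the sign, exponent and mantissa fields out of a float on a typical IEEE 754 machine:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        float f = 2.9979e8f;             /* the speed-of-light example from earlier */
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);  /* reinterpret the 32 bits of the float */

        uint32_t sign     = bits >> 31;           /* 1 bit               */
        uint32_t exponent = (bits >> 23) & 0xFF;  /* 8 bits, excess-127  */
        uint32_t mantissa = bits & 0x7FFFFF;      /* 23 bits; the leading 1 is hidden */

        printf("sign = %u, exponent = %u (i.e. %d), mantissa = 0x%06X\n",
               (unsigned)sign, (unsigned)exponent,
               (int)exponent - 127, (unsigned)mantissa);
        return 0;
    }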

Floating point overflow Just like the integer representations in the previous lecture, floating point representations can overflow:

    9.999999 × 10^127 + 1.111111 × 10^127 = 1.1111110 × 10^128

The exponent 128 is outside the representable range, so the result overflows and is treated as ∞.

Floating point underflow But floating point numbers can also get too small:

    1.000000 × 10^-126 ÷ 2.000000 × 10^0 = 5.000000 × 10^-127

The exponent –127 is outside the representable range, so the result underflows (and is typically treated as zero).
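
Both effects are easy to provoke on a real machine. A quick sketch (using the single-precision limits from <float.h>; the printed results assume ordinary IEEE 754 behaviour):

    #include <float.h>
    #include <stdio.h>

    int main(void)
    {
        float big  = FLT_MAX;          /* about 3.4 × 10^38 */
        float tiny = FLT_MIN;          /* about 1.2 × 10^-38, smallest normalized float */

        float over  = big * 2.0f;      /* exponent too large: overflows to infinity */
        float under = tiny / 1.0e10f;  /* exponent far too small: underflows to zero */

        printf("overflow:  %g\n", over);   /* typically prints "inf" */
        printf("underflow: %g\n", under);  /* typically prints "0"   */
        return 0;
    }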

Floating point addition Five steps to add two floating point numbers:
    1. Express them with the same exponent (denormalize)
    2. Add the mantissas
    3. Adjust the mantissa to one digit/bit before the point (renormalize)
    4. Round or truncate to the required precision
    5. Check for overflow/underflow

Floating point addition example x = 9.876 × 10^7 y = 1.357 × 10^6

Floating point addition example 1. Same exponents: x = 9.876 × 10^7 y = 0.1357 × 10^7

Floating point addition example 2. Add mantissas: x = 9.876 × 10^7 y = 0.1357 × 10^7 x+y = 10.0117 × 10^7

Floating point addition example 3. Renormalize sum: x = 9.876 × 10^7 y = 0.1357 × 10^7 x+y = 1.00117 × 10^8

Floating point addition example 4. Truncate or round: x = 9.876 × 10^7 y = 0.1357 × 10^7 x+y = 1.001 × 10^8

Floating point addition example 5. Check overflow and underflow: x = 9.876 × 10^7 y = 0.1357 × 10^7 x+y = 1.001 × 10^8

Floating point addition example 2 x = 3.506 × 10^-5 y = -3.497 × 10^-5

Floating point addition example 2 1. Same exponents: x = 3.506 × 10^-5 y = -3.497 × 10^-5

Floating point addition example 2 2. Add mantissas: x = 3.506 × 10^-5 y = -3.497 × 10^-5 x+y = 0.009 × 10^-5

Floating point addition example 2 3. Renormalize sum: x = 3.506 × 10^-5 y = -3.497 × 10^-5 x+y = 9.000 × 10^-8

Floating point addition example 2 4. Truncate or round: x = 3.506 × 10^-5 y = -3.497 × 10^-5 x+y = 9.000 × 10^-8 (no change)

Floating point addition example 2 5. Check overflow and underflow: x = 3.506 × 10^-5 y = -3.497 × 10^-5 x+y = 9.000 × 10^-8

Floating point addition example 2 Question: should we believe these zeroes? x = 3.506 × 10^-5 y = -3.497 × 10^-5 x+y = 9.000 × 10^-8
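
The five steps are easy to mimic in code. The sketch below (my own illustration; round4 and add4 are hypothetical helpers, not standard functions) simulates arithmetic that keeps only four significant decimal digits, and reproduces both examples:

    #include <math.h>
    #include <stdio.h>

    /* Round x to 4 significant decimal digits, mimicking a 4-digit mantissa. */
    static double round4(double x)
    {
        if (x == 0.0) return 0.0;
        double e     = floor(log10(fabs(x)));   /* exponent of the leading digit */
        double scale = pow(10.0, 3 - e);        /* shift so 4 digits sit before the point */
        return round(x * scale) / scale;
    }

    /* Add two 4-digit numbers: add, then renormalize and round back to 4 digits. */
    static double add4(double x, double y)
    {
        return round4(round4(x) + round4(y));
    }

    int main(void)
    {
        /* Example 1: 9.876e7 + 1.357e6 gives 1.001e8 after rounding */
        printf("%.4g\n", add4(9.876e7, 1.357e6));

        /* Example 2: 3.506e-5 + (-3.497e-5) gives 9e-8, but the trailing zeroes
           are an artifact of cancellation, not real precision. */
        printf("%.4g\n", add4(3.506e-5, -3.497e-5));
        return 0;
    }

The second comment hints at the answer to the question above: after the cancellation, only the leading 9 is meaningful.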

Floating point multiplication Five steps to multiply two floating point numbers:
    1. Multiply mantissas
    2. Add exponents
    3. Renormalize mantissa
    4. Round or truncate to the required precision
    5. Check for overflow/underflow

Floating point multiplication example x = 9.001 × 10^5 y = 8.001 × 10^-3

Floating point multiplication example 1&2. Multiply mantissas/add exponents: x = 9.001 × 10^5 y = 8.001 × 10^-3 x × y = 72.017001 × 10^2

Floating point multiplication example 3. Renormalize product: x = 9.001 × 10^5 y = 8.001 × 10^-3 x × y = 7.2017001 × 10^3

Floating point multiplication example 4. Truncate or round: x = 9.001 × 10^5 y = 8.001 × 10^-3 x × y = 7.201 × 10^3 (if we truncate) or 7.202 × 10^3 (if we round)

Floating point multiplication example 5. Check overflow and underflow: x = 9.001 × 10^5 y = 8.001 × 10^-3 x × y = 7.202 × 10^3
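
A corresponding sketch for multiplication (again my own illustration, reusing the hypothetical round4 helper; steps 1-3 happen implicitly in the double multiplication, step 4 is the final rounding):

    #include <math.h>
    #include <stdio.h>

    /* Keep 4 significant decimal digits, as in the worked example. */
    static double round4(double x)
    {
        if (x == 0.0) return 0.0;
        double e     = floor(log10(fabs(x)));
        double scale = pow(10.0, 3 - e);
        return round(x * scale) / scale;
    }

    int main(void)
    {
        double x = 9.001e5, y = 8.001e-3;
        /* 9.001 * 8.001 = 72.017001; exponents 5 + (-3) = 2; renormalize and round */
        printf("%.4g\n", round4(x * y));   /* prints 7202, i.e. 7.202 × 10^3 */
        return 0;
    }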

Limitations Floating-point representations only approximate real numbers. The normal laws of arithmetic don't always hold (even less often than for integer representations). For example, associativity is not guaranteed:

Limitations x = 3.002 × 10^3 y = -3.000 × 10^3 z = 6.531 × 10^0

Limitations Grouping one way: x+y = 2.000 × 10^0, so (x+y)+z = 8.531 × 10^0

Limitations Grouping the other way: y+z = -2.993 × 10^3, so x+(y+z) = 0.009 × 10^3 = 9.000 × 10^0

Limitations So (x+y)+z = 8.531 × 10^0 but x+(y+z) = 9.000 × 10^0: the two groupings give different answers.
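
The same failure is easy to reproduce with real floats. A small sketch (the exact outputs assume ordinary IEEE 754 single precision):

    #include <stdio.h>

    int main(void)
    {
        float x = 1.0e8f;    /* large enough that adding 1.0f to it has no effect */
        float y = -1.0e8f;
        float z = 1.0f;

        float a = (x + y) + z;   /* (0) + 1 = 1 */
        float b = x + (y + z);   /* -1e8 + 1 rounds back to -1e8, so the sum is 0 */

        printf("(x+y)+z = %g\n", a);   /* typically prints 1 */
        printf("x+(y+z) = %g\n", b);   /* typically prints 0 */
        return 0;
    }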

Limitations Consider the other laws of arithmetic:
    Commutativity (additive and multiplicative)
    Associativity
    Distributivity
    Identity (additive and multiplicative)
Spend some time working out which ones (if any!) always hold for floating-point numbers.

Reading (for the very keen) Goldberg, D., "What Every Computer Scientist Should Know About Floating-Point Arithmetic", ACM Computing Surveys, Vol. 23, No. 1, March 1991.