
Presentation transcript:


12.1 Rounding Modes

Rounding: the process of obtaining the best possible floating-point representation for a given real value. ANSI/IEEE standard: round to the nearest floating-point number; when the value lies exactly midway between two adjacent floating-point numbers, round to the one whose significand has an LSB of 0 (of two adjacent floating-point numbers, the significand of one must end in 0 and the other in 1). This is called round-to-nearest-even. For example, when rounding to integers, 3.5 and 4.5 are both rounded to 4, the closest even number.
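The tie-breaking rule above can be sketched on an integer significand. This is only an illustrative sketch: the function name `round_to_nearest_even` and the idea of passing the significand as a Python integer with `drop_bits` low-order bits to discard are assumptions made for the example, not part of the slides.

```python
def round_to_nearest_even(significand: int, drop_bits: int) -> int:
    """Drop the low `drop_bits` bits of a non-negative significand,
    rounding to nearest and breaking ties toward an even (LSB = 0) result."""
    if drop_bits <= 0:
        return significand
    kept = significand >> drop_bits                  # bits that survive
    remainder = significand & ((1 << drop_bits) - 1) # bits being discarded
    half = 1 << (drop_bits - 1)                      # exactly midway
    if remainder > half or (remainder == half and kept & 1):
        kept += 1                                    # round up; a tie goes to the even neighbor
    return kept


# 3.5 is 0b11.1 and 4.5 is 0b100.1; dropping the one fractional bit gives 4 in both cases.
print(round_to_nearest_even(0b111, 1), round_to_nearest_even(0b1001, 1))
```

Python's built-in `round` applies the same tie rule to floats, so `round(3.5)` and `round(4.5)` both give 4.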

Other rounding methods
– Round inward (toward 0): choose the one of the two adjacent values that is closer to 0.
– Round upward (toward +∞): choose the larger of the two possible values.
– Round downward (toward -∞): choose the smaller of the two possible values.
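As a rough illustration of the three directed modes, the sketch below rounds to integers rather than to floating-point numbers, using Python's standard `math` functions. The helper name `directed_roundings` is made up for this example.

```python
import math


def directed_roundings(x: float) -> dict:
    """Return x rounded to an integer under each directed mode."""
    return {
        "inward": math.trunc(x),   # toward 0
        "upward": math.ceil(x),    # toward +infinity
        "downward": math.floor(x), # toward -infinity
    }


for value in (2.7, -2.7):
    print(value, directed_roundings(value))
```

For positive values inward and downward rounding agree; for negative values inward and upward rounding agree.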

Example 12.1 Rounding to the nearest integer
a. Consider the rounded even integer corresponding to a real signed-magnitude number x as a function rtnei(x). Plot this round-to-nearest-even-integer function for x in the range [-4, 4].
b. Repeat part a for the function rtni(x), that is, the round-to-nearest-integer function, where the midway values are always rounded up.
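The two functions in Example 12.1 can be tabulated instead of plotted. The sketch below is one possible reading: `rtnei` uses Python's built-in `round` (which breaks ties toward the even integer), and `rtni` assumes that "rounded up" means toward +∞, which is an interpretation rather than something the slide states.

```python
import math


def rtnei(x: float) -> int:
    """Round to nearest integer, ties to the even integer."""
    return round(x)


def rtni(x: float) -> int:
    """Round to nearest integer, ties rounded up (assumed: toward +infinity)."""
    return math.floor(x + 0.5)


# Sample [-4, 4] in steps of 0.5 so that every midway value is included.
for i in range(-8, 9):
    x = i / 2
    print(f"{x:5.1f}  rtnei={rtnei(x):3d}  rtni={rtni(x):3d}")
```

The two columns differ only at midway values whose lower neighbor is even, such as 0.5, 2.5, and -1.5.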


Example 12.2 Directed rounding
a. Consider the inward-directed round corresponding to a real signed-magnitude number x as a function ritni(x). Plot this round-inward-to-nearest-integer function for x in the range [-4, 4].
b. Repeat part a for the round-upward-to-nearest-integer function rutni(x).

Figure 12.3 Two directed round-to-nearest-integer functions for x in [-4, 4].

Special Values and Exceptions

Five special values in the ANSI/IEEE floating-point standard:
– ±0: biased exponent = 0, significand = 0 (no hidden 1)
– ±∞: biased exponent = 255 (short) or 2047 (long), significand = 0
– NaN: biased exponent = 255 (short) or 2047 (long), significand ≠ 0
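The encodings listed above can be checked by unpacking the bits of a short (32-bit) value. This is a minimal sketch using Python's standard `struct` module; the function name `classify_short` and the returned labels are illustrative choices, and everything that is not one of the five special values is simply reported as "other".

```python
import struct


def classify_short(x: float) -> str:
    """Classify x by the exponent and fraction fields of its 32-bit encoding."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = "-" if bits >> 31 else "+"
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23-bit stored significand field
    if exponent == 0 and fraction == 0:
        return sign + "0"
    if exponent == 255 and fraction == 0:
        return sign + "inf"
    if exponent == 255:
        return "NaN"
    return "other"


for value in (0.0, -0.0, float("inf"), float("-inf"), float("nan"), 1.5):
    print(repr(value), "->", classify_short(value))
```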

12.3 Floating-Point Addition

Consider the addition of $\pm 2^{e1} s1$ and $\pm 2^{e2} s2$, where $e1 > e2$:

$$(\pm 2^{e1} s1) + (\pm 2^{e2} s2) = \pm 2^{e1} \left( s1 \pm \frac{s2}{2^{e1-e2}} \right)$$
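The alignment step in the addition formula can be sketched with exact rational arithmetic. The function `fp_add` below is a hypothetical helper: it carries the sign inside the significand, uses `fractions.Fraction` so that no rounding occurs, and leaves out the normalization and rounding that a real adder performs afterwards.

```python
from fractions import Fraction


def fp_add(e1: int, s1: Fraction, e2: int, s2: Fraction):
    """Add 2**e1 * s1 and 2**e2 * s2 by aligning to the larger exponent.

    Returns (exponent, significand) before normalization and rounding.
    """
    if e1 < e2:                                # make operand 1 the larger-exponent one
        e1, s1, e2, s2 = e2, s2, e1, s1
    aligned = Fraction(s2) / 2 ** (e1 - e2)    # shift s2 right by e1 - e2 positions
    return e1, Fraction(s1) + aligned


exponent, significand = fp_add(3, Fraction(3, 2), 1, Fraction(5, 4))
print(exponent, significand, 2 ** exponent * significand)  # value is 12 + 2.5
```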


Figure 12.5 Simplified schematic of a floating-point adder.

Other Floating-Point Operations

Multiplication of $\pm 2^{e1} s1$ and $\pm 2^{e2} s2$:

$$(\pm 2^{e1} s1) \times (\pm 2^{e2} s2) = \pm 2^{e1+e2} (s1 \times s2)$$

Division of $\pm 2^{e1} s1$ by $\pm 2^{e2} s2$:

$$(\pm 2^{e1} s1) / (\pm 2^{e2} s2) = \pm 2^{e1-e2} (s1 / s2)$$
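The two identities can be checked numerically with exact rationals. In this sketch `fp_mul` and `fp_div` are hypothetical helper names, signs are carried in the significands, and postnormalization and rounding are omitted.

```python
from fractions import Fraction


def fp_mul(e1: int, s1: Fraction, e2: int, s2: Fraction):
    """Multiply: exponents add, significands multiply."""
    return e1 + e2, Fraction(s1) * Fraction(s2)


def fp_div(e1: int, s1: Fraction, e2: int, s2: Fraction):
    """Divide: exponents subtract, significands divide."""
    return e1 - e2, Fraction(s1) / Fraction(s2)


def value(e: int, s: Fraction) -> Fraction:
    """Exact value of 2**e * s, including negative exponents."""
    return s * Fraction(2) ** e


a = (3, Fraction(3, 2))    # 12
b = (-1, Fraction(5, 4))   # 0.625
print(value(*fp_mul(*a, *b)), value(*fp_div(*a, *b)))
```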

Figure 12.6 Simplified schematic of a floating-point multiply/divide unit.

Floating-Point Instructions

Figure 12.7 The common floating-point instruction format for MiniMIPS and components for arithmetic instructions. The extension (ex) field distinguishes single (* = s) from double (* = d) operands.

10 floating-point arithmetic instructions (5 different operations: add, subtract, multiply, divide, negate; each for single and double operands)

    add.s $f0,$f8,$f10    # set $f0 to ($f8)+($f10)
    add.d $f0,$f8,$f10    # set ($f0,$f1) to ($f8,$f9)+($f10,$f11)

Single operands can be in any of the floating-point registers. Double operands must be specified in even-numbered registers.

Figure 12.8 Floating-point instructions for format conversion in MiniMIPS.

6 format conversion instructions: integer to single/double, single to double, double to single, and single/double to integer

    cvt.s.w $f0,$f8    # set $f0 to single(integer $f8)
    cvt.d.w $f0,$f8    # set $f0 to double(integer $f8)
    cvt.d.s $f0,$f8    # set $f0 to double($f8)
    cvt.s.d $f0,$f8    # set $f0 to single($f8,$f9)
    cvt.w.s $f0,$f8    # set $f0 to integer($f8)
    cvt.w.d $f0,$f8    # set $f0 to integer($f8,$f9)

Figure 12.9 Instructions for floating-point data movement in MiniMIPS.

6 data transfer instructions: load/store word to/from coprocessor 1, move single/double from one FP register to another, move (copy) between FP registers and CPU general registers

    lwc1  $f8,40($s3)    # load mem[40+($s3)] into $f8
    swc1  $f8,A($s3)     # store ($f8) into mem[A+($s3)]
    mov.s $f0,$f8        # load $f0 with ($f8)
    mov.d $f0,$f8        # load $f0,$f1 with ($f8,$f9)
    mfc1  $t0,$f12       # load $t0 with ($f12)
    mtc1  $f8,$t4        # load $f8 with ($t4)

Figure: Floating-point branch and comparison instructions in MiniMIPS.

2 branch and 6 comparison instructions. The FP unit has a flag that is set to T or F based on 6 comparisons (equal, less than, or less than or equal, for the single or double data type).

    bc1t   L           # branch on FP flag true
    bc1f   L           # branch on FP flag false
    c.eq.* $f0,$f8     # if ($f0)=($f8), set flag to true
    c.lt.* $f0,$f8     # if ($f0)<($f8), set flag to true
    c.le.* $f0,$f8     # if ($f0)≤($f8), set flag to true

Table 12.1 The 30 MiniMIPS floating-point instructions. Because the op field contains 17 for all but two of the instructions (49 for lwc1 and 50 for swc1), it is not shown.

Result Precision and Errors

FP arithmetic can be quite dangerous and must be used with proper care, because results of FP computations are inexact. Why?
– Many real numbers do not have an exact binary representation within a finite word format. This is referred to as representation error.
– Even for values that are exactly representable, FP arithmetic can produce inexact results. For example, the product of 2 short FP numbers has a 48-bit significand that must be rounded to 23 bits (plus the hidden 1). This is called computation error.
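Both kinds of error can be observed directly. The sketch below uses Python floats, which are the long (64-bit) format rather than the short format mentioned above, so the significand widths differ but the effect is the same. The helper names `representation_error` and `product_error` are made up for this example.

```python
from fractions import Fraction


def representation_error(decimal_text: str) -> Fraction:
    """Stored float value minus the exact decimal value it was meant to hold."""
    return Fraction(float(decimal_text)) - Fraction(decimal_text)


def product_error(x: float, y: float) -> Fraction:
    """Rounded float product minus the exact product of the two stored values."""
    return Fraction(x * y) - Fraction(x) * Fraction(y)


# 0.1 has no finite binary expansion, while 0.5 does.
print(representation_error("0.1") != 0, representation_error("0.5") == 0)

# 1 + 2**-30 is exactly representable, but its square needs more than 53 bits.
x = 1 + 2 ** -30
print(product_error(x, x))
```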

Example: The associative law of addition does not hold in general in FP arithmetic. For example, with a = -2^5 × ( ), b = 2^5 × ( ), c = -2^-2 × ( ): is (a+b)+c = a+(b+c)?
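A small experiment shows the failure of associativity. The operand values below are chosen for this sketch and are not the ones from the slide; they use Python's 64-bit floats, where 2^53 + 1 is not representable and rounds back to 2^53.

```python
def sum_left(a: float, b: float, c: float) -> float:
    """Evaluate (a + b) + c."""
    return (a + b) + c


def sum_right(a: float, b: float, c: float) -> float:
    """Evaluate a + (b + c)."""
    return a + (b + c)


a, b, c = -2.0 ** 53, 2.0 ** 53, 1.0  # illustrative values, not from the slide
print(sum_left(a, b, c), sum_right(a, b, c))
```

In the first order a and b cancel exactly before c is added; in the second, c is absorbed by b during rounding and the final result loses it.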

Figure: Algebraically equivalent computations may yield different results with floating-point arithmetic.

Using guard digits to avoid excessive error. For example, in a 10-digit calculator, 1/3 is represented as 0.3333333333; multiplying by 3 results in 0.9999999999, not 1. However, in a calculator with 2 guard digits, 1/3 is represented as 0.333333333333 but still displayed as 0.3333333333; multiplying by 3 results in 0.999999999999, which is displayed as 1 after rounding to 10 digits.
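The calculator behavior can be imitated with Python's standard `decimal` module. This is a sketch under stated assumptions: the function name `third_times_three` is hypothetical, the calculator is modeled as a decimal context with a fixed number of working digits, and results are rounded to 10 digits only for display.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN


def third_times_three(working_digits: int, display_digits: int = 10):
    """Compute 1/3 and (1/3)*3 with `working_digits` of precision,
    then round both to `display_digits` as a calculator display would."""
    work = Context(prec=working_digits, rounding=ROUND_HALF_EVEN)
    display = Context(prec=display_digits, rounding=ROUND_HALF_EVEN)
    third = work.divide(Decimal(1), Decimal(3))
    product = work.multiply(third, Decimal(3))
    return display.plus(third), display.plus(product)


print(third_times_three(10))  # no guard digits
print(third_times_three(12))  # 2 guard digits
```

With 12 working digits the product 0.999999999999 rounds to 1 on the 10-digit display, while with 10 working digits it stays at 0.9999999999.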

Figure: Function evaluation by table lookup and linear interpolation.