(Part 3: Floating-Point Arithmetic)

Presentation transcript:

SERIAL PROCESSORS MACHINE LAYER (Part 3: Floating-Point Arithmetic). Yavuz Oruç, Sabatech Corporation & University of Maryland. © All rights reserved 2005.

Floating-Point Numbers

Informally speaking, a floating-point number is a number that can be moved on a scale by adjusting the location of its radix point. Compared to fixed-point number systems, floating-point number systems permit computers to process a very large range of numbers within a common framework of arithmetic operations that approximates real arithmetic. One of the most commonly used floating-point representations is scientific notation. In this representation, a floating-point number x is written as a product of a number m, called the mantissa or significand, and an integer power of a base:

x = ± m × B^e,

where B is an integer greater than or equal to 2, called the base, and the integer e is called the exponent.

Floating-Point Numbers (Cont’d)

Example. The following are all floating-point numbers:

123.765 × 10^0
23.4 × 8^2
101.110 × 2^-3

It is easy to see that there are many choices of m and e that represent the same floating-point number. For example, when the base B = 10, the first number above can be expressed as

1.23765 × 10^2 = 1237650 × 10^-4 = 123.765 × 10^0 = 12.3765 × 10^1 = …, etc.
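To make the non-uniqueness concrete, here is a minimal sketch (not from the original slides; plain Python floats stand in for exact values) that prints several (m, e) pairs for the same number:

```python
# A minimal sketch: the same value admits many (mantissa, exponent)
# pairs; here, 123.765 in base 10.
B, x = 10, 123.765
for e in range(-4, 3):
    m = x / B**e
    print(f"{m} x {B}^{e}")  # each pair (m, e) represents the same x
```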

Floating-Point Numbers (Cont’d)

When one of these representations is selected such that the mantissa lies between 1 and the base, i.e., 1 ≤ m < B, the number is said to be normalized, and the corresponding representation is called a normalized floating-point number. Although a number has infinitely many floating-point representations, it has only one normalized floating-point representation. The normalized representations of the numbers in the example above are

1.23765 × 10^2
2.34 × 8^3
1.01110 × 2^-1

Nearly all modern computers use the normalized scientific representation with base 2, i.e., the notation x = ± m × 2^e, where 1 ≤ m < 2.
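As an illustration, the following minimal sketch (an assumption of this text, not part of the slides) normalizes a positive value to base 2 using Python's math.frexp:

```python
import math

# A minimal sketch (plain Python floats stand in for the machine format):
# normalize x so that x = m * 2**e with 1 <= m < 2.
def normalize(x: float) -> tuple[float, int]:
    f, e = math.frexp(abs(x))   # frexp gives x = f * 2**e with 0.5 <= f < 1
    return f * 2, e - 1         # rescale so the mantissa lies in [1, 2)

m, e = normalize(142.275)
print(m, e)                     # ~1.11152, 7  (142.275 = m * 2**7)
```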

Machine Representation of Floating-Point Numbers

Floating-point numbers are stored in a computer by their sign, mantissa, and exponent. The number of bits allocated to the mantissa and exponent, and the exact way these terms are stored, vary from one system to another. Here we describe one of the most commonly used representations, which uses a hidden bit and a biased exponent. The three fields in this representation are stored as

S | E | M   (Sign | Exponent | Mantissa)

where S is a 1-bit sign, and E and M are k-bit and p-bit numbers that represent e and m, respectively. A floating-point number in this representation is stored as a sign-magnitude number: it is positive when S = 0 and negative when S = 1.

Machine Representation of Floating-Point Numbers (Cont’d)

The true exponent, e, is found by subtracting a fixed number, called the bias, from E. For a k-bit exponent, the bias is 2^(k-1) − 1, and the true exponent and E are related by

e = E − (2^(k-1) − 1).

For a k-bit exponent, this mapping from the bit representation to a true exponent carries the domain of unsigned k-bit numbers {0, 1, …, 2^k − 1} onto the set of signed exponents {−2^(k-1) + 1, −2^(k-1) + 2, …, 0, 1, …, 2^(k-1)}. This translation from a true exponent to a biased exponent makes the lexicographic ordering of the binary-tuple representations of floating-point numbers consistent with their decimal values.
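A small sketch of the bias arithmetic (hypothetical helper names; k = 8 chosen to match the later example):

```python
# A minimal sketch: converting between the stored (biased) exponent E
# and the true exponent e for a k-bit exponent field.
def bias(k: int) -> int:
    return 2**(k - 1) - 1

def true_exponent(E: int, k: int) -> int:
    return E - bias(k)                   # e = E - (2**(k-1) - 1)

def biased_exponent(e: int, k: int) -> int:
    return e + bias(k)                   # inverse mapping

k = 8
print(bias(k))                           # 127
print(true_exponent(0b10000110, k))      # 7, as in the example below
```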

Machine Representation of Floating-Point Numbers (Cont’d)

The mantissa of a floating-point number is stored in M except for its first digit. Since 1 ≤ m < 2 in binary representations, the digit to the left of the radix point in the mantissa is always 1. Therefore, it does not have to be stored, although a 1 is always inserted as the first digit of the mantissa during floating-point operations. This first digit is called the hidden bit. If the p-bit number M is given by

M = m_1 m_2 … m_p,

then the mantissa with the hidden bit is given by

m = 1.m_1 m_2 … m_p = 1 + Σ_{i=1}^{p} m_i 2^(-i).
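The hidden-bit convention is easy to express in code; the sketch below (a hypothetical helper, with the field M taken as an unsigned integer) recovers m from a stored mantissa field:

```python
# A minimal sketch: the value of the stored p-bit mantissa field M,
# interpreted with the hidden leading 1.
def mantissa_value(M: int, p: int) -> float:
    return 1 + M / 2**p                  # m = 1.m1 m2 ... mp

print(mantissa_value(0b0001110010001100, 16))  # ~1.11151, see the example below
```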

Machine Representation of Floating-Point Numbers (Cont’d)

Example. Consider the decimal floating-point number (−142.275)₁₀. Converting the magnitude into binary with a 16-bit mantissa, we have (142.275)₁₀ ≈ (10001110.01000110)₂, and normalizing the binary number, we have (142.275)₁₀ ≈ (1.000111001000110)₂ × 2^7. Hence, with p = 16 and an 8-bit biased exponent (E = 7 + 127 = 134), this number is represented internally in a computer as

S = 1, E = 10000110, M = 0001 1100 1000 1100
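The example can be checked mechanically. The following minimal sketch (assuming the p = 16, k = 8 format above, truncation instead of rounding, and no handling of zero or special values) encodes a decimal value into the S, E, and M fields:

```python
import math

# A minimal sketch (assumptions: p = 16 mantissa bits, k = 8 exponent bits,
# truncation rather than rounding, no zero or special-value handling).
def encode(x: float, p: int = 16, k: int = 8) -> tuple[int, int, int]:
    S = 1 if x < 0 else 0
    f, e = math.frexp(abs(x))            # abs(x) = f * 2**e, 0.5 <= f < 1
    m, e = f * 2, e - 1                  # renormalize so 1 <= m < 2
    E = e + (2**(k - 1) - 1)             # biased exponent
    M = int((m - 1) * 2**p)              # drop the hidden bit, keep p bits
    return S, E, M

S, E, M = encode(-142.275)
print(S, format(E, "08b"), format(M, "016b"))
# -> 1 10000110 0001110010001100
```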

Machine Representation of Floating-Point Numbers (Cont’d)

[Figure: floating-point numbers in the normalized representation with p = k = 2.]

Machine Representation of Floating-Point Numbers (Cont’d)

Remark. One problem with the normalized floating-point number system is that there is no way to normalize zero, since all of its bits are identically 0. Instead, we adopt the convention that both of the (1 + k + p)-bit representations shown below represent 0:

S = 0, E = 000…0, M = 000000000000…0
S = 1, E = 000…0, M = 000000000000…0

More generally, floating-point numbers with a zero exponent field are called de-normalized floating-point numbers, and their values are computed by setting the hidden bit to 0.

Precision and Mantissa Range of a Floating-Point Number System

When we fix the values of p and k, the number of floating-point numbers that can be represented is also fixed: a p-bit normalized mantissa with a 1-bit sign and a k-bit biased exponent gives 2^(p+k+1) floating-point numbers in all. The smallest differential mantissa, i.e., 2^(-p), in a floating-point representation is called the precision of the representation. The mantissas of any two numbers in a floating-point number system with a p-bit mantissa, normalized or not, cannot be closer than its precision, i.e., |x − y| ≥ 2^(-p) for any two distinct mantissas x and y expressible in such a number system. The normalization of floating-point numbers further constrains the representation by introducing a gap between 0 and the rest of the numbers. The mantissa range of a floating-point number system is the interval of mantissas that can be represented in that floating-point number system.
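To see these counts and the gap around zero concretely, the sketch below (assuming the normalized format with no de-normalized numbers) enumerates every representable value for the small p = k = 2 system pictured earlier:

```python
# A minimal sketch: enumerate all 2**(p+k+1) normalized values for small
# p and k (no de-normalized numbers, so zero is not representable here).
def all_values(p: int, k: int):
    bias = 2**(k - 1) - 1
    for S in (0, 1):
        for E in range(2**k):
            for M in range(2**p):
                m = 1 + M / 2**p                 # mantissa with hidden bit
                yield (-1)**S * m * 2.0**(E - bias)

vals = sorted(all_values(p=2, k=2))
print(len(vals))   # 32 = 2**(2+2+1)
print(vals)        # note the gap around 0: no value between -0.5 and 0.5
```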

Machine Representation of Floating-Point Numbers (Cont’d)

This contrast between a normalized and an un-normalized representation is illustrated in the table below. The shaded area in the normalized scale shows that there is a set of floating-point numbers in the neighborhood of 0 which are included in the un-normalized floating-point number system but cannot be represented in the normalized floating-point number system.

[Table: normalized versus un-normalized number scales.]

Single and Double Precision Floating-Point Numbers

By convention, a floating-point number is called a single precision or double precision number based on either (1) the number of bits it uses or (2) the number of registers it occupies. We will adopt the latter convention for classifying a floating-point number as a single or double precision number. Thus, in single precision, each register in the register file of a processor represents a floating-point number by itself. If the processor supports 8-, 16-, and 32-bit registers, it is possible to define 8-bit, 16-bit, and 32-bit floating-point numbers in single precision. In double precision, pairs of registers, typically with adjacent indices and starting with R0, represent a single floating-point number. For example, in a register file with four registers R0, R1, R2, R3, each of the pairs (R0,R1), (R1,R2), (R2,R3), and (R3,R0) represents a floating-point number. The higher-order bits of the number are stored in the first register of the pair, and the lower-order bits in the second. Using this convention, it is thus possible to deal with 16-bit, 32-bit, and 64-bit floating-point numbers in double precision.

Single and Double Precision Floating-Point Numbers (Cont’d)

For a given number of bits, the allocation of bits between the mantissa and exponent sections of a floating-point representation is kept the same whether one or two registers are used. However, in single precision all bits are stored in a single register, while in double precision they are divided between two registers. For example, the single precision representation with a single 32-bit register and the double precision representation with two 16-bit registers have the same bit allocation.

Single and Double Precision Floating-Point Numbers (Cont’d)

The bit allocations for the 32-bit and 64-bit cases correspond to the IEEE-754 single and double precision floating-point representations. In IEEE-754, it is the number of bits in the representation that is used, by convention, to classify a floating-point number as single or double precision.

Representation size | Sign | Exponent | Mantissa
8                   | 1    | 2        | 5
16                  | 1    | 4        | 11
32                  | 1    | 8        | 23
64                  | 1    | 11       | 52

Typical floating-point representations.

Single and Double Precision Floating-Point Numbers (Cont’d)

The most positive and most negative finite representations in the IEEE-754 single precision floating-point format (32-bit numbers) are

S = 0, E = 11111110, M = 111…1  →  +(2 − 2^(-23)) × 2^127 ≈ +3.4 × 10^38
S = 1, E = 11111110, M = 111…1  →  −(2 − 2^(-23)) × 2^127 ≈ −3.4 × 10^38

Representation of Infinity and Other Exceptional Numbers

The above formulas can be used to find the representation of most numbers. However, some representations are reserved for special cases, such as zero, infinity, and not-a-number (NaN). The last denotes the outcome of undefined operations such as 0/0 and 0 × ∞. We already saw that the normalized floating-point formulas above cannot represent zero, because of the hidden bit, without a special interpretation of the numbers with the most negative exponent. The other special cases are infinity and NaN. These are represented by setting the exponent field to its maximum (all 1's). Therefore, when E = 2^k − 1, or equivalently e = 2^(k-1), the representation corresponds to either infinity or NaN. The mantissa distinguishes between the two: if M = 0 and E = 2^k − 1, the representation is infinity; if M ≠ 0 and E = 2^k − 1, the representation is NaN.

Representation of Infinity and Other Exceptional Numbers (Cont’d)

In all of the special cases, the sign bit is used to distinguish between positive and negative versions of these numbers, i.e., +0, −0, +∞, −∞, +NaN, −NaN. The NaNs are further refined into quiet NaNs (QNaNs) and signaling NaNs (SNaNs). The QNaNs are designated by setting the most significant bit of the mantissa, and the SNaNs by clearing that bit. The QNaNs can be viewed as NaNs that can be tolerated during the course of a floating-point computation, whereas SNaNs force the processor to signal an invalid operation, as in the case of division of 0 by 0.

Representation of Infinity and Other Exceptional Numbers (Cont’d)

Example. In the 32-bit format, the first two numbers below represent +∞ and −∞, respectively, and the last two represent NaNs:

0 11111111 00000000000000000000000  (+∞)
1 11111111 00000000000000000000000  (−∞)
0 11111111 10000000000000000000000  (a QNaN)
0 11111111 00000000000000000000001  (an SNaN)
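A minimal decoding sketch (assuming the k = 8, p = 23 layout, with a zero exponent field handled by clearing the hidden bit as described earlier, and an all-ones exponent field mapped to infinity or NaN) ties the ordinary and exceptional cases together:

```python
import math

# A minimal sketch (assumptions: k = 8, p = 23, hidden bit, biased exponent,
# zero exponent field -> hidden bit 0, all-ones exponent -> infinity/NaN).
def decode(S: int, E: int, M: int, k: int = 8, p: int = 23) -> float:
    sign = -1 if S else 1
    bias = 2**(k - 1) - 1
    if E == 2**k - 1:                    # exponent all 1's: special case
        return sign * (math.inf if M == 0 else math.nan)
    hidden = 0 if E == 0 else 1          # de-normalized when E == 0
    return sign * (hidden + M / 2**p) * 2.0**(E - bias)

print(decode(1, 0b10000110, 0b00011100100011000000000))
# -> -142.2734375 (the truncated encoding of -142.275 seen earlier)
print(decode(0, 0b11111111, 0))          # inf
print(decode(0, 0b11111111, 1))          # nan
```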

Mantissas in 2’s Complement Format

Most processors use a sign-magnitude representation for mantissas in floating-point numbers. Instead, one can also use 1's or 2's complement notation, as in fixed-point numbers, to represent signed mantissas. This makes the subtraction of mantissas easier to handle. In the un-normalized 2's complement mantissa representation, the sign of the mantissa is specified by its leading bit: if that bit is 0 the mantissa is positive, and if it is 1 the mantissa is negative.

Mantissas in 2’s Complement Format (Cont’d)

Determining the value of a floating-point number with a 2's complement mantissa is only slightly more complex. If the leading bit of the mantissa is 0, the value of the number is positive and is the same as if its mantissa were expressed in sign-magnitude notation. When the leading bit is 1, the number is negative, and its value is determined by complementing the mantissa's bits and adding 2^(-p) to the result, where p is the number of fractional bits in the mantissa.

Example. Consider the mantissa 101011.01110111 in 2's complement notation. The value of this number is determined as

−(010100.10001000 + 0.00000001)₂ = −(010100.10001001)₂ = −(20.53515625)₁₀.
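The complement-and-add rule can be checked with a short sketch (a hypothetical helper; the bits are written without the radix point, and the number of fractional bits is passed separately):

```python
# A minimal sketch: value of a 2's complement fixed-point mantissa with
# len(bits) total bits, `frac` of them fractional (matches the example above).
def twos_complement_value(bits: str, frac: int) -> float:
    n = int(bits, 2)
    if bits[0] == "1":                   # leading 1: negative number
        n -= 2**len(bits)                # same as complement-and-add-2**(-p)
    return n / 2**frac

print(twos_complement_value("10101101110111", frac=8))  # -20.53515625
```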

Homework Set 5

Problem 1. For each of the numbers below, give its normalized floating-point form in the base in which it is expressed.
(a) (345.21)₈ × 8²  (b) (1000.1110)₂ × 2⁶  (c) (7.8912)₁₀ × 10⁻²  (d) (0.0031112)₄ × 4⁶

Problem 2. For each of the decimal numbers below, show its normalized and un-normalized representations in 8-bit biased exponent and 24-bit sign-magnitude mantissa format.
(a) −4.25  (b) 3.25  (c) 77.777  (d) −45.321

Problem 3. How many floating-point numbers can be written with a 2-bit biased exponent and a 3-bit normalized sign-magnitude mantissa, assuming (a) the numbers are not de-normalized, (b) the numbers are de-normalized for the most negative and most positive exponents? Specify the most positive and most negative numbers, and the precision of the number system, in each case.

Problem 4. Specify the decimal value of each of the floating-point machine numbers below, assuming that the exponent is 8 bits and biased and the mantissa is 23 bits and in 2's complement format.

    Sign | Biased exponent | 2's complement mantissa
(a) 1    | 11111000        | 00001111111111000111000
(b)      | 00001010        | 11100000000000111111111

Problem 5. Develop an algorithm to compare two floating-point numbers (E1, M1) and (E2, M2), where E1 and E2 are k-bit biased exponents and M1 and M2 are (p+1)-bit 2's complement mantissas.