Computer Architecture Lecture Notes Spring 2005 Dr. Michael P. Frank Competency Area 4: Computer Arithmetic.




Introduction

In previous chapters we've discussed:
—Performance (execution time, clock cycles, instructions, MIPS, etc.)
—Abstractions: Instruction Set Architecture; Assembly Language and Machine Language

In this chapter:
—Implementing the Architecture:
–How does the hardware really add, subtract, multiply and divide?
–Signed and unsigned representations
–Constructing an ALU (Arithmetic Logic Unit)

Humans naturally represent numbers in base 10; however, computers work in base 2. Example: (11111111)₂ = 255 as an unsigned number, but −1 as a two's-complement signed number. Note: signed representations include sign-magnitude and two's complement; there is also the one's complement representation.

Possible Representations

Sign Magnitude    One's Complement    Two's Complement
000 = +0          000 = +0            000 =  0
001 = +1          001 = +1            001 = +1
010 = +2          010 = +2            010 = +2
011 = +3          011 = +3            011 = +3
100 = −0          100 = −3            100 = −4
101 = −1          101 = −2            101 = −3
110 = −2          110 = −1            110 = −2
111 = −3          111 = −0            111 = −1

Sign Magnitude (first bit is the sign bit, the others the magnitude)
Two's Complement (negation: invert the bits and add 1)
One's Complement (first bit is the sign bit; invert the other bits for the magnitude)
NOTE: Computers today use two's complement binary representations for signed numbers.
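The three conventions in the table can be checked with a small sketch. The function name `decode3` and the arithmetic shortcuts (subtracting 2³−1 or 2³ when the sign bit is set) are mine, not the lecture's:

```python
def decode3(bits):
    """Decode a 3-bit pattern under the three signed conventions.

    Returns (sign_magnitude, ones_complement, twos_complement).
    """
    u = int(bits, 2)
    sign_mag = -(u & 0b011) if u & 0b100 else u
    ones = u - 7 if u & 0b100 else u   # subtract 2**3 - 1 when the sign bit is set
    twos = u - 8 if u & 0b100 else u   # subtract 2**3     when the sign bit is set
    return sign_mag, ones, twos
```

For example, the pattern 111 decodes to −3, −0, and −1 respectively, matching the last row of the table.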

Two's Complement Representations

32-bit signed numbers (MIPS):
0000 0000 0000 0000 0000 0000 0000 0000₂ =  0₁₀
0000 0000 0000 0000 0000 0000 0000 0001₂ = +1₁₀
0000 0000 0000 0000 0000 0000 0000 0010₂ = +2₁₀
...
0111 1111 1111 1111 1111 1111 1111 1110₂ = +2,147,483,646₁₀
0111 1111 1111 1111 1111 1111 1111 1111₂ = +2,147,483,647₁₀  (maxint)
1000 0000 0000 0000 0000 0000 0000 0000₂ = −2,147,483,648₁₀  (minint)
1000 0000 0000 0000 0000 0000 0000 0001₂ = −2,147,483,647₁₀
1000 0000 0000 0000 0000 0000 0000 0010₂ = −2,147,483,646₁₀
...
1111 1111 1111 1111 1111 1111 1111 1101₂ = −3₁₀
1111 1111 1111 1111 1111 1111 1111 1110₂ = −2₁₀
1111 1111 1111 1111 1111 1111 1111 1111₂ = −1₁₀

The hardware need only test the first bit to determine the sign.
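The 32-bit interpretation rule can be sketched in a few lines (the helper name `to_signed32` is mine): a pattern with the top bit set represents its unsigned value minus 2³².

```python
def to_signed32(u):
    """Interpret a 32-bit unsigned pattern as a two's-complement integer."""
    return u - (1 << 32) if u & (1 << 31) else u
```

Testing the first bit, as the slide notes, is exactly the `u & (1 << 31)` check: all-ones gives −1, and 0x80000000 gives minint.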

Two's Complement Operations

Negating a two's complement number:
–invert all bits and add 1
–or, preserve the rightmost 1 and the 0s to its right, and flip all bits to the left of that rightmost 1

Converting n-bit numbers into m-bit numbers with m > n:
Example: convert a 4-bit signed number into an 8-bit number:
0010 → 0000 0010 (+2₁₀)
1010 → 1111 1010 (−6₁₀)
—"Sign extension" is used: the most significant bit is copied into the upper bits of the new word. For unsigned numbers, the leftmost bits are filled with 0s.
—Example instructions: lbu/lb, slt/sltu, etc.
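Both operations are easy to sketch on raw bit patterns (function names are mine): negation is invert-and-add-1 under a width mask, and sign extension copies the top bit upward.

```python
def negate(value, n):
    """Two's-complement negation of an n-bit value: invert all bits, add 1."""
    mask = (1 << n) - 1
    return (~value + 1) & mask

def sign_extend(value, n, m):
    """Sign-extend an n-bit pattern to m bits by copying the sign bit upward."""
    if (value >> (n - 1)) & 1:                    # most significant (sign) bit set?
        value |= ((1 << (m - n)) - 1) << n        # fill the new upper bits with 1s
    return value
```

The slide's example falls out directly: extending 1010 (−6) to 8 bits yields 1111 1010, while 0010 (+2) stays 0000 0010.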

Addition and Subtraction

Just like in grade school (carry/borrow 1s). Two's complement makes the operations easy:
—subtraction is addition of the negated operand
Overflow (result too large for the finite computer word):
—e.g., adding two n-bit numbers may not yield a result expressible in n bits
—note that the term "overflow" is somewhat misleading: it does not mean a carry "overflowed" out of the word
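The distinction between carry-out and signed overflow can be made concrete (the helper `add32` and the sign-bit formula below are my sketch, not the lecture's hardware): overflow happens when two operands of the same sign produce a result of the opposite sign.

```python
def add32(a, b):
    """Add two 32-bit patterns; report signed (two's-complement) overflow.

    Overflow: operands share a sign bit but the sum's sign bit differs.
    A carry out of bit 31 alone is NOT overflow.
    """
    mask = 0xFFFFFFFF
    s = (a + b) & mask
    overflow = (~(a ^ b) & (a ^ s) & 0x80000000) != 0
    return s, overflow
```

For instance, maxint + 1 overflows, while (−1) + 1 produces a carry out of the top bit yet no overflow.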

32-bit ALU with Zero Detect

Recall that, given the following control lines, we get these functions:
000 = and
001 = or
010 = add
110 = subtract
111 = slt
We've learned how to build each of these functions in hardware.
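The function table above can be sketched behaviorally (this models only the input/output behavior, not the gate-level design; the `slt` branch takes the sign bit of the subtraction, ignoring the overflow corner case the full hardware must handle):

```python
def alu32(ctrl, a, b):
    """Behavioral sketch of the 32-bit ALU function table."""
    mask = 0xFFFFFFFF
    if ctrl == 0b000:
        return a & b
    if ctrl == 0b001:
        return a | b
    if ctrl == 0b010:
        return (a + b) & mask
    if ctrl == 0b110:
        return (a - b) & mask          # add with b inverted, carry-in 1
    if ctrl == 0b111:                  # slt: set result to 1 if a < b
        return 1 if (a - b) & 0x80000000 else 0
    raise ValueError("unsupported control lines")
```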

So far…

We've studied how to implement a 1-bit ALU in hardware that supports the MIPS instruction set:
—key idea: use a multiplexor to select the desired output function
—we can efficiently perform subtraction using two's complement
—we can replicate a 1-bit ALU to produce a 32-bit ALU

Important issues about hardware:
—all of the gates are always working
—the speed of a gate is affected by the number of inputs to the gate
—the speed of a circuit is affected by the number of gates in series (on the "critical path," or the "deepest level of logic")

Changes in hardware organization can improve performance:
—we'll look at examples for addition (the carry-lookahead adder), multiplication, and division

Better adder design

For adder design:
—Problem: the ripple-carry adder is slow due to sequential evaluation of the carry-in/carry-out bits.
Consider the carry-in inputs; each stage computes

cin_{i+1} = (a_i · b_i) + (a_i · cin_i) + (b_i · cin_i)

Using substitution, we can see the "ripple" effect: each cin_{i+1} depends on cin_i, so the carry into the last bit is available only after every earlier stage has been evaluated.

Carry-Lookahead Adder

Faster carry schemes exist that improve the speed of adders in hardware and reduce the complexity of the equations, namely the carry-lookahead adder. Let cin_i represent the ith carry-in bit; then:

cin_{i+1} = (a_i · b_i) + (a_i · cin_i) + (b_i · cin_i) = (a_i · b_i) + (a_i + b_i) · cin_i

We can now define the terms generate and propagate:

g_i = a_i · b_i        p_i = a_i + b_i

Then:

cin_{i+1} = g_i + p_i · cin_i

Carry-Lookahead Adder

Suppose g_i is 1. The adder generates a carry-out independent of the value of the carry-in, i.e., cout = 1. Now suppose g_i is 0 and p_i is 1: then cout = cin, and the adder propagates a carry-in to a carry-out. In summary, cout is 1 if either g_i is 1, or p_i and cin are both 1. This new approach creates the first level of abstraction.

Carry-Lookahead Adder

Sometimes the first level of abstraction produces large equations. It is beneficial then to look at the second level of abstraction, produced by considering a 4-bit adder whose propagate and generate signals are combined at a higher level. We're representing a 16-bit adder, with a "super" propagate signal and a "super" generate signal for each 4-bit group. P_i is true only if each of the bits in the group propagates a carry; for the lowest group:

P₀ = p₃ · p₂ · p₁ · p₀

Carry-Lookahead Adder

For the "super" generate signals, what matters is only whether there is a carry out of the most significant bit of the group — either generated there or generated lower and propagated up; for the lowest group:

G₀ = g₃ + (p₃ · g₂) + (p₃ · p₂ · g₁) + (p₃ · p₂ · p₁ · g₀)

Now we can represent the carry-out signals for the 16-bit adder with two levels of abstraction as:

C₁ = G₀ + P₀ · c₀
C₂ = G₁ + P₁ · C₁
C₃ = G₂ + P₂ · C₂
C₄ = G₃ + P₃ · C₃
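The generate/propagate scheme for one 4-bit group can be sketched as follows (the function name `cla4` is mine; the carries are computed in a loop here for clarity, whereas the hardware evaluates each as a flat two-level expression):

```python
def cla4(a, b, c0):
    """Carry lookahead for a 4-bit group.

    g_i = a_i AND b_i (generate), p_i = a_i OR b_i (propagate),
    cin_{i+1} = g_i + p_i * cin_i.
    Also returns the group's "super" propagate P and generate G.
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]
    c = [c0]
    for i in range(4):
        c.append(g[i] | (p[i] & c[i]))
    P = p[0] & p[1] & p[2] & p[3]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return c, P, G
```

Adding 1111 and 0001 generates a carry in bit 0 that every higher bit propagates, so the group carry-out — and the "super" signals P and G — are all 1.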

Carry-Lookahead Adder Design: 2nd Level of Abstraction

O(log n)-time carry-skip adder (8-bit segment shown). With this structure, we can do a 2ⁿ-bit add in 2(n+1) logic stages. Hardware overhead is less than 2× that of a regular ripple-carry adder. (The figure shows the carry advancing one segment per "carry tick," 1st through 4th.)

Multiplication Algorithms

Recall that multiplication is accomplished via shifting and addition. Example:

      0010     (multiplicand)
    × 0110     (multiplier)
      0000     (multiply by LSB of multiplier)
     0010      (shift multiplicand left 1 bit)
    0010
   0000
   0001100     (product)
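The shift-and-add method above can be sketched directly (the function name `multiply` is mine): on each iteration, test the multiplier's LSB, conditionally add the multiplicand into the product, then shift.

```python
def multiply(multiplicand, multiplier, bits=4):
    """Shift-and-add multiplication: for each multiplier bit, add the
    (left-shifted) multiplicand into the product when the bit is 1."""
    product = 0
    for _ in range(bits):
        if multiplier & 1:             # test LSB of multiplier
            product += multiplicand    # add multiplicand to product
        multiplicand <<= 1             # shift multiplicand left
        multiplier >>= 1               # shift multiplier right
    return product
```

Running it on the worked example, 0010 × 0110 gives 0001100 (2 × 6 = 12).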

Multiplication Algorithm 1 Hardware implementation of Algorithm 1:

Multiplication Algorithm 1

For each bit:
1a. Test the LSB of the multiplier; if it is 1, add the multiplicand to the product and place the result in the Product register.
2. Shift the Multiplicand register left 1 bit.
3. Shift the Multiplier register right 1 bit.

Multiplication Algorithm 1

Example (4-bit): the iteration table steps through four iterations, each applying the steps in order — 1a. test the LSB of the multiplier (adding the multiplicand to the product when it is 1); 2. shift the multiplicand left; 3. shift the multiplier right.

Multiplication Algorithms

For Algorithm 1 we initialize the left half of the Multiplicand register to 0 to accommodate its left shifts, so all adds are 64 bits wide. This is wasteful and slow. Algorithm 2: instead of shifting the multiplicand left, shift the Product register right — this halves the widths of the ALU and the Multiplicand register.

Multiplication Algorithm 2

For each bit:
1a. Test the LSB of the multiplier; if it is 1, add the multiplicand to the left half of the product and place the result in the left half of the Product register.
2. Shift the Product register right 1 bit.
3. Shift the Multiplier register right 1 bit.

Multiplication Algorithm 2

Example (4-bit): the iteration table steps through four iterations, each applying the steps in order — 1a. test the LSB of the multiplier (adding the multiplicand to the left half of the product when it is 1); 2. shift the Product register right; 3. shift the multiplier right.

Multiplication Algorithm 3 The third multiplication algorithm combines the right half of the product with the multiplier. This reduces the number of steps to implement the multiply and it also saves space. Hardware Implementation of Algorithm 3:

Multiplication Algorithm 3

For each bit:
1a. Test the LSB of the product; if it is 1, add the multiplicand to the left half of the product and place the result in the left half of the Product register.
2. Shift the Product register right 1 bit.

Multiplication Algorithm 3

Example (4-bit): the iteration table steps through four iterations, each applying the steps in order — 1a. test the LSB of the product (adding the multiplicand to the left half when it is 1); 2. shift the Product register right.
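Algorithm 3's trick — the multiplier lives in the right half of the Product register and shifts out as product bits shift in — can be sketched as follows (the function name `multiply_v3` is mine):

```python
def multiply_v3(multiplicand, multiplier, bits=4):
    """Multiplication Algorithm 3: the multiplier occupies the right half of
    the Product register, and the whole register shifts right each iteration."""
    product = multiplier                     # right half initially holds the multiplier
    for _ in range(bits):
        if product & 1:                      # 1a. test LSB of the product register
            product += multiplicand << bits  #     add multiplicand into the left half
        product >>= 1                        # 2.  shift the product register right
    return product
```

After `bits` iterations the multiplier bits have all been consumed and the register holds the product: 2 × 6 again yields 12.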

Division Algorithms

Example: conventional long division of a dividend by a divisor produces a quotient and a remainder. Hardware implementations are similar to the multiplication algorithms:
+Algorithm 1 implements the conventional division method
+Algorithm 2 reduces the divisor register and ALU by half
+Algorithm 3 eliminates the quotient register completely
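The conventional method (Algorithm 1's flavor) is restoring division: try subtracting the shifted divisor, restore on a negative result, and record a quotient bit either way. This sketch (the function name `divide` is mine) mirrors that behavior without modeling the registers:

```python
def divide(dividend, divisor, bits=4):
    """Restoring division sketch: for each bit position, attempt to subtract
    the shifted divisor; keep the result and set a quotient bit of 1 only
    when the trial remainder is non-negative."""
    remainder, quotient = dividend, 0
    for i in range(bits - 1, -1, -1):
        trial = remainder - (divisor << i)   # trial subtraction
        quotient <<= 1
        if trial >= 0:                       # subtraction succeeded
            remainder = trial
            quotient |= 1                    # record quotient bit 1
    return quotient, remainder
```

For example, 7 ÷ 2 yields quotient 3 and remainder 1.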

Floating Point Numbers

We need a way to represent:
—numbers with fractions, e.g., 3.1416
—very small numbers, e.g., 0.000000001
—very large numbers, e.g., 3.15576 × 10⁹

Representation:
—sign, exponent, significand: (−1)^sign × significand × 2^exponent
—more bits for the significand gives more accuracy
—more bits for the exponent increases the dynamic range

IEEE 754 floating point standard:
—single precision: 8-bit exponent, 23-bit significand, 1 sign bit
—double precision: 11-bit exponent, 52-bit significand, 1 sign bit

| SIGN | EXPONENT | SIGNIFICAND |
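The three single-precision fields can be pulled out of a real float with the standard library (the helper name `fields` is mine; `struct` round-trips the value through its 32-bit IEEE 754 encoding):

```python
import struct

def fields(x):
    """Split a single-precision IEEE 754 value into (sign, exponent, significand)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF        # 8 bits, stored biased by 127
    significand = bits & 0x7FFFFF         # 23 fraction bits; the leading 1 is implicit
    return sign, exponent, significand
```

For 1.0 this gives sign 0, biased exponent 127, and a zero fraction field.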

IEEE 754 floating-point standard

The leading "1" bit of the significand is implicit. The exponent is "biased" to make sorting easier:
—all 0s is the smallest exponent, all 1s is the largest
—bias of 127 for single precision and 1023 for double precision

Summary: (−1)^sign × (1 + fraction) × 2^(exponent − bias)

Example: −0.75₁₀ = −1.1₂ × 2⁻¹
—Single precision: (−1)¹ × (1 + .1000…) × 2^(126−127)
  1 | 0111 1110 | 100 0000 0000 0000 0000 0000
—Double precision: (−1)¹ × (1 + .1000…) × 2^(1022−1023)
  1 | 011 1111 1110 | 1000 0000 0000 0000 0000 … (32 more 0s)
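The −0.75 encoding can be verified by assembling the fields by hand and letting `struct` interpret the resulting 32-bit pattern as a float:

```python
import struct

# -0.75 = -1.1_2 x 2^-1: sign 1, biased exponent -1 + 127 = 126, fraction .100...0
sign, exponent, fraction = 1, 126, 0b100_0000_0000_0000_0000_0000
bits = (sign << 31) | (exponent << 23) | fraction
(value,) = struct.unpack(">f", struct.pack(">I", bits))
```

The assembled pattern is 0xBF400000, and decoding it recovers −0.75 exactly.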

FP Addition Algorithm

The number with the smaller exponent must be shifted right before adding, so the "binary points" align. After adding, the sum must be normalized; then it is rounded, and possibly re-normalized. Possible errors include:
—overflow (exponent too big)
—underflow (exponent too small)
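The align-add-normalize steps can be sketched on Python floats using `math.frexp`/`math.ldexp` to split and rebuild significand/exponent pairs (the function name `fp_add` is mine; real hardware also handles rounding and exception cases this sketch omits):

```python
import math

def fp_add(x, y):
    """Sketch of FP addition: align the smaller exponent, add the
    significands, then rebuild (ldexp renormalizes the result)."""
    (mx, ex), (my, ey) = math.frexp(x), math.frexp(y)
    if ex < ey:                       # ensure x holds the larger exponent
        (mx, ex), (my, ey) = (my, ey), (mx, ex)
    my = math.ldexp(my, ey - ex)      # shift the smaller significand right
    return math.ldexp(mx + my, ex)    # add, then renormalize
```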

Floating-Point Addition Hardware

Implements the algorithm from the previous slide. Note the high complexity compared with integer addition hardware.

FP Multiplication Algorithm

Add the exponents, adjusting for the bias. Multiply the significands. Normalize, check for overflow/underflow, then round (repeating if necessary). Finally, compute the sign.
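The same `frexp`/`ldexp` trick sketches the multiply steps (the function name `fp_mul` is mine; here the sign falls out of the signed significands rather than being computed separately, and the bias cancels because `frexp` returns unbiased exponents):

```python
import math

def fp_mul(x, y):
    """Sketch of FP multiplication: add the exponents, multiply the
    significands, and renormalize via ldexp."""
    (mx, ex), (my, ey) = math.frexp(x), math.frexp(y)
    return math.ldexp(mx * my, ex + ey)
```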

Ethics Addendum: Intel Pentium FP bug

In July 1994, Intel discovered there was a bug in the Pentium's FP division hardware…
—But they decided not to announce the bug, and went ahead and shipped chips having the flaw anyway, to save time and money.
–Based on their analysis, they thought errors would arise only rarely.
Even after the bug was discovered by users, Intel initially refused to replace the bad chips on request!
—They got a lot of bad PR from this…
Lesson: good, ethical engineers fix problems when they first find them, and don't cover them up!

Summary

Computer arithmetic is constrained by limited precision. Bit patterns have no inherent meaning, but standards do exist:
–two's complement
–IEEE 754 floating point
Computer instructions determine the "meaning" of the bit patterns. Performance and accuracy are important, so there are many complexities in real machines (i.e., in the algorithms and their implementations).
* Please read the remainder of the chapter on your own. However, you will only be responsible for the material covered in the lectures on exams.