Chapter 3 Arithmetic for Computers

Name: Chapter 3 Arithmetic for Computers
Uploaded: 2017-11-30T10:20:11+00:00
Duration: PTM53S6
Description: Chapter 3 Arithmetic for Computers

Chapter 3 Arithmetic for Computers

Taxonomy of Computer Information
Instructions Data Addresses Numeric Non-numeric Fixed-point Floating-point Character (ASCII) Boolean Other Single- precision Double- precision Unsigned (ordinal) Signed Sign-magnitude 2’s complement Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Number Format Considerations
Type of numbers (integer, fraction, real, complex) Range of values between smallest and largest values wider in floating-point formats Precision of values (max. accuracy) usually related to number of bits allocated n bits => represent 2n values/levels Value/weight of least-significant bit Cost of hardware to store & process numbers (some formats more difficult to add, multiply/divide, etc.) Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

From Patterson/Hennessey Slides
Unsigned Integers Positional number system: an-1an-2 …. a2a1a0 = an-1x2n-1 + an-2x2n-2 + … + a2x22 + a1x21 + a0x20 Range = [(2n -1) … 0] Carry out of MSB has weight 2n Fixed-point fraction: 0.a-1a-2 …. a-n = a-1x2-1 + a-2x2-2 + … + a-nx2-n Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Signed Integers Sign-magnitude format (n-bit values): A = San-2 …. a2a1a (S = sign bit) Range = [-(2n-1-1) … +(2n-1-1)] Addition/subtraction difficult (multiply easy) Redundant representations of 0 2’s complement format (n-bit values): -A represented by 2n – A Range = [-(2n-1) … +(2n-1-1)] Addition/subtraction easier (multiply harder) Single representation of 0 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Computing the 2’s Complement
To compute the 2’s complement of A: Let A = an-1an-2 …. a2a1a0 2n - A = 2n – A = (2n -1) - A + 1 … - an-1 an-2 …. a2 a1 a a’n-1a’n-2 …. a’2a’1a’ (one’s complement + 1) Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

2’s Complement Arithmetic
Let (2n-1-1) ≥ A ≥ 0 and (2n-1-1) ≥ B ≥ 0 Case 1: A + B (2n-2) ≥ (A + B) ≥ 0 Since result < 2n, there is no carry out of the MSB Valid result if (A + B) < 2n-1 MSB (sign bit) = 0 Overflow if (A + B) ≥ 2n-1 MSB (sign bit) = 1 if result ≥ 2n-1 Carry into MSB Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Case 2: A - B Compute by adding: A + (-B) 2’s complement: A + (2n - B) -2n-1 < result < 2n-1 (no overflow possible) If A ≥ B: 2n + (A - B) ≥ 2n Weight of adder carry output = 2n Discard carry (2n), keeping (A-B), which is ≥ 0 If A < B: 2n + (A - B) < 2n Adder carry output = 0 Result is 2n - (B - A) = 2’s complement representation of -(B-A) Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Case 3: -A - B Compute by adding: (-A) + (-B) 2’s complement: (2n - A) + (2n - B) = 2n + 2n - (A + B) Discard carry (2n), making result 2n - (A + B) = 2’s complement representation of -(A + B) 0 ≥ result ≥ -2n Overflow if -(A + B) < -2n-1 MSB (sign bit) = 0 if 2n - (A + B) < 2n-1 no carry into MSB Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Relational Operators Compute A-B & test ALU “flags” to compare A vs. B ZF = result zero OF = 2’s complement overflow SF = sign bit of result CF = adder carry output Signed Unsigned A = B ZF = 1 A <> B ZF = 0 A ≥ B (SF  OF) = 0 CF = 1 (no borrow) A > B (SF  OF) + ZF = 0 CF  ZF’ = 1 A ≤ B (SF  OF) + ZF = 1 CF  ZF’ = 0 A < B (SF  OF) = 1 CF = 0 (borrow) Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

MIPS Overflow Detection
An exception (interrupt) occurs when overflow detected for add,addi,sub Control jumps to predefined address for exception Interrupted address is saved for possible resumption Details based on software system / language example: flight control vs. homework assignment Don't always want to detect overflow — new MIPS instructions: addu, addiu, subu note: addiu still sign-extends! note: sltu, sltiu for unsigned comparisons Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Designing the Arithmetic & Logic Unit (ALU)
Provide arithmetic and logical functions as needed by the instruction set Consider tradeoffs of area vs. performance (Material from Appendix B) 32 operation result a b ALU Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Different Implementations
Not easy to decide the “best” way to build something Don't want too many inputs to a single gate (fan in) Don’t want to have to go through too many gates (delay) For our purposes, ease of comprehension is important Let's look at a 1-bit ALU for addition: How could we build a 1-bit ALU for add, and, and or? How could we build a 32-bit ALU? cout = a b + a cin + b cin sum = a xor b xor cin Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Building a 32 bit ALU Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

What about subtraction (a – b) ?
Two's complement approach: just negate b and add. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Adding a NOR function Can also choose to invert a. How do we get “a NOR b” ? A i n v e r t O p e r a t i o n B i n v e r t C a r r y I n a 1 1 R e s u l t b + 2 1 C a r r y O u t Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Tailoring the ALU to the MIPS
Need to support the set-on-less-than instruction (slt) remember: slt is an arithmetic instruction produces a 1 if rs < rt and 0 otherwise use subtraction: (a-b) < 0 implies a < b Need to support test for equality (beq $t5, $t6, $t7) use subtraction: (a-b) = 0 implies a = b Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Supporting slt B i n v e r t a b C y I O p o 1 2 + R s u l 3 L B i n v e r t a b C y I O u p o 1 2 + R s l A 3 L A i n v e r t 1 S e t O v e r f l o w O v e r f l o w d e t e c t i o n all other bits Use this ALU for most significant bit Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Supporting slt B i n a O p e r t i o n C y I A L U s u b 1 R l 2 3 v e r t A i n v e r t R e s u l t 2 . . . . . . . . . C a r r y I n R e s u l t 3 1 b 3 1 S e t s O v e r f l o w Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Test for equality Notice control lines: = and 0001 = or 0010 = add 0110 = subtract 0111 = slt 1100 = NOR Note: zero is a 1 when the result is zero! Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Conclusion We can build an ALU to support the MIPS instruction set key idea: use multiplexor to select the output we want we can efficiently perform subtraction using two’s complement we can replicate a 1-bit ALU to produce a 32-bit ALU Important points about hardware all of the gates are always working the speed of a gate is affected by the number of inputs to the gate the speed of a circuit is affected by the number of gates in series (on the “critical path” or the “deepest level of logic”) Our primary focus: comprehension, however, Clever changes to organization can improve performance (similar to using better algorithms in software) We saw this in multiplication, let’s look at addition now Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Problem: ripple carry adder is slow
Is a 32-bit ALU as fast as a 1-bit ALU? Is there more than one way to do addition? two extremes: ripple carry and sum-of-products Can you see the ripple? How could you get rid of it? c1 = b0c0 + a0c0 + a0b0 c2 = b1c1 + a1c1 + a1b1 c2 = c3 = b2c2 + a2c2 + a2b2 c3 = c4 = b3c3 + a3c3 + a3b3 c4 = Not feasible! Why? Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

One-bit Full-Adder Circuit
FAi XOR sumi ai XOR AND bi AND OR Ci+1 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

32-bit Ripple-Carry Adder
sum0 FA0 sum1 a1 b1 FA1 sum2 a2 b2 FA2 sum31 FA31 a31 b31 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

How Fast is Ripple-Carry Adder?
Longest delay path (critical path) runs from cin to sum31. Suppose delay of full-adder is 100ps. Critical path delay = 3,200ps Clock rate cannot be higher than 1012/3,200 = 312MHz. Must use more efficient ways to handle carry. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Fast Adders In general, any output of a 32-bit adder can be evaluated as a logic expression in terms of all 65 inputs. Levels of logic in the circuit can be reduced to log2N for N-bit adder. Ripple-carry has N levels. More gates are needed, about log2N times that of ripple-carry design. Fastest design is known as carry lookahead adder. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

N-bit Adder Design Options
Type of adder Time complexity (delay) Space complexity (size) Ripple-carry O(N) Carry-lookahead O(log2N) O(N log2N) Carry-skip O(√N) Carry-select Reference: J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, Second Edition, San Francisco, California, 1990. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Carry-lookahead adder
An approach in-between our two extremes Motivation: If we didn't know the value of carry-in, what could we do? When would we always generate a carry? gi = aibi When would we propagate the carry? pi = ai + bi Did we get rid of the ripple? c1 = g0 + p0c0 c2 = g1 + p1c1 c2 = g1 + p1 g0 + p1p0c0 c3 = g2 + p2c2 c3 = … c4 = g3 + p3c3 c4 = … Feasible! Why? Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Use principle to build bigger adders
Can’t build a 16 bit adder this way... (too big) Could use ripple carry of 4-bit CLA adders Better: use the CLA principle again! Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Carry-Select Adder 16-bit ripple carry adder a0-a15 b0-b15 cin sum0-sum15 a16-a31 16-bit ripple carry adder b16-b31 Multiplexer sum16-sum31 a16-a31 16-bit ripple carry adder 1 b16-b31 This is known as carry-select adder 1 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

ALU Summary We can build an ALU to support MIPS addition Our focus is on comprehension, not performance Real processors use more sophisticated techniques for arithmetic Where performance is not critical, hardware description languages allow designers to completely automate the creation of hardware! Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Multiplication More complicated than addition accomplished via shifting and addition More time and more area Let's look at 3 versions based on a gradeschool algorithm (multiplicand) __x_ (multiplier) Negative numbers: convert and multiply there are better techniques, we won’t look at them Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Multiplication: Implementation
Datapath Control Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Final Version Multiplier starts in right half of product What goes here? Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Multiplying Signed Numbers with Boothe’s Algorithm
Consider A x B where A and B are signed integers (2’s complemented format) Decompose B into the sum B1 + B2 + … + Bn A x B = A x (B1 + B2 + … + Bn) = (A x B1) + (A x B2) + … (A x Bn) Let each Bi be a single string of 1’s embedded in 0’s: …0011…1100… Example: = Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Boothe’s Algorithm Scanning from right to left, bit number u is the first 1 bit of the string and bit v is the first 0 left of the string: v u Bi = 0 … 0 1 … 1 0 … 0 = 0 … 0 1 … 1 1 … (2v – 1) - 0 … 0 0 … 0 1 … (2u – 1) = (2v – 1) - (2u – 1) = 2v – 2u Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Boothe’s Algorithm Decomposing B: A x B = A x (B1 + B2 + … ) = A x [(2v1 – 2u1) + (2v2 – 2u2) + …] = (A x 2v1) – (A x 2u1) + (A x 2v2) – (A x 2u2) … A x B can be computed by adding and subtracted shifted values of A: Scan bits right to left, shifting A once per bit When the bit string changes from 0 to 1, subtract shifted A from the current product P – (A x 2u) When the bit string changes from 1 to 0, add shifted A to the current product P + (A x 2v) Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Floating Point Numbers
We need a way to represent a wide range of numbers numbers with fractions, e.g., large number: 976,000,000,000,000 = 9.76 × 1014 small number: = 9.76 × 10-14 Representation: sign, exponent, significand: (–1)sign × significand × 2exponent more bits for significand gives more accuracy more bits for exponent increases range Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Scientific Notation Scientific notation: 0.525×105 = 5.25×104 = 52.5×103 5.25 ×104 is in normalized scientific notation. position of decimal point fixed leading digit non-zero Binary numbers 5.25 = = ×22 Binary point multiplication by 2 moves the point to the left division by 2 moves the point to the right Known as floating point format. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Binary to Decimal Conversion
Binary (-1)S (1.b1b2b3b4) × 2E Decimal (-1)S × (1 + b1×2-1 + b2×2-2 + b3×2-3 + b4×2-4) × 2E Example: × 2-2 (binary) = - ( ) ×2-2 = - ( )/4 = /4 = (decimal) Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

IEEE Std. 754 Floating-Point Format
Single-Precision S E: 8-bit Exponent F: 23-bit Fraction bits 23-30 bits 0-22 bit 31 Double-Precision S E: 11-bit Exponent F: 52-bit Fraction + bits 20-30 bits 0-19 bit 31 Continuation of 52-bit Fraction bits 0-31 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

IEEE 754 floating-point standard
Represented value = (–1)sign × (1+F) × 2exponent – bias Exponent is “biased” (excess-K format) to make sorting easier bias of 127 for single precision and 1023 for double precision E values in [ ] (0 and 255 reserved) Range = to (1038 to 10+38) Significand in sign-magnitude, normalized form Significand = (1 + F) = 1. b-1b-1b-1…b-23 Suppress storage of leading 1 Overflow: Exponent requiring more than 8 bits. Number can be positive or negative. Underflow: Fraction requiring more than 23 bits. Number can be positive or negative. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

IEEE 754 floating-point standard
Example: Decimal: = - ( ½ + ¼ ) Binary: = x 22 Floating point: exponent = 129 = IEEE single precision: Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Examples Biased exponent (0-255), bias 127 ( ) to be subtracted × = = × 220 × = = × 220 × = = × 2-20 × = = × 2-20 0.5 0.125 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Numbers in 32-bit Formats
Two’s complement integers Floating point numbers Ref: W. Stallings, Computer Organization and Architecture, Sixth Edition, Upper Saddle River, NJ: Prentice-Hall. Expressible numbers -231 231-1 Positive underflow Negative underflow Negative Overflow Expressible negative numbers Expressible positive numbers Positive Overflow -2-127 2-127 - (2 – 2-23)×2127 (2 – 2-23)×2127 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

IEEE 754 Special Codes Zero S ± 1.0 × 2-127 Smallest positive number in single-precision IEEE 754 standard. Interpreted as positive/negative zero. Exponent less than -127 is positive underflow (regard as zero). Infinity S ± 1.0 × 2128 Largest positive number in single-precision IEEE 754 standard. Interpreted as ± ∞ If true exponent = 128 and fraction ≠ 0, then the number is greater than ∞. It is called “not a number” or NaN (interpret as ∞). Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Addition and Subtraction
Addition/subtraction of two floating-point numbers: Example: x Align mantissas: x 104 + 3 x x 104 3.2 x 104 General Case: m1 x 2e1 ± m2 x 2e2 = (m1 ± m2 x 2e2-e1) x 2e for e1 > e2 = (m1 x 2e1-e2 ± m2) x 2e2 for e2 > e1 Shift smaller mantissa right by | e1 - e2| bits to align the mantissas. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Addition/Subtraction Algorithm
0. Zero check - Change the sign of subtrahend - If either operand is 0, the other is the result 1. Significand alignment: right shift smaller significand until two exponents are identical. 2. Addition: add significands and report exception if overflow occurs. 3. Normalization - Shift significand bits to normalize. - report overflow or underflow if exponent goes out of range. 4. Rounding Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

FP Add/Subtract (PH Text Figs. 3.16/17)
Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Example Subtraction: 0.5ten ten Step 0: Floating point numbers to be added 1.000two×2-1 and two×2-2 Step 1: Significand of lesser exponent is shifted right until exponents match -1.110two×2-2 → two×2-1 Step 2: Add significands, 1.000two + ( two) Result is 0.001two ×2-1 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Example (Continued) Step 3: Normalize, 1.000two× 2-4 No overflow/underflow since 127 ≥ exponent ≥ -126 Step 4: Rounding, no change since the sum fits in 4 bits. 1.000two ×2-4 = (1+0)/16 = ten Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

FP Multiplication: Basic Idea
(m1 x 2e1) x (m2 x 2e2) = (m1 x m2) x 2e1+e2 Separate signs Add exponents Multiply significands Normalize, round, check overflow Replace sign Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

FP Multiplication Algorithm
P-H Figure 3.18 Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

FP Mult. Illustration Multiply 0.5ten and ten (answer = ten) or Multiply 1.000two×2-1 and two×2-2 Step 1: Add exponents -1 + (-2) = -3 Step 2: Multiply significands 1.000 ×1.110 0000 1000 Product is Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

FP Mult. Illustration (Cont.)
Step 3: Normalization: If necessary, shift significand right and increment exponent. Normalized product is × 2-3 Check overflow/underflow: 127 ≥ exponent ≥ -126 Step 4: Rounding: × 2-3 Step 5: Sign: Operands have opposite signs, Product is × 2-3 Decimal value = - ( )/8 = ten Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

FP Division: Basic Idea
Separate sign. Check for zeros and infinity. Subtract exponents. Divide significands. Normalize/overflow/underflow. Rounding. Replace sign. Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

MIPS Floating Point 32 floating point registers, $f0, , $f31 FP registers used in pairs for double precision; $f0 denotes double precision content of $f0,$f1 Data transfer instructions: lwc1 $f1, 100($s2) # $f1←Mem[$s1+100] swc1 $f1, 100($s2) # Mem[$s2+100]←$f1 Arithmetic instructions: (xxx=add, sub, mul, div) xxx.s single precision xxx.d double precision Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Floating Point Complexities
Operations are somewhat more complicated (see text) In addition to overflow we can have “underflow” Accuracy can be a big problem IEEE 754 keeps two extra bits, guard and round four rounding modes positive divided by zero yields “infinity” zero divide by zero yields “not a number” other complexities Implementing the standard can be tricky Not using the standard can be even worse see text for description of 80x86 and Pentium bug! Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Chapter Three Summary Computer arithmetic is constrained by limited precision Bit patterns have no inherent meaning but standards do exist two’s complement IEEE 754 floating point Computer instructions determine “meaning” of the bit patterns Performance and accuracy are important so there are many complexities in real machines Algorithm choice is important and may lead to hardware optimizations for both space and time (e.g., multiplication) You may want to look back (Section 3.10 is great reading!) Spring 2005 ELEC 5200/6200 From Patterson/Hennessey Slides

Chapter 3 Arithmetic for Computers

Similar presentations

Presentation on theme: "Chapter 3 Arithmetic for Computers"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Chapter 3 Arithmetic for Computers

Similar presentations

Presentation on theme: "Chapter 3 Arithmetic for Computers"— Presentation transcript:

Similar presentations

About project

Feedback