Unrolling Carry Recurrence
Carry-Lookahead Equations
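For reference, the carry recurrence and its unrolled form for the first four positions can be written out as below. This is a sketch in the usual notation, which is assumed here: g_i = x_i y_i is the generate signal and p_i = x_i XOR y_i (or x_i + y_i) is the propagate signal of bit position i.

```latex
\begin{align*}
c_{i+1} &= g_i + c_i p_i \\
c_1 &= g_0 + c_0 p_0 \\
c_2 &= g_1 + g_0 p_1 + c_0 p_0 p_1 \\
c_3 &= g_2 + g_1 p_2 + g_0 p_1 p_2 + c_0 p_0 p_1 p_2 \\
c_4 &= g_3 + g_2 p_3 + g_1 p_2 p_3 + g_0 p_1 p_2 p_3 + c_0 p_0 p_1 p_2 p_3
\end{align*}
```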
4-Bit CLA
Circuit Structure
CLA Complexity
Managing CLA Complexity
Multilevel CLA Example
Radix-16 Addition
  - Two Binary Numbers Grouped into Hex Digits
  - Block Generate and Propagate Signals in Each Radix-16 Digit
  - Replace c4 Position of CLA Network with Block Signals g[i,i+3] and p[i,i+3]
  - Results in 4-bit “Lookahead Carry Generator”
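The block signals of each hex digit can be sketched in Python as follows. The function names (`bit_gp`, `block_gp`, `digit_carries`) are hypothetical, and p_i = x_i XOR y_i is assumed. The loop in `digit_carries` passes the carry from digit to digit one after another for clarity; a hardware lookahead carry generator would produce c4, c8 and c12 in parallel from the same block signals.

```python
def bit_gp(x, y, width):
    """Per-bit generate and propagate lists, index 0 = least significant bit."""
    g = [(x >> i) & (y >> i) & 1 for i in range(width)]
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(width)]
    return g, p

def block_gp(g, p, i):
    """Block signals g[i,i+3] and p[i,i+3] for the hex digit starting at bit i."""
    # Same expression as c4 of a 4-bit CLA with the carry-in term left out.
    bg = (g[i + 3]
          | (g[i + 2] & p[i + 3])
          | (g[i + 1] & p[i + 2] & p[i + 3])
          | (g[i] & p[i + 1] & p[i + 2] & p[i + 3]))
    bp = p[i] & p[i + 1] & p[i + 2] & p[i + 3]
    return bg, bp

def digit_carries(x, y, width=16, c0=0):
    """Carries into each hex digit plus the carry out: [c0, c4, c8, c12, c16]."""
    g, p = bit_gp(x, y, width)
    carries = [c0]
    for i in range(0, width, 4):
        bg, bp = block_gp(g, p, i)
        # A digit passes a carry on if it generates one or propagates the incoming one.
        carries.append(bg | (carries[-1] & bp))
    return carries

print(digit_carries(0x0FFF, 0x0001))
```

In the example call, the carry generated in the lowest digit is propagated through the two digits equal to F and absorbed by the top digit.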
CLA Design
Lookahead Carry Generator
Block Generate and Propagate
  - Assume i0 < i1 < i2
  - Example: g[0,3] is Generate Signal of Block for Bits 0-3
  - Relationships Allow for Merging of Blocks:
      g[i0,i2-1] = g[i1,i2-1] + g[i0,i1-1] p[i1,i2-1]
      p[i0,i2-1] = p[i0,i1-1] p[i1,i2-1]
  - Blocks Being Merged May Also Overlap
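A minimal sketch of merging two adjacent blocks, assuming p_i = x_i XOR y_i. The helper names (`bit_gp`, `direct_block`, `merge`) are hypothetical. The merged signals are compared with the signals computed bit by bit over the whole range.

```python
def bit_gp(x, y, width):
    """Per-bit generate and propagate lists, index 0 = least significant bit."""
    g = [(x >> i) & (y >> i) & 1 for i in range(width)]
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(width)]
    return g, p

def direct_block(g, p, lo, hi):
    """(g[lo,hi], p[lo,hi]) computed bit by bit from the carry recurrence."""
    bg, bp = 0, 1
    for i in range(lo, hi + 1):
        bg = g[i] | (bg & p[i])
        bp = bp & p[i]
    return bg, bp

def merge(lower, upper):
    """Merge block [i0,i1-1] (lower) with block [i1,i2-1] (upper)."""
    g_lo, p_lo = lower
    g_hi, p_hi = upper
    # The merged block generates if the upper block generates, or if the
    # lower block generates and the upper block propagates.
    return g_hi | (g_lo & p_hi), p_lo & p_hi

g, p = bit_gp(0b1011, 0b0101, 4)
low = direct_block(g, p, 0, 1)    # block for bits 0-1
high = direct_block(g, p, 2, 3)   # block for bits 2-3
print(merge(low, high), direct_block(g, p, 0, 3))
```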
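Lookahead Carry Generator Example
  [Figure: Merged LAG. Four 4-bit CLA blocks (CLA 3 down to the least significant CLA) add x15-12/y15-12, x11-8/y11-8, x7-4/y7-4 and x3-0/y3-0, with carry inputs c12, c8, c4 and c0, and produce s15-12, s11-8, s7-4 and s3-0. Their block signals g15-12 p15-12, g11-8 p11-8, g7-4 p7-4 and g3-0 p3-0 feed the Lookahead Carry Generator, which also outputs g15-0 and p15-0.]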
CLA Latency
CLA Architecture
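Overlapped LAGs
  - Overlap Blocks [i1,j1] and [i0,j0] (with [i1,j1] the More Significant Block)
  - Relationships Become:
      g[i0,j1] = g[i1,j1] + g[i0,j0] p[i1,j1]
      p[i0,j1] = p[i0,j0] p[i1,j1]
  - Useful for Building Trees of Different Shapes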
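The claim that overlapping blocks can be merged with the same relationships can be checked exhaustively for small operands. The sketch below assumes p_i = x_i XOR y_i and assumes the block ranges satisfy i0 < i1 <= j0 + 1 and j1 >= j0 (adjacent or overlapping). The function names are hypothetical.

```python
from itertools import product

def block(x, y, lo, hi):
    """(g[lo,hi], p[lo,hi]) for operand bits lo..hi."""
    bg, bp = 0, 1
    for i in range(lo, hi + 1):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        bg = (xi & yi) | (bg & (xi ^ yi))
        bp &= xi ^ yi
    return bg, bp

def merge_overlapped(lower, upper):
    """Combine block [i0,j0] with an adjacent or overlapping block [i1,j1]."""
    (g_lo, p_lo), (g_hi, p_hi) = lower, upper
    return g_hi | (g_lo & p_hi), p_lo & p_hi

def overlap_holds(width=5):
    """Check the merge for every operand pair and every allowed block choice."""
    for x, y in product(range(1 << width), repeat=2):
        for i0 in range(width):
            for j0 in range(i0, width):
                for i1 in range(i0 + 1, j0 + 2):           # i0 < i1 <= j0 + 1
                    for j1 in range(max(i1, j0), width):   # j1 >= j0
                        merged = merge_overlapped(block(x, y, i0, j0),
                                                  block(x, y, i1, j1))
                        if merged != block(x, y, i0, j1):
                            return False
    return True

print(overlap_holds())
```

Bits shared by both blocks do no harm because the AND and OR terms are unchanged when a bit's signals appear twice.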
CLA With LAG
CLA Latency
  - Example: 64-bit CLA in 13 Gate Levels Since 4^3 = 64
  - Generates Final Carry Out for Fig. 6.5
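The 13-level figure is consistent with a latency of 4 log4(k) + 1 gate levels for a k-bit CLA built from 4-bit blocks. That formula is an assumption here, not something stated on the slide. A small sketch under that assumption, with a hypothetical function name:

```python
def cla_gate_levels(k):
    """Gate levels of a k-bit CLA with 4-bit blocks, assuming 4*log4(k) + 1."""
    lookahead_levels = 0
    span = 1
    while span < k:            # count how many times 4 must be multiplied to cover k bits
        span *= 4
        lookahead_levels += 1
    return 4 * lookahead_levels + 1

for k in (4, 16, 64, 256):
    print(k, cla_gate_levels(k))
```

For k = 64 the loop runs three times because 4^3 = 64, which gives 4 * 3 + 1 = 13.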
Ling Adders
Ling Adders – Wired OR
Block p and g Generators
Carry Determination as Prefix Computations
  - Two Contiguous (or Overlapping) Blocks (g', p') and (g'', p'')
  - Merged Block (g, p):
      g = g'' + g' p''
      p = p' p''
  - Large Group Generates Carry if:
      - Left Group Generates Carry, or
      - Right Group Generates and Left Group Propagates
Carry Operator, ¢
  - Define Operator Over (g, p) Pairs:
      (g, p) = (g', p') ¢ (g'', p'')
      g = g'' + g' p''
      p = p' p''
  - ¢ is Associative:
      (g', p') ¢ (g'', p'') ¢ (g''', p''')
        = [(g', p') ¢ (g'', p'')] ¢ (g''', p''')
        = (g', p') ¢ [(g'', p'') ¢ (g''', p''')]
Carry Operator, ¢ (cont)
  - ¢ is NOT Commutative:
      (g', p') ¢ (g'', p'') ≠ (g'', p'') ¢ (g', p')
  - This is Easy to See Because:
      g'' + g' p'' ≠ g' + g'' p'
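Because each (g, p) pair takes only four values, both properties of the carry operator can be checked by enumeration. In this sketch the function `carry_op` (a hypothetical name) takes the less significant block as its first argument, which is an assumed convention.

```python
from itertools import product

def carry_op(left, right):
    """(g', p') ¢ (g'', p''): `left` is the less significant block."""
    g1, p1 = left
    g2, p2 = right
    return g2 | (g1 & p2), p1 & p2

PAIRS = list(product((0, 1), repeat=2))   # all four (g, p) combinations

def is_associative():
    """True if both groupings agree for every triple of (g, p) pairs."""
    return all(carry_op(carry_op(a, b), c) == carry_op(a, carry_op(b, c))
               for a, b, c in product(PAIRS, repeat=3))

def commutativity_counterexamples():
    """All ordered pairs (a, b) for which a ¢ b differs from b ¢ a."""
    return [(a, b) for a, b in product(PAIRS, repeat=2)
            if carry_op(a, b) != carry_op(b, a)]

print(is_associative())
print(commutativity_counterexamples()[0])
```

One counterexample is (0, 0) combined with (1, 0): with the generating block on the more significant side the result generates a carry, while with the blocks swapped the carry is killed.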
Prefix Adders
Carry Determination
  - Assume Adder with NO cIN:
      c_{i+1} = g[0,i]
      Carry Enters Stage i+1 iff Generated in Block [0,i]
  - Assume Adder with cIN = 1:
      Viewed as Generated Carry from Stage -1
      p_{-1} = 0, g_{-1} = cIN
      Compute g[-1,i] For All i
  - Formulate Carry Determination as:
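The stage -1 view can be sketched as a serial scan with the carry operator. The function names below are hypothetical, p_i = x_i XOR y_i is assumed, and the scan is sequential only for readability; a prefix network would compute the same g[-1,i] values in parallel.

```python
def carry_op(left, right):
    """(g', p') ¢ (g'', p''): `left` is the less significant block."""
    g1, p1 = left
    g2, p2 = right
    return g2 | (g1 & p2), p1 & p2

def all_carries(x, y, width, c_in=0):
    """Return [c_0, c_1, ..., c_width] from a prefix scan over (g, p) pairs."""
    acc = (c_in, 0)              # stage -1: g_{-1} = c_in, p_{-1} = 0
    carries = [c_in]
    for i in range(width):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        acc = carry_op(acc, (xi & yi, xi ^ yi))   # acc is now (g[-1,i], p[-1,i])
        carries.append(acc[0])                    # c_{i+1} = g[-1,i]
    return carries

def add(x, y, width, c_in=0):
    """Form the sum bits s_i = x_i XOR y_i XOR c_i and append the carry out."""
    c = all_carries(x, y, width, c_in)
    s = 0
    for i in range(width):
        s |= ((((x >> i) ^ (y >> i)) & 1) ^ c[i]) << i
    return s | (c[width] << width)

print(add(0b1011, 0b0110, 4, c_in=1))
```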
Prefix Computation
Prefix Sums Analogy
  - Designs for Prefix Sums Can be Converted to Carry Computation
  - Replace Adder with ¢ Operator
  - Addition IS Commutative, Order Doesn’t Matter
  - Can Group (g, p) in Any Way to Combine Into Block Signals (as Long as Order is Preserved)
  - (g, p) Allows for Overlapping Groups, Prefix Sums Do Not (Sum Would Contain Some Values Added Two or More Times)
Prefix Sum Network
  [Figure: prefix sum network, annotated with delay (adder levels) and cost (# of adders)]
Another Way for Prefix Sums
  - Compute the Following First:
      x0+x1, x2+x3, x4+x5, ..., x_{k-2}+x_{k-1}
  - A k/2-Input Prefix Sum Network on These Pair Sums Yields the Odd-Indexed Partial Sums s1, s3, s5, ..., s_{k-1}
  - Next, Even-Indexed Sums Computed As:
      s_{2j} = s_{2j-1} + x_{2j}
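The pair-then-recurse scheme can be sketched as a recursive Python function. The name `prefix_pairwise` is hypothetical and the input length is assumed to be a power of two. The operator is a parameter, and operand order is preserved so that a non-commutative operator can be used in place of addition.

```python
import operator

def prefix_pairwise(xs, op=operator.add):
    """Prefix computation by pairing neighbours, recursing, then filling in."""
    k = len(xs)
    if k == 1:
        return list(xs)
    # Step 1: combine neighbours x0+x1, x2+x3, ...
    pairs = [op(xs[2 * j], xs[2 * j + 1]) for j in range(k // 2)]
    # Step 2: a k/2-input prefix computation on the pairs gives s1, s3, ..., s_{k-1}
    odd = prefix_pairwise(pairs, op)
    # Step 3: s_{2j} = s_{2j-1} + x_{2j}; s_0 is just x_0
    out = [None] * k
    out[0] = xs[0]
    for j in range(k // 2):
        out[2 * j + 1] = odd[j]
        if j > 0:
            out[2 * j] = op(odd[j - 1], xs[2 * j])
    return out

print(prefix_pairwise([1, 2, 3, 4, 5, 6, 7, 8]))
print(prefix_pairwise(list("abcd")))   # string concatenation is not commutative
```

String concatenation stands in for an order-sensitive operator here. The prefixes come out in the right order because each call to `op` keeps the lower-indexed operand on the left.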
Alternative Prefix Sum Network
Comparison of Prefix Sum Networks
  - First Design Faster: log2(k) versus 2 log2(k) - 2 (Levels)
  - First Design has High Fan-out Requirements
  - First Design Requires More Cells: (k/2) log2(k) versus 2k - 2 - log2(k)
  - Second Design is Brent-Kung Parallel Prefix Graph
  - First Design is Kogge-Stone Parallel Prefix Graph (Once Fan-out is Avoided by Distributing Computations)
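The figures above can be tabulated, and the cell count of the first design checked by counting adders in a direct recursive sketch of it. The names `prefix_divide` and `closed_forms` are hypothetical, and k is assumed to be a power of two.

```python
def prefix_divide(xs, counter):
    """First design: solve both halves, then add the lower total to the upper half."""
    k = len(xs)
    if k == 1:
        return list(xs)
    lower = prefix_divide(xs[:k // 2], counter)
    upper = prefix_divide(xs[k // 2:], counter)
    counter[0] += k // 2          # one adder per output of the upper half
    # lower[-1] fans out to all k/2 of these adders (the high fan-out noted above)
    return lower + [lower[-1] + u for u in upper]

def closed_forms(k):
    """(levels, cells) of each design from the formulas quoted in the comparison."""
    lg = k.bit_length() - 1       # log2(k) for a power of two
    return {"first": (lg, (k // 2) * lg),
            "second": (2 * lg - 2, 2 * k - 2 - lg)}

for k in (8, 16, 32, 64):
    count = [0]
    prefix_divide(list(range(k)), count)
    print(k, count[0], closed_forms(k))
```

For k = 16 the recursion uses 32 adders, matching (k/2) log2(k), while the second design needs 26 cells but 6 levels instead of 4.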
Brent-Kung Network
  [Figure annotation: independent, so single delay]
Kogge-Stone Network
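A minimal sketch of the Kogge-Stone pattern, with hypothetical function names. At each level every position combines with the position a fixed distance below it, and the distance doubles from level to level. The count returned is log2(k) levels and k log2(k) - k + 1 cells, which is more than the (k/2) log2(k) cells of the recursive first design because fan-out is traded for extra cells.

```python
def carry_op(left, right):
    """(g', p') ¢ (g'', p''): `left` is the less significant block."""
    g1, p1 = left
    g2, p2 = right
    return g2 | (g1 & p2), p1 & p2

def kogge_stone_scan(items, op):
    """Inclusive prefix scan; returns (results, levels, cells)."""
    cur = list(items)
    k = len(cur)
    dist, levels, cells = 1, 0, 0
    while dist < k:
        nxt = list(cur)
        for i in range(dist, k):
            nxt[i] = op(cur[i - dist], cur[i])   # lower-index span on the left
            cells += 1
        cur = nxt
        dist *= 2
        levels += 1
    return cur, levels, cells

x, y, width = 0b10110110, 0b01101011, 8
pairs = [((x >> i) & (y >> i) & 1, ((x >> i) ^ (y >> i)) & 1) for i in range(width)]
spans, levels, cells = kogge_stone_scan(pairs, carry_op)
carries = [g for g, _ in spans]   # c_{i+1} = g[0,i] when there is no carry-in
print(carries, levels, cells)
```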
Area/Levels of Prefix Networks
Hybrid Parallel Prefix Network