# COMBINATIONAL LOGIC DYNAMICS

Fast Complex Gates: Design Technique 1
Transistor sizing helps as long as fan-out capacitance dominates.
Progressive sizing: treat the series stack as a distributed RC line and size $M_1 > M_2 > M_3 > \dots > M_N$ (the FET closest to the output is the smallest).
$M_1$ has to carry the discharge current from $M_2, M_3, \dots, M_N$ and $C_L$, so make it the largest. $M_N$ only has to carry the discharge current of $C_L$ (no internal capacitances).
Progressive sizing can reduce delay by more than 20%.
[Figure: series NMOS stack with inputs In1 … InN driving M1 … MN, internal node capacitances C1, C2, C3, and load CL at the output.]

Fast Complex Gates: Design Technique 2
Transistor ordering: the critical input is the latest-arriving signal. Place the latest-arriving signal (the critical path) closest to the output.
[Figure, left: critical input In1 (0→1) drives M1, the transistor farthest from the output, with In2 = In3 = 1; CL, C1, and C2 are all charged, so the delay is determined by the time to discharge CL, C1, and C2.]
[Figure, right: critical input In1 (0→1) drives M3, the transistor closest to the output, with In2 = In3 = 1; C1 and C2 are already discharged, so the delay is determined by the time to discharge CL.]

Fast Complex Gates: Design Technique 3
Alternative logic structures: F = ABCDEFGH.
Reduced fan-in leads to deeper logic depth.
The reduction in fan-in offsets, by far, the extra delay incurred by the NOR gate (second configuration).
Only simulation will tell which of the last two configurations is faster or lower power.

Fast Complex Gates: Design Technique 4
Isolating fan-in from fan-out using buffer insertion.
Reduce $C_L$ on large fan-in gates, especially for large $C_L$, and size the inverters progressively to handle $C_L$ more effectively.

Fast Complex Gates: Design Technique 5
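Reducing the voltage swing gives a linear reduction in delay and also reduces power consumption.
But the following gate is much slower, or it requires "sense amplifiers" to restore the signal level (as in memory design).
At full swing, $t_{pHL} = 0.69 \cdot \frac{3}{4} \cdot \frac{C_L V_{DD}}{I_{DSATn}}$; in general, with output swing $V_{swing}$, $t_{pHL} = 0.69 \cdot \frac{3}{4} \cdot \frac{C_L V_{swing}}{I_{DSATn}}$.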
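The linear dependence of delay on swing can be checked numerically. The sketch below evaluates the slide's expression for two swings; the load capacitance and saturation current are illustrative assumptions, not values from the deck.

```python
def t_phl(c_load, v_swing, i_dsat_n):
    """High-to-low delay from the slide: 0.69 * (3/4) * C_L * V_swing / I_DSATn."""
    return 0.69 * 0.75 * c_load * v_swing / i_dsat_n


# Assumed example values (not from the slides): 100 fF load, 200 uA saturation current.
C_LOAD = 100e-15
I_DSAT_N = 200e-6

t_full = t_phl(C_LOAD, 2.5, I_DSAT_N)   # swing equal to a 2.5 V supply
t_half = t_phl(C_LOAD, 1.25, I_DSAT_N)  # swing reduced to half the supply

print(f"full swing: {t_full * 1e12:.1f} ps, half swing: {t_half * 1e12:.1f} ps")
print(f"delay ratio: {t_half / t_full:.2f}")
```

Halving the swing halves the modeled delay, which is the linear reduction the slide refers to; the model says nothing about the slower following gate.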

Sizing Logic Paths for Speed
Frequently, the input capacitance of a logic path is constrained.
The logic also has to drive some capacitance. Example: the ALU load in an Intel microprocessor is 0.5 pF.
How do we size the ALU datapath to achieve maximum speed?
We have already solved this for the inverter chain. Can we generalize it to any type of logic?

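Buffer Example
[Figure: chain of N inverters (stages 1, 2, …, N) between In and Out, driving load CL; delay is expressed in units of $t_{inv}$.]
For a given N: $C_{i+1}/C_i = C_i/C_{i-1}$.
To find N: $C_{i+1}/C_i \approx 4$.
How can this be generalized to any logic path?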
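The two rules above (equal ratios between consecutive stages, and a ratio of about 4) can be turned into a small sizing routine. This is a minimal sketch; the function name, its parameters, and the example capacitances are hypothetical.

```python
import math


def size_inverter_chain(c_in, c_load, target_ratio=4.0):
    """Pick the stage count N so that the per-stage ratio is close to target_ratio,
    then size the stages geometrically (C_{i+1}/C_i is the same for every stage)."""
    total_fanout = c_load / c_in
    n = max(1, round(math.log(total_fanout) / math.log(target_ratio)))
    ratio = total_fanout ** (1.0 / n)
    caps = [c_in * ratio ** i for i in range(n)]  # input capacitance of each inverter
    return n, ratio, caps


# Assumed example: load is 64 times the input capacitance of the first inverter.
n_stages, stage_ratio, stage_caps = size_inverter_chain(c_in=1.0, c_load=64.0)
print(n_stages, round(stage_ratio, 3), [round(c, 3) for c in stage_caps])
```

With a total fanout of 64 the routine settles on three stages with a ratio of 4 per stage.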

Logical Effort
p: intrinsic delay ($3kR_{unit}C_{unit}\gamma$), a gate parameter that is not a function of W ($\neq f(W)$).
g: logical effort ($kR_{unit}C_{unit}$), a gate parameter that is not a function of W ($\neq f(W)$).
f: effective fanout.
Normalize everything to an inverter: $g_{inv} = 1$, $p_{inv} = 1$.
Divide everything by $t_{inv}$ (everything is measured in unit delays $t_{inv}$).
Assume $\gamma = 1$.

Delay in a Logic Gate
Gate delay: $d = h + p$ (effort delay plus intrinsic delay).
Effort delay: $h = g f$ (logical effort times effective fanout, $f = C_{out}/C_{in}$).
Logical effort is a function of topology, independent of sizing.
Effective fanout (electrical effort) is a function of load/gate size.

Logical Effort
The inverter has the smallest logical effort and intrinsic delay of all static CMOS gates.
The logical effort of a gate represents the ratio of its input capacitance to the inverter input capacitance when sized to deliver the same current.
Logical effort increases with gate complexity.

Intrinsic Delay
The inverter has the smallest intrinsic delay of all static CMOS gates.
The intrinsic delay of a gate represents the ratio of its output capacitance to the inverter output capacitance when sized to deliver the same current.
Intrinsic delay increases with gate complexity.

Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter with the same output current.
[Figure: three gate schematics. Inverter: g = p = 1. 2-input NAND: g = 4/3, p = 2. 2-input NOR: g = 5/3, p = 2.]

Logical Effort of Gates
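[Figure: normalized delay d versus fan-out f (1 to 7) for an inverter (g = 1, p = 1, d = f + 1) and a 2-input NAND (g = 4/3, p = 2, d = (4/3)f + 2). The intercepts $p_{INV}$ and $p_{NAND}$ are the intrinsic delays; the part of the delay that grows with f is the effort delay.]
$h = g f$ and $d = h + p$, where g is the logical effort, f the effective fanout, and p the intrinsic delay.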
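The two lines in the plot can be tabulated directly from $d = g f + p$. The snippet below is a small sketch using the inverter and 2-input NAND values quoted on this slide; the helper name `gate_delay` is an illustrative choice.

```python
from fractions import Fraction


def gate_delay(g, f, p):
    """Normalized gate delay d = g*f + p, in units of t_inv."""
    return g * f + p


# (logical effort g, intrinsic delay p) as quoted on the slide.
GATES = {
    "inverter": (Fraction(1), 1),
    "2-input NAND": (Fraction(4, 3), 2),
}

for fanout in range(1, 6):
    row = {name: str(gate_delay(g, fanout, p)) for name, (g, p) in GATES.items()}
    print(f"f = {fanout}: {row}")
```

Exact fractions are used so the NAND delays stay in thirds rather than rounded decimals.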

Add Branching Effort
Branching effort: $b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$, where $C_{off\text{-}path}$ is the branch capacitance.

Multistage Networks
Stage effort: $h_i = g_i f_i$
Path electrical effort: $F = C_{out}/C_{in}$
Path logical effort: $G = g_1 g_2 \cdots g_N$
Branching effort: $B = b_1 b_2 \cdots b_N$
Path effort: $H = GFB$
Path delay: $D = \sum d_i = \sum p_i + \sum h_i$

Optimum Effort per Stage
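The path delay is minimized when each stage bears the same effort: $h^N = H$, so $h = H^{1/N}$.
Stage efforts: $g_1 f_1 = g_2 f_2 = \dots = g_N f_N$
Effective fanout of each stage: $f_i = h / g_i$
Minimum path delay: $D = \sum (g_i f_i + p_i) = N H^{1/N} + \sum p_i$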

Logical Effort From Sutherland, Sproull

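Example: 8-input AND
[Figure: three implementations of an 8-input AND, annotated with logical efforts g and intrinsic delays p. The fan-out is not known here.]
Configuration 1: 8-input NAND (g = 10/3, p = 8) followed by an inverter (g = 1, p = 1).
Configuration 2: 4-input NAND (g = 2, p = 4) followed by a 2-input NOR (g = 5/3, p = 2).
Configuration 3: 2-input NAND (g = 4/3), 2-input NOR (g = 5/3), 2-input NAND (g = 4/3), inverter (g = 1, p = 1); the intrinsic delays of the first three stages are left blank on the slide.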
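Because the fan-out is not given, the three configurations can only be compared as a function of the path electrical effort F. The sketch below does that under stated assumptions: no branching (B = 1), every stage sized for equal effort so that $D = N (GF)^{1/N} + \sum p_i$, and p = 2 for the 2-input gates in the third configuration (the value the deck quotes for 2-input NAND and NOR, since the slide leaves those entries blank).

```python
import math
from fractions import Fraction as Fr

# name -> (logical efforts per stage, intrinsic delays per stage)
CONFIGS = {
    "NAND8 + INV": ([Fr(10, 3), Fr(1)], [8, 1]),
    "NAND4 + NOR2": ([Fr(2), Fr(5, 3)], [4, 2]),
    # p = 2 for the 2-input gates is an assumption; the slide leaves it blank.
    "NAND2 + NOR2 + NAND2 + INV": ([Fr(4, 3), Fr(5, 3), Fr(4, 3), Fr(1)], [2, 2, 2, 1]),
}


def path_logical_effort(gs):
    """G is the product of the stage logical efforts."""
    return math.prod(gs)


def min_path_delay(gs, ps, fanout):
    """Minimum normalized delay with equal stage efforts and B = 1."""
    n = len(gs)
    path_effort = float(path_logical_effort(gs)) * fanout
    return n * path_effort ** (1.0 / n) + sum(ps)


for fanout in (1, 10, 100):
    delays = {name: round(min_path_delay(gs, ps, fanout), 2)
              for name, (gs, ps) in CONFIGS.items()}
    print(f"F = {fanout}: {delays}")
```

Under these assumptions the ranking of the last two configurations changes with F, which is consistent with the deck's remark that only simulation will settle which one is faster.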

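Example: Optimize Path
[Figure: four-stage path between the input load and the output load; a, b, and c scale the second, third, and fourth gates.]
$g_1 = 1$, $f_1 = a g_2 / g_1$
$g_2 = 5/3$, $f_2 = b g_3 / (a g_2)$
$g_3 = 5/3$, $f_3 = c g_4 / (b g_3)$
$g_4 = 1$, $f_4 = 5 / (c g_4)$
The stage fan-out is $f_i$; a, b, c are scale factors comparing a gate size to the minimum-size gate with the same speed as the inverter.
Effective fanout (path electrical effort): $F = C_{out}/C_{in} = 5$.
To find: path logical effort $G = g_1 g_2 \cdots g_N$, path effort $H = GFB$, stage effort $h_i = g_i f_i$, and the scale factors a, b, c.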

Example: Optimize Path
$g_1 = 1$, $f_1 = a g_2 / g_1$; $g_2 = 5/3$, $f_2 = b g_3 / (a g_2)$; $g_3 = 5/3$, $f_3 = c g_4 / (b g_3)$; $g_4 = 1$, $f_4 = 5 / (c g_4)$.
The stage fan-out is $f_i$; a, b, c are scale factors comparing a gate size to the minimum-size gate with the same speed as the inverter.
Effective fanout: $F = 5$
$G = 25/9$
$H = GFB = 125/9$ (there is no branching here, so $B = 1$)
$h = H^{1/4} = 1.93$ (the optimum effort for each gate)
$a = h / g_2$ (from $h = f_1 g_1 = 1.93$ and $f_1 = a g_2 / g_1$)
$b = h a / g_3$ (from $h = f_2 g_2 = 1.93$ and $f_2 = b g_3 / (a g_2)$)
$c = h b / g_4$ (from $h = f_3 g_3 = 1.93$ and $f_3 = c g_4 / (b g_3)$)
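The arithmetic of this example can be reproduced in a few lines. The sketch below follows the slide's formulas for G, H, h, a, b, and c; the variable names are illustrative, and the final check uses the fourth stage, $f_4 = 5/(c g_4)$, which must also come out with effort h.

```python
import math
from fractions import Fraction

g = [Fraction(1), Fraction(5, 3), Fraction(5, 3), Fraction(1)]  # g1..g4 from the slide
F = 5   # path electrical effort (effective fanout)
B = 1   # no branching in this path

G = math.prod(g)         # path logical effort, 25/9
H = G * F * B            # path effort, 125/9
h = float(H) ** 0.25     # optimum stage effort, about 1.93

a = h / float(g[1])      # from h = f1*g1 and f1 = a*g2/g1 (g1 = 1)
b = h * a / float(g[2])  # from h = f2*g2 and f2 = b*g3/(a*g2)
c = h * b / float(g[3])  # from h = f3*g3 and f3 = c*g4/(b*g3)

# Consistency check on the last stage: h should equal f4*g4 = 5/c.
h_last_stage = F / c

print(f"G = {G}, H = {H}, h = {h:.2f}")
print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}, last-stage effort = {h_last_stage:.2f}")
```

The scale factors come out near 1.16, 1.34, and 2.59, and the last stage sees the same effort of about 1.93 as the others.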

Method of Logical Effort
Compute the path effort: $H = GBF$.
Find the best number of stages: $N \approx \log_4 H$.
Compute the stage effort: $h = H^{1/N}$.
Sketch the path with this number of stages.
Work from either end to find the sizes: $C_{in} = C_{out} \cdot g / h$.
Reference: Sutherland, Sproull, Harris, "Logical Effort", Morgan Kaufmann, 1999.
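The steps of the method can be sketched as two small helpers: one estimates the stage count from $N \approx \log_4 H$, the other works backward from the output applying $C_{in} = C_{out} \cdot g / h$. This is a minimal sketch that assumes no branching (B = 1); the function names and example capacitances are hypothetical.

```python
import math


def best_stage_count(path_effort):
    """Estimate the number of stages as N ~ log4(H), with at least one stage."""
    return max(1, round(math.log(path_effort, 4)))


def size_path(gs, c_in, c_out):
    """Size a path with fixed stage logical efforts gs, assuming B = 1.

    Returns (H, h, caps) where caps[i] is the input capacitance of stage i,
    found by working backward from the output with Cin = Cout * g / h.
    """
    n = len(gs)
    path_effort = math.prod(gs) * (c_out / c_in)
    h = path_effort ** (1.0 / n)
    caps = [0.0] * n
    load = c_out
    for i in reversed(range(n)):
        caps[i] = load * gs[i] / h
        load = caps[i]
    return path_effort, h, caps


# Assumed example: four stages with logical efforts 1, 5/3, 5/3, 1 driving 5x the input load.
H_path, h_stage, stage_caps = size_path([1.0, 5 / 3, 5 / 3, 1.0], c_in=1.0, c_out=5.0)
print(round(H_path, 2), round(h_stage, 2), [round(c, 2) for c in stage_caps])
print(best_stage_count(256))
```

A useful sanity check is that the capacitance computed for the first stage lands back on the given input capacitance.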

Ratio Based Logic
[Figure: three ratioed gates, each with a pull-down network (PDN) driven by In1, In2, In3 between output F and VSS: (a) resistive load RL, (b) depletion-load NMOS (VT < 0), (c) pseudo-NMOS (PMOS load).]
Goal: to reduce the number of devices over complementary CMOS.

Ratio Based Logic
[Figure: resistive-load gate with load resistor RL between VDD and output F, and a pull-down network (PDN) with inputs In1, In2, In3 between F and VSS.]
N transistors + load
• $V_{OH} = V_{DD}$
• $V_{OL} = \frac{R_{PDN}}{R_{PDN} + R_L} V_{DD}$
• Asymmetrical response
• Static power consumption
• $t_{pLH} = 0.69 R_L C_L$
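To see why the output low level and static power are tied to the resistance ratio, the sketch below models the gate with the output low as a plain resistive divider between the load and the conducting pull-down network. The divider model and all numeric values are assumptions for illustration.

```python
def ratioed_low_state(v_dd, r_pdn, r_load):
    """Resistive-divider model of a ratioed gate with its pull-down network on.

    Returns (V_OL, static power) for load resistance r_load and
    equivalent pull-down resistance r_pdn.
    """
    v_ol = v_dd * r_pdn / (r_pdn + r_load)
    p_static = v_dd ** 2 / (r_pdn + r_load)
    return v_ol, p_static


def t_plh_resistive(r_load, c_load):
    """Low-to-high delay of the resistive load: 0.69 * R_L * C_L."""
    return 0.69 * r_load * c_load


# Assumed values: 2.5 V supply, 10 kOhm pull-down, 40 kOhm load, 100 fF load capacitance.
v_ol, p_static = ratioed_low_state(2.5, 10e3, 40e3)
print(f"V_OL = {v_ol:.2f} V, static power = {p_static * 1e6:.0f} uW")
print(f"t_pLH = {t_plh_resistive(40e3, 100e-15) * 1e9:.2f} ns")
```

Raising the load resistance lowers both the output low level and the static power, but lengthens the low-to-high delay, which is the trade-off behind the asymmetrical response.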

Ratio Based Logic Problems
Problems with Resistive Load
$I_L = (V_{DD} - V_{out}) / R_L$: the charging current drops rapidly once $V_{out}$ starts to rise.
Solution: use a current source. The available current is independent of voltage, which reduces $t_{pLH}$ by 25%.
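A rough comparison shows where a reduction of this size comes from. The sketch below is an idealized model, not the slide's derivation: it assumes the current source supplies the resistor's initial current $V_{DD}/R_L$ and that both loads charge the same capacitance to $V_{DD}/2$. All numeric values are illustrative.

```python
import math


def t_plh_resistor(r_load, c_load):
    """Time for an RC charge from 0 to VDD/2: ln(2) * R * C (about 0.69 RC)."""
    return math.log(2) * r_load * c_load


def t_plh_current_source(i_source, c_load, v_dd):
    """Time for a constant current to charge C from 0 to VDD/2."""
    return c_load * (v_dd / 2) / i_source


# Assumed values: 2.5 V supply, 40 kOhm load, 100 fF capacitance.
V_DD, R_LOAD, C_LOAD = 2.5, 40e3, 100e-15
t_res = t_plh_resistor(R_LOAD, C_LOAD)
t_src = t_plh_current_source(V_DD / R_LOAD, C_LOAD, V_DD)
reduction = 1 - t_src / t_res

print(f"resistor: {t_res * 1e9:.2f} ns, current source: {t_src * 1e9:.2f} ns")
print(f"reduction: {reduction * 100:.0f}%")
```

In this idealized case the improvement is a little under 30%, in the same range as the 25% quoted on the slide; a real load deviates from an ideal current source, so the exact figure depends on the device.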

[Figure: active loads above a pull-down network (PDN) with inputs In1, In2, In3 and output F: depletion-load NMOS (VT < 0) and pseudo-NMOS (PMOS load).]

Depletion-load NMOS: $I_L \approx \frac{k_{n,load}}{2} |V_{Tn}|^2$.
It deviates from an ideal current source because of channel length modulation and the body effect: $V_{SB}$ varies with $V_{out}$, which reduces $|V_{Tn}|$, so $I_L$ gets smaller as $V_{out}$ increases.

Pseudo-NMOS load: $V_{GS} = -V_{DD}$, which gives a higher load current, $I_L = \frac{k_p}{2} (V_{DD} - |V_{Tp}|)^2$.
The larger $V_{GS}$ causes the pseudo-NMOS load to leave saturation sooner than the depletion-load NMOS.

Pseudo-NMOS

Pseudo-NMOS VTC
[Figure: voltage transfer characteristic, Vout [V] versus Vin [V] (0.0 to 3.0 V), for PMOS load sizes W/Lp = 4, 2, 1, 0.5, and 0.25.]
The low noise margin is significantly reduced compared to CMOS.

Pseudo-NMOS NAND Gate
[Figure: pseudo-NMOS NAND gate between VDD and GND with output Out.]

Improved Loads (1) For fast low-to-high transition in standby circuits

Improved Loads (2): Differential Cascode Voltage Switch Logic (DCVSL)
[Figure: load transistors M1 and M2 between VDD and the two outputs, above pull-down networks PDN1 and PDN2 driven by inputs A and B and their complements.]
DCVSL has no static current.
It requires that each gate generate both Out and its complement.

DCVSL Example

DCVSL Transient Response
[Figure: DCVSL transient response of an AND/NAND gate, voltage [V] (−0.5 to 2.5) versus time [ns] (0 to 1.0), with traces for the inputs A, B and the two outputs.]

Pass-Transistor Logic

Example: AND Gate

NMOS-Only Logic
[Figure: NMOS-only pass-transistor circuit with input In, internal node x, and output Out; transient response, voltage [V] (0.0 to 3.0) versus time [ns] (0 to 2).]

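NMOS-only Switch
[Figure: NMOS pass transistor Mn passing A = 2.5 V to node B, which drives load CL.]
$V_B$ does not pull up to 2.5 V, but only to $2.5\,\mathrm{V} - V_{TN}$.
The threshold voltage loss causes static power consumption in the gate that node B drives, because its PMOS is not fully turned off.
The NMOS has a higher threshold than the PMOS (body effect).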
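Because of the body effect, the threshold that limits the high level itself depends on that level, so $V_B = V_{DD} - V_{TN}(V_B)$ has to be solved self-consistently. The sketch below uses a fixed-point iteration with the standard body-effect expression $V_T = V_{T0} + \gamma(\sqrt{2\phi_F + V_{SB}} - \sqrt{2\phi_F})$; the device parameters are assumed, illustrative values and do not come from the slides.

```python
import math


def pass_transistor_high_level(v_dd, vt0=0.43, gamma=0.4, two_phi_f=0.6, iterations=50):
    """Solve V_B = V_DD - V_T(V_B) for an NMOS pass transistor by fixed-point iteration.

    vt0, gamma, and two_phi_f are assumed example parameters (V, V^0.5, V).
    The source of the pass transistor sits at V_B, so V_SB = V_B.
    """
    v_b = v_dd - vt0  # starting guess: ignore the body effect
    v_t = vt0
    for _ in range(iterations):
        v_t = vt0 + gamma * (math.sqrt(two_phi_f + v_b) - math.sqrt(two_phi_f))
        v_b = v_dd - v_t
    return v_b, v_t


v_high, v_tn = pass_transistor_high_level(2.5)
print(f"V_B settles near {v_high:.2f} V with V_TN near {v_tn:.2f} V")
```

With these assumed parameters the body effect raises the threshold well above its zero-bias value, so the passed high level falls noticeably further below the supply than $V_{DD} - V_{T0}$ would suggest.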

Pass-Transistor Logic - Solution 1: Level Restoring Transistor
[Figure: pass transistor Mn (input A, gate B) drives node X into an inverter (M1, M2) with output Out; a weak level-restorer transistor Mr connects X to VDD.]
• Advantages: full swing, no static power dissipation
• The restorer adds capacitance and takes away pull-down current at X
• Ratio problem

Restorer Transistor Sizing
[Figure: transient response, voltage [V] (0.0 to 3.0) versus time [ps] (0 to 500), for restorer sizes W/Lr = 1.0/0.25, 1.25/0.25, 1.50/0.25, and 1.75/0.25.]
The level restoring transistor cannot be too strong; otherwise it will prevent the output from reaching $V_{DD}$.
This sets an upper limit on the restorer size.
The pass-transistor pull-down can have several transistors in the stack.

Pass-Transistor Logic - Solution 2: Single Transistor Pass Gate with VT=0
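[Figure: pass-transistor network built from $V_T = 0$ devices, with signal levels 2.5 V and 0 V and output Out.]
Watch out for leakage currents: if the pass transistors have $V_T = 0$, the output does not require a level restorer, but there is a leakage current.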

Complementary Pass Transistor Logic

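Pass-Transistor Logic - Solution 3: Transmission Gate
[Figure: transmission gate between A and B controlled by C and its complement; example with A = 2.5 V, C = 2.5 V, complement of C = 0 V, driving load CL at B.]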

Resistance of Transmission Gate

Transmission Gate Based Multiplexer
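[Figure: transmission-gate multiplexer between VDD and GND with data inputs In1 and In2 and select signal S and its complement.]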

Transmission Gate Based XOR
[Figure: transmission-gate XOR with output F, transistors M1 and M3/M4, and input B with its complement.]

Propagate signal. Similar delays for sum and carry. 24 transistors.