Jan 7, 2010 Agrawal: Low Power CMOS Design
Vishwani D. Agrawal, James J. Danaher Professor, ECE Dept., Auburn University, Auburn, AL
rd International Conference on VLSI Design, Education Forum, January 7, 2010, Bangalore, India
F. M. Wanlass and C. T. Sah, “Nanowatt Logic using Field-Effect Metal-Oxide-Semiconductor Triodes,” IEEE International Solid-State Circuits Conference Digest, vol. IV, February 1963. No static leakage path exists for either 1 or 0 input.
[Figure: CMOS gate between V_DD and ground driving load capacitance C_L, with input V_i, output V_o, resistances R and short-circuit current i_sc.] Dynamic power = C_L V_DD²/2 + P_sc. Static power = V_DD I_leakage.
Why is it a concern?
“Ten years from now, microprocessors will run at 10GHz to 30GHz and be capable of processing 1 trillion operations per second – about the same number of calculations that the world's fastest supercomputer can perform now.” “Unfortunately, if nothing changes these chips will produce as much heat, for their proportional size, as a nuclear reactor....” Patrick P. Gelsinger, Senior Vice President, General Manager, Digital Enterprise Group, Intel Corp.
[Figure: power density (W/cm²) versus year for Intel processors, including Pentium®, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle and the Sun’s surface. Source: Intel]
Design practices that reduce power consumption by at least one order of magnitude; in practice, a 50% reduction is often acceptable. Low-power design methods: algorithms and architectures; high-level and software techniques; gate and circuit-level methods; test power.
Dynamic power: signal transitions (logic activity, glitches); short-circuit. Static power: leakage. P_total = P_dyn + P_stat = P_tran + P_sc + P_stat (then) = P_tran + P_sc + P_stat (now).
Each transition of a gate consumes CV²/2. Methods of power saving: minimize load capacitances (transistor sizing); reduce transitions (logic design, glitch reduction).
Design a digital circuit for minimum transient energy consumption by eliminating hazards. Total transitions = 6. Essential transitions = 2. Glitch transitions = 4.
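As a rough illustration of the CV²/2 accounting, the sketch below applies it to the transition counts on this slide (6 total, 2 essential). The load capacitance (10 fF) and supply voltage (1 V) are assumed values picked only for the example; they do not come from the slides.

```python
def transition_energy(c_load, v_dd):
    """Energy in joules consumed by one gate output transition: C * V^2 / 2."""
    return 0.5 * c_load * v_dd ** 2


# Assumed (hypothetical) electrical values, not taken from the slides.
C_LOAD = 10e-15  # farads
V_DD = 1.0       # volts

# Transition counts quoted on the slide.
TOTAL_TRANSITIONS = 6
ESSENTIAL_TRANSITIONS = 2

energy_per_transition = transition_energy(C_LOAD, V_DD)
total_energy = TOTAL_TRANSITIONS * energy_per_transition
essential_energy = ESSENTIAL_TRANSITIONS * energy_per_transition

# Fraction of transient energy that hazard elimination would remove.
glitch_fraction = (total_energy - essential_energy) / total_energy

print(f"energy per transition: {energy_per_transition:.2e} J")
print(f"energy spent on glitches: {glitch_fraction:.1%}")
```

Because every transition costs the same CV²/2 under this model, the glitch share depends only on the transition counts, not on the assumed C and V.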
[Figure: timing diagram for a gate with inputs A, B and output C, gate delay D < DPD; a hazard or glitch appears at the output.] DPD: differential path delay.
[Figure: timing diagram for the same gate, delay D < DPD, with a delay buffer inserted; no glitch at the output.]
[Figure: timing diagram for the same gate with delay D > DPD; the glitch is filtered.]
Maintain specified critical path delay. Glitches are suppressed at all gates by: path delay balancing; glitch filtering by increasing inertial delay of gates or by inserting delay buffers when necessary. A linear program optimally combines all objectives. [Figure: a gate with delay D whose two input paths have delays d1 and d2.] Minimum transient energy condition: |d1 – d2| < D.
Variables: gate and buffer delays, arrival time variables. Objective: minimize number of delay buffers. Subject to: overall circuit delay constraint for all input-output paths; minimum transient energy condition for all multi-input gates. Reference: T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic for Low Power Design,” IEEE Trans. CAD, vol. 17, no. 10, Oct.
[Figure: example circuit annotated with its critical path delay.]
Gate delay variables: d4 ... d12. Buffer delay variables: d15 ... d29. Arrival time variables: (earliest) t4 ... t29; (longest) T4 ... T29.
For gate 7: T7 ≥ T5 + d7, T7 ≥ T6 + d7; t7 ≤ t5 + d7, t7 ≤ t6 + d7; d7 > T7 - t7.
Jan 7, 2010Agrawal: Low Power CMOS Design19 T 16 + d 19 = T 19 t 16 + d 19 = t 19 Buffer 19:
T11 ≤ maxdelay, T12 ≤ maxdelay, where maxdelay is specified.
Need to minimize the number of buffers. Because that leads to a nonlinear objective function, we use an approximate criterion: minimize ∑ (all buffer delays), i.e., minimize d15 + d16 + ∙∙∙ + d29. This gives near-optimum results.
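The constraints above can be checked on a toy netlist. The sketch below is not the linear program itself: it only propagates earliest (t) and longest (T) arrival times and tests the condition d > T - t at each multi-input gate, then reports the sum of buffer delays that the approximate objective would minimize. The two-gate circuit, its delay values and all names (`g1`, `g2`, `buf`) are hypothetical.

```python
def timing_windows(fanin, delay, primary_inputs):
    """Earliest (t) and longest (T) arrival times; fanin must be in topological order."""
    t = {pi: 0.0 for pi in primary_inputs}
    T = {pi: 0.0 for pi in primary_inputs}
    for gate, inputs in fanin.items():
        t[gate] = min(t[i] for i in inputs) + delay[gate]
        T[gate] = max(T[i] for i in inputs) + delay[gate]
    return t, T


def glitch_prone_gates(fanin, delay, primary_inputs):
    """Multi-input gates that violate the condition d > T - t."""
    t, T = timing_windows(fanin, delay, primary_inputs)
    return [g for g, inputs in fanin.items()
            if len(inputs) > 1 and not delay[g] > T[g] - t[g]]


INPUTS = ["a", "b"]

# Hypothetical unbalanced circuit: input a passes through g1 before reaching g2.
fanin_before = {"g1": ["a"], "g2": ["g1", "b"]}
delay_before = {"g1": 2.0, "g2": 1.0}

# Same circuit with a delay buffer inserted on input b.
fanin_after = {"g1": ["a"], "buf": ["b"], "g2": ["g1", "buf"]}
delay_after = {"g1": 2.0, "buf": 1.5, "g2": 1.0}

before = glitch_prone_gates(fanin_before, delay_before, INPUTS)
after = glitch_prone_gates(fanin_after, delay_after, INPUTS)
_, T_before = timing_windows(fanin_before, delay_before, INPUTS)
_, T_after = timing_windows(fanin_after, delay_after, INPUTS)
buffer_delay_sum = delay_after["buf"]  # the quantity the approximate objective minimizes

print("glitch-prone before:", before)
print("glitch-prone after: ", after)
print("longest arrival at g2:", T_before["g2"], "->", T_after["g2"])
print("sum of buffer delays:", buffer_delay_sum)
```

In this toy case the buffer narrows the arrival window at `g2` from 2.0 to 0.5 time units, below the gate delay of 1.0, without lengthening the critical path.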
[Figures: three versions of the example circuit, each annotated with its critical path delay.]
Power saving: average 58%, peak 68%.
Dynamic power: signal transitions (logic activity, glitches); short-circuit. Static power: leakage.
65 nm CMOS technology: low-threshold transistors have gate delay 5 ps and leakage current 10 nA; high-threshold transistors have gate delay 12 ps and leakage current 1 nA. Minimize leakage current without increasing critical path delay. What is the percentage reduction in leakage power? What will the leakage power reduction be if a 30% increase in critical path delay is allowed?
Reduction in leakage power = 1 – (4×1 + 7×10)/(11×10) = 32.73%. Critical path delay = 25 ps. [Figure: example circuit with each gate marked as 5 ps or 12 ps.]
Several solutions are possible. Notice that any 3-gate path can have 2 high-threshold gates. Four- and five-gate paths can have only one high-threshold gate. One solution is shown in the figure below, where six high-threshold gates are shown with shading and the critical path is shown by a dashed red arrow. Reduction in leakage power = 1 – (6×1 + 5×10)/(11×10) = 49.09%. Critical path delay = 29 ps. [Figure: example circuit with each gate marked as 12 ps or 5 ps.]
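The two leakage figures follow from the same formula, with 11 gates in total and leakage of 10 nA (low threshold) or 1 nA (high threshold) per gate. A small sketch of that arithmetic, using a hypothetical helper name `leakage_reduction`:

```python
def leakage_reduction(n_high, n_total, i_low=10.0, i_high=1.0):
    """Fractional leakage saving versus an all-low-threshold design.

    n_high gates use high-threshold transistors (i_high nA each);
    the remaining gates use low-threshold transistors (i_low nA each).
    """
    n_low = n_total - n_high
    return 1.0 - (n_high * i_high + n_low * i_low) / (n_total * i_low)


# No increase in critical path delay: 4 of 11 gates can be high threshold.
print(f"{leakage_reduction(4, 11):.2%}")  # 32.73%

# 30% increase in critical path delay allowed: 6 of 11 gates can be high threshold.
print(f"{leakage_reduction(6, 11):.2%}")  # 49.09%
```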
Assign every gate i an integer [0,1] variable Xi. Define ILP constraints for critical path delay. Define objective function to minimize total leakage. Let ILP find values of the Xi’s: if Xi = 1, assign low threshold to gate i; if Xi = 0, assign high threshold to gate i.
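For a very small circuit, the same formulation can be explored by exhaustive search instead of an ILP solver. The sketch below swaps in brute-force enumeration of all Xi assignments, which is practical only for a handful of gates. The five-gate netlist is hypothetical; the delay and leakage values are those of the 65 nm example.

```python
from itertools import product

DELAY_PS = {1: 5, 0: 12}    # Xi = 1: low threshold, Xi = 0: high threshold
LEAKAGE_NA = {1: 10, 0: 1}

# Hypothetical netlist, listed in topological order: gate -> fan-in gates.
FANIN = {
    "g1": [],
    "g2": [],
    "g3": ["g1"],
    "g4": ["g3"],
    "g5": ["g4", "g2"],
}


def critical_path_delay(fanin, x):
    """Longest arrival time over all gates for threshold assignment x."""
    arrival = {}
    for gate, inputs in fanin.items():
        latest_input = max((arrival[i] for i in inputs), default=0)
        arrival[gate] = latest_input + DELAY_PS[x[gate]]
    return max(arrival.values())


def min_leakage_assignment(fanin, max_delay_ps):
    """Try every 0/1 assignment and keep the feasible one with least leakage."""
    gates = list(fanin)
    best = None
    for bits in product((0, 1), repeat=len(gates)):
        x = dict(zip(gates, bits))
        if critical_path_delay(fanin, x) > max_delay_ps:
            continue  # violates the critical path delay constraint
        leakage = sum(LEAKAGE_NA[x[g]] for g in gates)
        if best is None or leakage < best[0]:
            best = (leakage, x)
    return best


# All-low-threshold critical path is g1-g3-g4-g5 = 20 ps; allow no increase.
leakage, assignment = min_leakage_assignment(FANIN, max_delay_ps=20)
print(leakage, assignment)
```

In this toy netlist only `g2`, which sits on a short path, can move to high threshold without stretching the 20 ps critical path.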
Leakage exceeds dynamic power. Y. Lu and V. D. Agrawal, “CMOS Leakage and Glitch Minimization for Power-Performance Tradeoff,” Journal of Low Power Electronics (JOLPE), vol. 2, no. 3, December 2006.
[Figure: a datapath with registers R1, R2, R3, R4 and modules M1, M2.]
[Figure: M1 and M2 tested with LFSR1, LFSR2 as pattern generators and MISR1, MISR2 as response analyzers; plot of test power versus test time.] T1: test for M1. T2: test for M2.
[Figure: M1 and M2 tested with R1, LFSR2, MISR1 and MISR2; plot of test power versus test time.] T1: test for M1. T2: test for M2.
Test resources: Typically registers and multiplexers that can be reconfigured as test pattern generators (e.g., LFSR) or as output response analyzers (e.g., MISR). Test resources (R1,...) and tests (T1,...) are identified for the system to be tested. Each test is characterized for test time, power dissipation and resources it requires.
[Figure: tests T1 ... T6 and the test resources R1 ... R9 they use.] Reference: R. M. Chou, K. K. Saluja and V. D. Agrawal, “Scheduling Tests for VLSI Systems Under Power Constraints,” IEEE Trans. VLSI Systems, vol. 5, no. 2, June 1997.
[Figure: graph of tests, each node labeled (power, test time).] T1 (2, 100), T2 (1, 10), T3 (1, 10), T4 (1, 5), T5 (2, 10), T6 (1, 100). Tests that form a clique can be performed concurrently (a test session). Pmax = 4.
Clique no. i   Test session   Test length, Li   Power, Pi
1              T1, T3, T…     …                 …
2              T1, T3, T…     …                 …
3              T1, T6         …                 …
4              T1, T…         …                 …
5              T1, T…         …                 …
6              T1, T…         …                 …
7              T2, T…         …                 …
8              T2, T5         …                 …
9              T3, T…         …                 …
10             T3, T4         …                 …
11             T…             …                 …
12             T…             …                 …
13             T…             …                 …
14             T4             5                 1
15             T…             …                 …
16             T6             100               1
For each clique (test session) i, define: integer variable xi = 1 if the test session is selected, xi = 0 if it is not selected; constants Li = test length, Pi = power. Constraints to cover all tests: T1 is covered if x1 + x2 + x3 + x4 + x5 + x6 + x11 ≥ 1, with a similar constraint for each test Tk. Constraints for power: Pi × xi ≤ Pmax.
Objective function: minimize Σ Li × xi over all cliques. Solution: x3 = x8 = x10 = 1, all other xi’s are 0. Test session 3 includes T1 and T6. Test session 8 includes T2 and T5. Test session 10 includes T3 and T4. Test length = L3 + L8 + L10 = 120. Peak power = max {P3, P8, P10} = 3 (Pmax = 4).
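The covering formulation can be reproduced on this six-test example by exhaustive search, which stands in here for an ILP solver. The (power, test time) pairs and Pmax = 4 are the values given for T1 to T6. The compatibility edges are an assumption, chosen to agree with the sessions named in the text. The session length is taken as the longest test in the session and the session power as the sum of its test powers, which is also an assumption.

```python
from itertools import combinations

# (power, test time) per test, as given for T1..T6.
TESTS = {"T1": (2, 100), "T2": (1, 10), "T3": (1, 10),
         "T4": (1, 5), "T5": (2, 10), "T6": (1, 100)}
PMAX = 4

# ASSUMPTION: pairs of tests that can run concurrently (not stated in the text).
COMPATIBLE = {("T1", "T3"), ("T1", "T4"), ("T1", "T5"), ("T1", "T6"),
              ("T2", "T5"), ("T2", "T6"), ("T3", "T4"), ("T3", "T5")}


def candidate_sessions():
    """All cliques of compatible tests whose summed power stays within PMAX."""
    names = sorted(TESTS)
    sessions = []
    for size in range(1, len(names) + 1):
        for group in combinations(names, size):
            if not all(p in COMPATIBLE for p in combinations(group, 2)):
                continue  # not a clique
            power = sum(TESTS[t][0] for t in group)
            length = max(TESTS[t][1] for t in group)
            if power <= PMAX:
                sessions.append((group, length, power))
    return sessions


def best_schedule():
    """Choose sessions covering every test with minimum total test length."""
    sessions = candidate_sessions()
    best = None
    for k in range(1, len(sessions) + 1):
        for chosen in combinations(sessions, k):
            covered = {t for group, _, _ in chosen for t in group}
            if covered != set(TESTS):
                continue  # some test is left uncovered
            total = sum(length for _, length, _ in chosen)
            if best is None or total < best[0]:
                best = (total, chosen)
    return best


total_length, schedule = best_schedule()
peak_power = max(power for _, _, power in schedule)
print(total_length, peak_power, [group for group, _, _ in schedule])
```

Under these assumptions the search selects the sessions {T1, T6}, {T2, T5} and {T3, T4}, with total test length 120 and peak power 3, matching the solution stated above.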
Underlying theme in our research – use of mathematical optimization methods for power reduction at gate level: dynamic power; leakage power; power minimization under process variation; test power. Other research: min-max power estimation; architecture-level power management; software, instruction set; multicore.
T. Raja, MS 2002, PhD 2004 (NVIDIA); S. Uppalapati, MS 2004 (Intel); F. Hu, PhD 2006 (Intel); Y. Lu, PhD 2007 (Intel); J. D. Alexander, MS 2008 (Texas Instruments); K. Sheth, MS 2008; M. Allani, PhD; J. Yao, PhD; K. Kim, PhD; M. Kulkarni, MS.
Dissertations: Papers: