NTHU-CS VLSI/CAD LAB TH EDA Student : Da-Cheng Juan Advisor : Shih-Chieh Chang Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization.


1 Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization
Student: Da-Cheng Juan
Advisor: Shih-Chieh Chang
NTHU-CS VLSI/CAD LAB, TH EDA

2 Outline
Sleep Transistor Sizing Problem
Maximum Instantaneous Current Estimation
Time-Frame Partitioning for Sizing
Experimental Results
Conclusions

3 Trend of Low Power Designs
Leakage increases exponentially
– reaches more than 50% of total power in 65nm technology.
Low power design is a must-have, not an option.
Power dissipation
– Active power (active mode)
– Leakage power (sleep mode)
[Figure: transistor with gate, source, and drain, showing sub-threshold leakage]

4 Power Gating: reduce leakage
– One of the most effective ways to reduce leakage.
[Figure: low-V_th logic devices between VDD and VGND; a high-V_th sleep transistor, controlled by SL, sits between VGND and GND to reduce the leakage current]
Mode    SL  Sleep Transistor
Active  0   ON
Sleep   1   OFF

5 Implementation of Power Gating
Distributed Sleep Transistor Network (DSTN)
[Figure: DSTN with clusters C_1, C_2, C_3 of low-V_th logic devices; labels VDD, VGND, SL]

6 Leakage Saving
In sleep mode:
– Leakage: proportional to the ST's size.
– Small ST to reduce leakage.
[Figure: leakage current I_leakage; labels VDD, VGND]

7 Voltage Drop across Sleep Transistor
In active mode:
– Voltage drop across a ST degrades the performance.
– Voltage drop: inversely proportional to the ST's size.
– Large ST to bound the voltage drop.
[Figure: voltage drop V_ST across the ST; labels VDD, VGND]

8 Sleep Transistor (ST) Sizing
Dilemma scenario:
– Small ST to reduce leakage. (sleep mode)
– Large ST to bound the voltage drop. (active mode)
Objective: minimize ST size (leakage) under a specified voltage-drop constraint, V_ST*.
[Figure: voltage drop V_ST compared against the constraint V_ST*; labels VDD, VGND]

9 Voltage Drop Estimation with MIC
Maximum Instantaneous Current (MIC) through a ST
– determines the worst case IR drop.
Estimating the upper bound of MIC(ST)
– to size ST properly to meet the voltage-drop constraint.
MIC(ST): MIC across a ST.
[Figure: DSTN with clusters C_1, C_2, C_3 between VDD and VGND, showing MIC(ST_1), MIC(ST_2), MIC(ST_3)]

10 Voltage Drop Estimation with MIC
MIC(C) (MIC of a cluster) is easy to measure.
Due to the current balancing effect
– MIC(ST) (MIC through a ST) is hard to predict.
Finding the MIC of a cluster is fast. Finding the MIC across a ST is time-consuming.
[Figure: DSTN with clusters C_1, C_2, C_3 between VDD and VGND, showing MIC(C_1) and MIC(ST_1), MIC(ST_2), MIC(ST_3)]

11 Temporal Perspective of Cluster's MIC
Conventional way
– ST sizes are determined with the MIC of the entire clock period.
[Figure: MIC(C_i) current waveforms of Cluster 1 and Cluster 2 over one clock cycle (time unit: 10 ps); MIC(C_1) occurs at T_6 and MIC(C_2) occurs at T_9]

12 Temporal Perspective of Cluster's MIC
Smaller time frames lead to:
– a more accurate MIC estimation.
– but higher computational complexity.
[Figure: MIC(C_i) waveforms of Cluster 1 and Cluster 2 over one clock cycle; current (mA) versus time (unit: 10 ps)]

13 Difficulties
Current balancing effect complicates the sizing problem.
Time-frame partitioning leads to high computational complexity.
[Figure: MIC over one clock cycle]

14 Contributions
More accurate MIC prediction from a temporal perspective.
Variable-length partitioning to reduce computational complexity.
Algorithm to minimize the sizes of sleep transistors.
Achieving 21% area reduction in total sleep transistor sizes compared with [2].
- [2] Chiou et al. DAC’06

15 Outline
Sleep Transistor Sizing Problem
Maximum Instantaneous Current Estimation
Time-Frame Partitioning for Sizing
Experimental Results
Conclusions

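16 Resistance Network
[Figure: resistance network on VGND. Cluster currents I(C_1), I(C_2), I(C_3) enter the VGND nodes of clusters C_1, C_2, C_3; the sleep transistors, modeled as R(ST_1), R(ST_2), R(ST_3), carry I(ST_1), I(ST_2), I(ST_3); adjacent VGND nodes are connected by resistances R_V]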

17 Discharging Ratio
The discharging ratio can be calculated by
– Kirchhoff's Current Law
– Ohm's Law
[Figure: example network with clusters C_1, C_2, C_3 on VGND; sleep transistor resistances 9 Ω, 8 Ω, 10 Ω and VGND resistances 2 Ω, 2 Ω. The current I(C_1) splits into 0.43 * I(C_1), 0.34 * I(C_1), and 0.23 * I(C_1) through the three sleep transistors]
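
A minimal sketch of how the discharging ratios could be computed with nodal analysis, assuming the three sleep transistors are 9 Ω, 8 Ω, and 10 Ω and the two VGND segments are 2 Ω each (this assignment reproduces the 0.43 / 0.34 / 0.23 split on the slide). The function names and the chain topology are assumptions made for illustration, not the authors' implementation.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def discharging_matrix(r_st, r_v):
    """psi[i][j] = share of cluster j's current that flows through ST i.

    r_st[i]: resistance of sleep transistor i (VGND node i to ground).
    r_v[i]:  VGND resistance between node i and node i + 1 (assumed chain).
    """
    n = len(r_st)
    # Nodal conductance matrix (Kirchhoff's current law at each VGND node).
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        g[i][i] += 1.0 / r_st[i]
    for i in range(n - 1):
        gv = 1.0 / r_v[i]
        g[i][i] += gv
        g[i + 1][i + 1] += gv
        g[i][i + 1] -= gv
        g[i + 1][i] -= gv
    psi = [[0.0] * n for _ in range(n)]
    for j in range(n):
        inject = [1.0 if k == j else 0.0 for k in range(n)]
        v = solve(g, inject)              # node voltages for 1 A at node j
        for i in range(n):
            psi[i][j] = v[i] / r_st[i]    # Ohm's law: I = V / R
    return psi


psi = discharging_matrix([9.0, 8.0, 10.0], [2.0, 2.0])
print([round(psi[i][0], 2) for i in range(3)])  # [0.43, 0.34, 0.23]
```

Each column of psi sums to 1, since all of a cluster's current must leave through some sleep transistor.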

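18 Discharging Matrix Ψ (psi)
[I(ST_1), I(ST_2), I(ST_3)]^T = Ψ * [I(C_1), I(C_2), I(C_3)]^T
[Equation image: definition of the entries of Ψ]
[Figure: clusters C_1, C_2, C_3 on VGND with cluster currents I(C_1), I(C_2), I(C_3) and sleep transistor currents I(ST_1), I(ST_2), I(ST_3)]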

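19 MIC(ST) Estimation Mechanism
[MIC(ST_1), MIC(ST_2), MIC(ST_3)]^T = Ψ * [MIC(C_1), MIC(C_2), MIC(C_3)]^T
[Equation image: definition of the entries of Ψ]
[Figure: clusters C_1, C_2, C_3 with MIC(C_1), MIC(C_2), MIC(C_3) and MIC(ST_1), MIC(ST_2), MIC(ST_3)]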

20 Outline
Sleep Transistor Sizing Problem
Maximum Instantaneous Current Estimation
Time-Frame Partitioning for Sizing
Experimental Results
Conclusions

21 Temporal Perspective of Cluster's MIC
Different MIC(C_i) occur at different time points.
[Figure: MIC(C_i) current waveforms of Cluster 1 and Cluster 2 over one clock cycle (time unit: 10 ps); MIC(C_1) occurs at T_6 and MIC(C_2) occurs at T_9]

22 Temporal Perspective of Cluster's MIC
Different MIC(C_i) occur at different time points within a clock period.
The traditional way to estimate MIC(ST_i) is overly pessimistic: it is over-estimated!

23 Time-Frame Partitioning for MIC(ST) Estimation
Expand MIC(C_i) into MIC(C_i, T_j).
[Figure: MIC(C_i, T_j) current waveforms of Cluster 1 and Cluster 2 over one clock cycle, split into time frames; marked values include MIC(C_1, T_1), MIC(C_2, T_1), MIC(C_1, T_3), MIC(C_2, T_3), MIC(C_1, T_6), MIC(C_2, T_6)]

24 Time-Frame Partitioning for MIC(ST) Estimation
For each time frame T_j, use MIC(C_i, T_j) to obtain MIC(ST_i, T_j).

25 Time-Frame Partitioning for MIC(ST) Estimation
For ST_1, the maximum MIC(ST_1, T_j) among all T_j is the upper bound of MIC(ST_1) after partitioning.
[Figure: MIC(ST_i, T_j) current waveforms of ST_1 and ST_2 per time frame over one clock cycle, with MIC(ST_1) and MIC(ST_2) marked]

26 Notation Review
MIC(C_i)
– Maximum instantaneous current of the i-th cluster
MIC(ST_i)
– Estimated MIC upper bound flowing through the i-th sleep transistor
MIC(C_i, T_j)
– MIC of C_i in the j-th time frame
MIC(ST_i, T_j) = Ψ * MIC(C_i, T_j)
– Estimated MIC upper bound through ST_i in the j-th time frame
MIC(ST_i) = Ψ * MIC(C_i)
– Without time-frame partitioning
MIC(ST_i) = max{ MIC(ST_i, T_j) for all j }
– With time-frame partitioning
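
To make the notation concrete, here is a small sketch that compares the whole-period estimate, Ψ applied to each cluster's peak over the clock cycle, with the per-frame estimate, the maximum over frames of Ψ applied to that frame's cluster MICs. The 2x2 matrix and the current values are hypothetical numbers chosen only for illustration.

```python
# Hypothetical discharging matrix (each column sums to 1).
PSI = [[0.6, 0.3],
       [0.4, 0.7]]

# MIC_C[j][i]: MIC of cluster i in time frame j, in mA (illustrative values).
MIC_C = [[2.0, 1.0],
         [8.0, 2.0],   # cluster 1 peaks in this frame
         [3.0, 9.0]]   # cluster 2 peaks in this frame


def apply_psi(psi, currents):
    """Matrix-vector product: ST currents for the given cluster currents."""
    return [sum(p * c for p, c in zip(row, currents)) for row in psi]


def mic_st_whole_period(psi, mic_c):
    """Psi applied to each cluster's peak over the entire clock period."""
    peaks = [max(frame[i] for frame in mic_c) for i in range(len(psi))]
    return apply_psi(psi, peaks)


def mic_st_partitioned(psi, mic_c):
    """Per ST, the maximum over frames of Psi applied to that frame."""
    per_frame = [apply_psi(psi, frame) for frame in mic_c]
    return [max(f[i] for f in per_frame) for i in range(len(psi))]


print([round(x, 2) for x in mic_st_whole_period(PSI, MIC_C)])  # [7.5, 9.5]
print([round(x, 2) for x in mic_st_partitioned(PSI, MIC_C)])   # [5.4, 7.5]
```

Because the two clusters peak in different frames, the per-frame bound is lower for both sleep transistors in this example.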

27 Time-Frame Partitioning for MIC(ST) Estimation
Time-frame partitioning leads to a better MIC(ST) estimation!
[Figure: MIC(ST_i, T_j) current waveforms of ST_1 and ST_2 per time frame over one clock cycle, with MIC(ST_1) and MIC(ST_2) marked. ORIGINAL_MIC(ST_1) is 37% larger; ORIGINAL_MIC(ST_2) is 27% larger]

28 Reduce the Computation Complexity
More time frames lead to
– more accurate voltage-drop estimation.
– but higher computational complexity.
Reduce the computational complexity:
– dominated time-frame removal
– variable-length time-frame partitioning

29 Dominated Time Frame Removal
T_3 is dominated by T_6.
– MIC(C_1, T_6) > MIC(C_1, T_3),
– MIC(C_2, T_6) > MIC(C_2, T_3).
Neglect T_3
– MIC(ST_1, T_6) > MIC(ST_1, T_3),
– MIC(ST_2, T_6) > MIC(ST_2, T_3).
[Figure: cluster MIC waveforms of Cluster 1 and Cluster 2, marking MIC(C_1, T_6), MIC(C_1, T_3), MIC(C_2, T_6), MIC(C_2, T_3)]
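
A hedged sketch of dominated time-frame removal. It assumes a frame can be dropped when another frame has an equal or larger MIC for every cluster; since the discharging ratios are non-negative, the dropped frame cannot produce a larger MIC(ST) for any sleep transistor. The function names and the sample values are hypothetical, and the slide states the condition with strict inequalities.

```python
def dominates(a, b):
    """True if frame a is at least as large as frame b for every cluster
    and the two frames are not identical."""
    return all(x >= y for x, y in zip(a, b)) and a != b


def remove_dominated(frames):
    """Keep only frames that no other frame dominates.

    frames: dict mapping a frame name to its list of per-cluster MICs.
    """
    kept = {}
    for name, mic in frames.items():
        others = (m for other, m in frames.items() if other != name)
        if not any(dominates(m, mic) for m in others):
            kept[name] = mic
    return kept


# Illustrative per-cluster MICs (mA) for three time frames.
frames = {"T3": [3.0, 2.0], "T6": [8.0, 5.0], "T9": [4.0, 9.0]}
print(list(remove_dominated(frames)))  # ['T6', 'T9']
```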

30 Variable-Length Time-Frame Partitioning
(T_b dominates T_c) and (T_b dominates T_d)
=> the estimated upper bound of Fig. (2) will be smaller.
[Figure: (1) uniform two-way partition and (2) variable-length two-way partition, with time frames T_a, T_b, T_c, T_d and the cluster MICs MIC(C_1, T_b), MIC(C_2, T_b), MIC(C_1, T_c), MIC(C_2, T_c), MIC(C_1, T_d), MIC(C_2, T_d)]

31 Variable-Length Time-Frame Partitioning
When all MIC(C_i) are separated
- MIC(ST_i) can be better estimated!
Example with the number of time frames = 3
[Figure: cluster MIC waveforms of Cluster 1, Cluster 2, and Cluster 3 over one clock cycle, partitioned into frames T1, T2, T3]

32 Variable-Length Time-Frame Partitioning
Partition one clock period
– with the minimum time unit, exhaustively
  Not efficient
  Accurate MIC(ST_i) estimation
– with a limited number of variable-length time frames
  Efficient
  Only loses slight accuracy
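
The effect of where the cuts fall can be illustrated with a short sketch. It assumes MIC(C_i, T_j) is the peak of cluster i's waveform inside frame j and takes the cut points as given; how the slides choose the cuts is not shown here. The waveforms, the matrix, and the function names are hypothetical.

```python
# Hypothetical discharging matrix and per-time-unit cluster currents (mA).
PSI = [[0.6, 0.3],
       [0.4, 0.7]]
WAVEFORMS = [[1, 8, 2, 1, 1, 1, 1, 1],   # cluster 1 peaks at unit 1
             [1, 1, 2, 9, 2, 1, 1, 1]]   # cluster 2 peaks at unit 3


def frame_mics(waveforms, cuts):
    """Per-frame cluster MICs for frames [0, c1), [c1, c2), ..., [ck, n)."""
    n = len(waveforms[0])
    bounds = [0] + list(cuts) + [n]
    return [[max(w[lo:hi]) for w in waveforms]
            for lo, hi in zip(bounds, bounds[1:])]


def st_upper_bound(psi, frames):
    """Per ST, the maximum over frames of Psi applied to the frame MICs."""
    per_frame = [[sum(p * c for p, c in zip(row, f)) for row in psi]
                 for f in frames]
    return [max(col) for col in zip(*per_frame)]


uniform = st_upper_bound(PSI, frame_mics(WAVEFORMS, [4]))   # equal halves
variable = st_upper_bound(PSI, frame_mics(WAVEFORMS, [2]))  # cut between peaks
print([round(x, 2) for x in uniform])   # [7.5, 9.5]
print([round(x, 2) for x in variable])  # [5.1, 7.1]
```

With the same number of frames, the cut that separates the two cluster peaks gives the tighter bound in this example.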

33 Problem Formulation of ST Sizing
Inputs:
1. Voltage-drop constraint.
2. MIC(C_i, T_j): cluster's MIC information.
Objective:
1. Minimize the total width of sleep transistors.
2. Voltage drops must meet the constraint.
Output:
1. A set of sleep transistor widths.

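34 ST Sizing Algorithm
1. Initialize each ST with a large resistance, i.e., a small width (99 Ω in the example).
2. Update the discharging matrix Ψ.
3. Update MIC(ST_i, T_j) and voltage drops.
   MIC(ST_i, T_j) = Ψ * MIC(C_i, T_j)
   V(ST_i, T_j) = MIC(ST_i, T_j) * R(ST_i)
Voltage drops ok?
– Yes: return ST size.
– No: 4. Resize the ST with the worst drop, then repeat from step 2.
Example discharging matrix:
Ψ =
0.38 0.30 0.21 0.18
0.27 0.30 0.21 0.18
0.21 0.24 0.35 0.28
0.14 0.16 0.23 0.36
[Figure: example sleep transistor resistances; all start at 99 Ω and one is resized to 73 Ω]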
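
The loop can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a chain-shaped VGND network, and the resize rule (scale the worst transistor's resistance toward the limit with a 2% margin) is an assumption, since the slide only says to resize the ST with the worst drop. All names and input values are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def discharging_matrix(r_st, r_v):
    """psi[i][j]: share of cluster j's current through ST i (chain network)."""
    n = len(r_st)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        g[i][i] += 1.0 / r_st[i]
    for i in range(n - 1):
        gv = 1.0 / r_v[i]
        g[i][i] += gv
        g[i + 1][i + 1] += gv
        g[i][i + 1] -= gv
        g[i + 1][i] -= gv
    cols = [solve(g, [1.0 if k == j else 0.0 for k in range(n)]) for j in range(n)]
    return [[cols[j][i] / r_st[i] for j in range(n)] for i in range(n)]


def size_sleep_transistors(mic_c, r_v, v_limit, r_init=99.0, margin=0.98,
                           max_iter=10000):
    """mic_c[j][i]: MIC of cluster i in frame j (A). Returns R(ST_i) in ohms."""
    n = len(mic_c[0])
    r_st = [r_init] * n                                  # step 1
    for _ in range(max_iter):
        psi = discharging_matrix(r_st, r_v)              # step 2
        drops = []                                       # step 3
        for i in range(n):
            mic_st = max(sum(psi[i][k] * frame[k] for k in range(n))
                         for frame in mic_c)
            drops.append(mic_st * r_st[i])
        worst = max(range(n), key=lambda i: drops[i])
        if drops[worst] <= v_limit:
            return r_st
        # Step 4 (assumed rule): lower the worst ST's resistance.
        r_st[worst] *= margin * v_limit / drops[worst]
    raise RuntimeError("sizing did not converge")


mic_c = [[0.004, 0.001, 0.002], [0.001, 0.006, 0.001], [0.002, 0.002, 0.005]]
print([round(r, 1) for r in size_sleep_transistors(mic_c, [2.0, 2.0], 0.065)])
```

Widths would then follow from the resistances through W_ST = k / R_ST, with k a process-dependent constant.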

35 Outline
Sleep Transistor Sizing Problem
Maximum Instantaneous Current Estimation
Time-Frame Partitioning for Sizing
Experimental Results
Conclusions

36 Environment Setup
TSMC 130nm CMOS technology.
Vdd = 1.3 V.
Specified tolerable voltage drop: 5% of the ideal supply voltage (0.065 V).
MIC(C_i) is obtained via 10,000-random-pattern PrimePower™ simulations.
Minimum time unit is set to 10 ps.

37 Implementation Flow
[Figure: flow chart distinguishing our tools from commercial tools. Stages: Synthesis, Placement, Simulation, Gate Positioning, VCD Partitioning, Variable-length Partitioning (optional), MIC Estimation, ST Sizing. Files and data: RTL netlist, gate-level netlist, DEF file, SDF file, VCD file, gate location, partitioned VCD file, ST size]

38 Experimental Results
           Total Area (Width in μm)             Runtime (Sec.)
Circuit    [8]      [2]      TP       V-TP      TP       V-TP
C432       12817    8491     6775     7086      4262     495
C499       10741    8347     6684     7229      3644     568
C880       15050    11296    9233     9676      2561     345
C1355      19352    13056    10591    11496     2514     422
C3540      29808    23020    18650    20282     16856    942
C5315      29794    23773    18785    19534     13830    2190
C7552      41016    29500    24269    25621     17216    2896
dalu       3468     2904     2110     2283      3816     483
frg2       3632     2835     2232     2255      701      136
i8         13247    9931     7836     8141      7720     1080
t481       9405     7389     5024     5402      16289    1514
des        11804    9766     7850     8145      8321     1180
AES        44378    33965    27229    28137     28379    3524
Avg.       1.70     1.26     1        1.06      8.09     1
Previous works: [2] Chiou et al. DAC’06, [8] Long et al. DAC’03

39 Outline
Sleep Transistor Sizing Problem
Maximum Instantaneous Current Estimation
Time-Frame Partitioning for Sizing
Experimental Results
Conclusions

40 Conclusions
Propose an efficient sleep transistor sizing method for DSTN power-gating designs.
Present theorems based on a temporal perspective to estimate a tight upper bound of the voltage drop.
Achieve a 21% reduction in size, and therefore in leakage, on average compared with [2].
- [2] Chiou et al. DAC’06

41 Thanks for your time.

42 Q & A

43 Backup Slides

44 Sleep Transistor (ST) Sizing
In the active mode
– Sleep transistors operate in the linear region.
– W_ST is inversely proportional to R_ST.
W_ST = k / R_ST
Relations between W_ST and V_ST.
I(ST): the current through the sleep transistor
V_ST: the voltage drop across the sleep transistor
[Figure: sleep transistor carrying I(ST) with voltage drop V_ST; labels VDD, VGND, GND]

45 Sleep Transistor (ST) Sizing
Determine the minimum required size (W_ST*) based on:
1. MIC(ST)
2. V_ST*: IR-drop constraint
MIC(ST): Maximum Instantaneous Current (MIC) through ST
Smaller MIC(ST) leads to a better ST size!
[Figure: sleep transistor carrying MIC(ST); labels VDD, VGND, GND]
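
A short worked sketch of the minimum width, combining W_ST = k / R_ST with the requirement that MIC(ST) * R_ST stays within V_ST*, which gives W_ST* = k * MIC(ST) / V_ST*. The constant k depends on the process; the value below is a made-up placeholder, as is the function name.

```python
def min_st_width(mic_st, v_limit, k):
    """Smallest width whose IR drop at mic_st stays within v_limit.

    mic_st:  MIC through the sleep transistor, in amperes.
    v_limit: voltage-drop constraint V_ST*, in volts.
    k:       process-dependent constant in ohm * um (hypothetical value).
    """
    r_max = v_limit / mic_st   # largest resistance meeting the constraint
    return k / r_max           # W_ST = k / R_ST


# 5 mA through the ST, a 0.065 V limit, and an assumed k of 1000 ohm * um.
print(round(min_st_width(0.005, 0.065, 1000.0), 2))  # 76.92
```

Halving MIC(ST) halves the required width, which is why a tighter MIC(ST) bound translates directly into smaller sleep transistors.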

46 MIC Waveform
[Figure: current versus time for Pattern 1, Pattern 2, and Pattern 3, and the resulting MIC waveform of the 3 patterns]
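
One plausible reading of the figure is that the MIC waveform is the point-wise maximum of the simulated current waveforms over all input patterns. The sketch below assumes that reading; the sample currents and the function name are hypothetical.

```python
def mic_waveform(pattern_currents):
    """Point-wise maximum over patterns.

    pattern_currents[p][t]: current of pattern p at time unit t.
    """
    return [max(samples) for samples in zip(*pattern_currents)]


patterns = [[1, 4, 2, 0],   # pattern 1
            [3, 1, 2, 1],   # pattern 2
            [0, 2, 5, 1]]   # pattern 3
waveform = mic_waveform(patterns)
print(waveform)       # [3, 4, 5, 1]
print(max(waveform))  # 5, the MIC over the whole period
```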

47 R_ST Initialization
Physical limitation
– The CMOS process limits the width of a sleep transistor.
– Choose the minimum width to set the initial R_ST.

