Polynomial-Time Algorithms for Designing Dual-Voltage Energy Efficient Circuits Master’s Thesis Defense Mridula Allani Advisor : Dr. Vishwani D. Agrawal.


1 Polynomial-Time Algorithms for Designing Dual-Voltage Energy Efficient Circuits Master’s Thesis Defense Mridula Allani Advisor : Dr. Vishwani D. Agrawal Committee Members: Dr. Victor P. Nelson, Dr. Adit D. Singh Department of Electrical and Computer Engineering Auburn University October 19, 2011

2 Outline: Motivation; Problem statement; Background; Contributions; Algorithm to find V_DDL; Algorithm to assign V_DDL; Results; Future work; References. 10/19/2011 2 Mridula Allani - MS Thesis Defense

3 Motivation Ref. http://www.anandtech.com/show/3794/the-iphone-4-review/13

4 Motivation Current dual-voltage designs use 0.7 V_DD as the lower supply voltage. Existing algorithms for assigning the low voltage have exponential or polynomial complexity. Faster algorithms that also increase energy savings are required.

5 Problem Statement Develop a linear-time algorithm to find the optimal lower voltage. Develop new algorithms for voltage assignment in dual-V_DD design.

6 Background Gate slack: The amount of time by which a signal is early or late. Critical path: The longest path in the circuit. All gates on this path have ‘zero’ slack. Timing constraints: No other path can be longer than the critical path. No gate should have a negative slack.

7 Background Timing violations: A path is longer than the critical path. The gates on this path have negative slack. Topological constraints: No V_DDL gate is at the input of any V_DD gate. Estimate of energy savings (neglecting leakage): E_sav = (N/n)(1 - V_DDL^2/V_DD^2), where N is the number of gates at the low voltage and n is the total number of gates.
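The energy-savings estimate can be checked numerically. The sketch below is not taken from the thesis; it assumes that dynamic energy per gate scales with the square of the supply voltage and that every gate contributes equally, which gives E_sav = (N/n)(1 - (V_DDL/V_DD)^2). The function name is hypothetical. With the C880 numbers reported later in the deck (213 of 360 gates at 0.49 V, V_DD = 1.2 V) it reproduces the listed 49.3% saving.

```python
def estimated_energy_saving(n_low: int, n_total: int, v_ddl: float, v_dd: float) -> float:
    """Fractional dynamic-energy saving of a dual-voltage design.

    Assumption (not from the slides): energy per gate is proportional to V^2,
    all gates weigh equally, and leakage is neglected.
    """
    return (n_low / n_total) * (1.0 - (v_ddl / v_dd) ** 2)


# C880 entry from the results slides: 213 of 360 gates at V_DDL = 0.49 V
saving = estimated_energy_saving(n_low=213, n_total=360, v_ddl=0.49, v_dd=1.2)
print(f"{100 * saving:.1f} %")
```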

8 Background Basic idea: decrease energy consumption without any delay penalty. Done by assigning lower supply voltage to gates on non-critical paths. Different algorithms propose different ways of finding these non-critical gates.

9 Background Kuroda and Hamada report that the power reduction ratio is at its minimum when 0.6 V_DD ≤ V_DDL ≤ 0.7 V_DD. Chen et al., Kulkarni et al., and Srivastava et al. claim that the optimal value of V_DDL for minimizing total power is 50% of V_DD. Hamada et al. propose a rule of thumb for V_DDL (the formula is not preserved in this transcript).

10 Background [Figure: CVS structure (Usami and Horowitz) and ECVS structure (Usami et al.), with V_DD gates, V_DDL gates, and level converters marked.] Ref. K. Usami and M. Horowitz, “Clustered Voltage Scaling Technique for Low-Power Design,” in Proceedings of the International Symposium on Low Power Design, pp. 23-26, 1995. Ref. K. Usami et al., “Automated Low-Power Technique Exploiting Multiple Supply Voltages Applied to a Media Processor,” IEEE Journal of Solid-State Circuits, vol. 33, no. 3, pp. 463-472, Mar. 1998.

11 Background Kulkarni et al.: Greedy heuristic based on gate slacks. Uses 0.7 V_DD and 0.5 V_DD as V_DDL. Includes power and delay overhead of level converters. Sundararajan and Parhi: Linear programming based model. Minimizes the power consumption. Includes level converter delay overheads.

12 Background TPI(i): longest time for an event to arrive at gate i from a primary input (PI). TPO(i): longest time for an event from gate i to reach a primary output (PO). Delay of the longest path through gate i: D_p,i = TPI(i) + TPO(i). Slack time for gate i: S_i = T_c - D_p,i, where T_c = max{D_p,i} over all i [Kim and Agrawal].
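The slack definition above can be sketched as two passes over the netlist in topological order. This is an illustration only: the netlist format (a dictionary mapping each gate to the gates that drive it), the gate names, and the delays are hypothetical, and the sketch assumes that TPI(i) includes the delay of gate i itself while TPO(i) starts at its output.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def gate_slacks(fanin, delay):
    """Return (slack per gate, T_c) for a combinational netlist.

    fanin maps every gate to the list of gates driving it (hypothetical format);
    delay maps every gate to its delay.
    """
    order = list(TopologicalSorter(fanin).static_order())  # drivers come first
    fanout = {g: [] for g in order}
    tpi = {}
    for g in order:
        preds = fanin.get(g, ())
        # TPI(i): longest arrival time from a primary input, including gate i
        tpi[g] = delay[g] + max((tpi[p] for p in preds), default=0.0)
        for p in preds:
            fanout[p].append(g)
    tpo = {}
    for g in reversed(order):
        # TPO(i): longest time from the output of gate i to a primary output
        tpo[g] = max((delay[s] + tpo[s] for s in fanout[g]), default=0.0)
    d_p = {g: tpi[g] + tpo[g] for g in order}   # D_p,i = TPI(i) + TPO(i)
    t_c = max(d_p.values())                     # critical path delay
    return {g: t_c - d_p[g] for g in order}, t_c


# Hypothetical five-gate circuit; delays in ps
fanin = {"a": [], "b": [], "c": ["a", "b"], "d": ["b"], "e": ["c", "d"]}
delay = {"a": 10, "b": 20, "c": 30, "d": 5, "e": 10}
slack, t_c = gate_slacks(fanin, delay)
print(t_c, slack)
```

In this toy circuit the path b, c, e is critical, so those three gates get zero slack, while a and d have slack to spare.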

13 Background S_u, the upper slack time, is the lower bound on the slacks of the gates that can be unconditionally assigned the low voltage without affecting the critical timing of the circuit. The slide's formula for S_u survives only as the fragment "S_u = T_c". In it, β = D'_p,i / D_p,i, where D'_p,i and D_p,i are the longest path delays through gate i when it is supplied with V_DDL and V_DD, respectively [Kim and Agrawal].
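One way to relate S_u to β follows directly from the definitions on this slide and the slack definition on the previous one. The derivation below is a reconstruction under the assumption that, in the worst case, the whole longest path through gate i is slowed by the factor β; it is not copied from the slide. It is at least consistent with the later plots, where V_DDL = V_DD (β = 1) gives S_u = 0 and a very low V_DDL gives S_u = T_c.

```latex
\begin{align*}
D'_{p,i} &= \beta\, D_{p,i} \le T_c
  && \text{(gate $i$ still meets timing at $V_{DDL}$)} \\
D_{p,i} &= T_c - S_i
  && \text{(definition of slack)} \\
\beta\,(T_c - S_i) &\le T_c
  \;\Longrightarrow\;
  S_i \ge \frac{\beta - 1}{\beta}\, T_c \;=\; S_u
\end{align*}
```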

14 Background Recent work [Kim and Agrawal]: Assign V_DDL to gates with S_i ≥ S_u. Assign V_DDL to gates with S_l ≤ S_i ≤ S_u one by one without violating timing or topological constraints. Repeat the last two steps across all voltages to find the best V_DDL and the corresponding dual-voltage design with the least energy. Ref. K. Kim and V. D. Agrawal, “Dual Voltage Design for Minimum Energy Using Gate Slack,” in Proceedings of the IEEE International Conference on Industrial Technology, pp. 419-424, March 2011.

15 Example: without level converter [Figure: circuit from IN to OUT with gates supplied by V_1 and V_2.]

16 Example: Energy per cycle and delay, without level converter (90 nm PTM model, clock period 1500 ps)
Each cell gives energy per cycle and delay for one combination of V_1 (V) and V_2 (V), each taking the values 0.4, 0.6, 0.8, 1.0, 1.2. Rows are listed in slide order.
9.69 fJ, ∞ | 44.84 fJ, 280.6 ps | 15.75 fJ, 123.7 ps | 7.315 fJ, 95.61 ps | 7.863 fJ, 84.15 ps
6.465 fJ, ∞ | 10.13 fJ, 204.5 ps | 4.573 fJ, 123.2 ps | 5.203 fJ, 99.28 ps | 6.65 fJ, 91.19 ps
6.6 fJ, 1183 ps | 2.651 fJ, 203.3 ps | 3.233 fJ, 132.3 ps | 4.289 fJ, 115 ps | 5.678 fJ, 107.7 ps
1.291 fJ, 801.5 ps | 1.761 fJ, 235.4 ps | 2.543 fJ, 179.4 ps | 3.567 fJ, 164.3 ps | 4.977 fJ, 156.1 ps
0.755 fJ, 1062 ps | 1.285 fJ, 614 ps | 2.052 fJ, 565.3 ps | 3.082 fJ, 560.5 ps | 4.423 fJ, 557.7 ps

17 Example: with level converter [Figure: circuit from IN to OUT with gates supplied by V_1 and V_2.]

18 Example: energy per cycle and delay, with and without level converter
Each cell gives energy per cycle and delay for one combination of V_1 (V) and V_2 (V), each taking the values 0.4, 0.6, 0.8, 1.0, 1.2. Rows are listed in slide order.
With level converter:
10.44 fJ, ∞ | 7.18 fJ, 249.1 ps | 7.18 fJ, 184.0 ps | 7.98 fJ, 161.7 ps | 9.316 fJ, 153.4 ps
7.13 fJ, 1198 ps | 4.39 fJ, 268.5 ps | 4.96 fJ, 203.3 ps | 5.94 fJ, 182.8 ps | 8.05 fJ, 174.8 ps
2.74 fJ, 952.5 ps | 2.83 fJ, 309.4 ps | 3.56 fJ, 251.4 ps | 4.93 fJ, 231.8 ps | 16.14 fJ, 225.8 ps
1.408 fJ, 948.8 ps | 1.91 fJ, 470.7 ps | 2.82 fJ, 418.9 ps | 10.34 fJ, 405.7 ps | 45.31 fJ, 387.8 ps
0.81 fJ, 2188 ps | 1.4 fJ, 1757 ps | 7.08 fJ, 1733 ps | 6.46 fJ, ∞ | 9.75 fJ, ∞
Without level converter: the panel repeats the 25 values of slide 16.

19 Outline: Motivation; Problem statement; Background; Contributions; Algorithm to find V_DDL; Algorithm to assign V_DDL; Results; Future work; References.

20 Grouping of gates [Figure: delay increment versus slack plot with the 45° line and the gate groups P and G; S_u = 336.9 ps.] Annotation: ∑(dl_i - dh_i) ≤ min{S_i}.

21 Groups when V_DDL = 1.2 V [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 1.2 V, T_c = 510 ps, S_u = 0 ps.

22 Groups when V_DDL = 1.19 V [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 1.19 V, T_c = 510 ps, S_u = 14.6 ps.

23 Groups when V_DDL = 0.49 V [Figure: delay increment versus slack plot with the 45° line and groups P and G.] T_c = 510 ps, S_u = 336.9 ps.

24 Groups when V_DDL = 0.39 V [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 0.39 V, T_c = 510 ps, S_u = 469 ps.

25 Groups when V_DDL = 0.1 V [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 0.1 V, T_c = 510 ps, S_u = 510 ps = T_c.

26 Theorems 1. Gates above the 45° line in the ‘delay increment versus slack’ plot cannot be assigned the lower supply voltage without violating the timing constraint. 2. (The equation for this theorem is not preserved in the transcript.) Here β_i = dl_i/dh_i, where dl_i is the low-voltage delay and dh_i is the high-voltage delay of gate i. The maximum value of β_i, β_max, gives the lower bound on the gate slacks.

27 Theorems 3. Groups within P that satisfy ∑ y_i ≤ min{S_i} can be assigned the lower supply voltage without violating the timing constraint (where y_i = dl_i - dh_i, dl_i is the low-voltage delay of gate i, dh_i is the high-voltage delay of gate i, and S_i is the slack of gate i at V_DD). 4. The group G, whose slacks are greater than S_u, can always be assigned the lower supply voltage without causing any topological violations.
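The grouping implied by Theorems 1, 3 and 4 can be sketched as a simple classification on the slack and the delay increment of each gate. The boundaries used here are a reading of the plot labels, not a definition taken from the thesis: gates with S_i ≥ S_u go to G, the remaining gates on or below the 45° line (y_i ≤ S_i) go to P, and the rest lie above the line. Function names, gate names, and all numbers are hypothetical.

```python
def group_gates(slack, dh, dl, s_u):
    """Split gates into G, P, and those above the 45-degree line."""
    groups = {"G": [], "P": [], "above_line": []}
    for g in slack:
        y = dl[g] - dh[g]                 # delay increment y_i = dl_i - dh_i
        if slack[g] >= s_u:
            groups["G"].append(g)         # can take V_DDL unconditionally
        elif y <= slack[g]:
            groups["P"].append(g)         # candidate, needs the group check
        else:
            groups["above_line"].append(g)  # must stay at V_DD (Theorem 1)
    return groups


def group_fits(group, slack, dh, dl):
    """Group check: sum of delay increments <= smallest slack (non-empty group)."""
    return sum(dl[g] - dh[g] for g in group) <= min(slack[g] for g in group)


# Hypothetical gates; all times in ps
slack = {"g1": 400.0, "g2": 120.0, "g3": 50.0, "g4": 0.0}
dh = {"g1": 20.0, "g2": 30.0, "g3": 40.0, "g4": 25.0}
dl = {"g1": 60.0, "g2": 70.0, "g3": 100.0, "g4": 60.0}
groups = group_gates(slack, dh, dl, s_u=336.9)
print(groups, group_fits(groups["P"], slack, dh, dl))
```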

28 Algorithm to find V_DDL Assume all gates are assigned V_DD initially. Calculate the gate slacks. Group the gates according to their slacks and delays.

29 Algorithm to find V_DDL V_DDL = V_DDL1, when using no level converter. V_DDL = (V_DDL1 · V_DDL2)^(1/2), when using level converters.
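The selection rule on this slide reduces to a one-line choice once the two candidate voltages are known. The sketch below only illustrates that final step; how V_DDL1 and V_DDL2 are obtained from the gate groups is not shown, and the function name is hypothetical. The example values are the C880 candidates quoted on a later slide (0.49 V and 0.71 V).

```python
import math


def select_vddl(v_ddl1: float, v_ddl2: float, use_level_converters: bool) -> float:
    """Pick the lower supply voltage from the two candidates.

    Without level converters the rule is V_DDL = V_DDL1; with level
    converters it is the geometric mean of V_DDL1 and V_DDL2.
    """
    if use_level_converters:
        return math.sqrt(v_ddl1 * v_ddl2)
    return v_ddl1


without_lc = select_vddl(0.49, 0.71, use_level_converters=False)
with_lc = select_vddl(0.49, 0.71, use_level_converters=True)
print(f"{without_lc:.2f} V, {with_lc:.3f} V")
```

The geometric mean always lies between the two candidates and sits slightly below their arithmetic mean, so it leans toward the lower voltage.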

30 Algorithm to find V_DDL [Figure: plot for C880, 360 gates in total.]

31 Algorithm to find V_DDL [Figure: plot for C880, 360 gates in total.] V_DDL1 = 0.49 V, V_DDL2 = 0.71 V.

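32 Results: V_DDL selection algorithm (ISCAS'85 circuits, without level converters)
Each group of three columns gives V_DDL (V), gates in V_DDL, and E_sav (%) for one choice of V_DDL: V_DDL1, V_DDL2, arithmetic mean (V_DDL1 + V_DDL2)/2, and geometric mean (V_DDL1 · V_DDL2)^(1/2).
Circuit | Total gates | V_DDL1 (V) | Gates | E_sav (%) | V_DDL2 (V) | Gates | E_sav (%) | Arith. mean (V) | Gates | E_sav (%) | Geom. mean (V) | Gates | E_sav (%)
C432 | 154 | 0.80 | 8 | 2.9 | 0.89 | 8 | 2.3 | 0.84 | 8 | 2.7 | 0.84 | 8 | 2.7
C499 | 493 | 0.76 | 113 | 13.7 | 1.11 | 141 | 4.1 | 0.93 | 123 | 10.0 | 0.91 | 129 | 11.1
C880 | 360 | 0.49 | 213 | 49.3 | 0.71 | 229 | 41.3 | 0.6 | 229 | 47.7 | 0.58 | 229 | 48.8
C1355 | 469 | 0.77 | 76 | 9.5 | 1.11 | 108 | 3.4 | 0.94 | 76 | 6.3 | 0.92 | 76 | 6.7
C1908 | 584 | 0.60 | 221 | 28.4 | 1.00 | 221 | 11.6 | 0.80 | 221 | 21.9 | 0.77 | 221 | 22.3
C2670 | 901 | 0.48 | 570 | 53.1 | 0.82 | 570 | 33.7 | 0.65 | 570 | 44.7 | 0.62 | 570 | 46.4
C3540 | 1270 | 0.52 | 149 | 9.5 | 0.73 | 149 | 7.4 | 0.62 | 149 | 8.6 | 0.61 | 149 | 8.7
C5315 | 2077 | 0.49 | 1220 | 49.0 | 0.75 | 1226 | 36.0 | 0.62 | 1220 | 43.1 | 0.60 | 1220 | 44.1
C6288 | 2407 | 0.55 | 75 | 2.5 | 1.00 | 77 | 0.98 | 0.77 | 77 | 1.9 | 0.73 | 77 | 2.0
C7288 | 2823 | 0.54 | 1582 | 44.7 | 0.71 | 212 | 38.9 | 0.62 | 1672 | 43.4 | 0.61 | 1672 | 43.4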

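33 Results: V_DDL selection algorithm (ISCAS'85 circuits, with level converters)
Each group of three columns gives V_DDL (V), gates in V_DDL, and E_sav (%) for one choice of V_DDL: V_DDL1, V_DDL2, arithmetic mean (V_DDL1 + V_DDL2)/2, and geometric mean (V_DDL1 · V_DDL2)^(1/2).
Circuit | Total gates | V_DDL1 (V) | Gates | E_sav (%) | V_DDL2 (V) | Gates | E_sav (%) | Arith. mean (V) | Gates | E_sav (%) | Geom. mean (V) | Gates | E_sav (%)
C432 | 154 | 0.80 | 73 | 17.1 | 0.89 | 85 | 24.8 | 0.84 | 81 | 26.8 | 0.84 | 81 | 26.8
C499 | 493 | 0.76 | 173 | 21.1 | 1.11 | 359 | 10.5 | 0.93 | 249 | 20.2 | 0.91 | 247 | 21.3
C880 | 360 | 0.49 | 223 | 51.6 | 0.71 | 309 | 55.8 | 0.6 | 290 | 60.4 | 0.58 | 286 | 60.9
C1355 | 469 | 0.77 | 122 | 15.3 | 1.11 | 260 | 8.0 | 0.94 | 197 | 16.3 | 0.92 | 193 | 17.0
C1908 | 584 | 0.60 | 263 | 33.8 | 1.00 | 267 | 24.4 | 0.80 | 395 | 37.6 | 0.77 | 385 | 38.8
C2670 | 901 | 0.48 | 376 | 35.1 | 0.82 | 784 | 46.4 | 0.65 | 677 | 53.1 | 0.62 | 633 | 51.5
C3540 | 1270 | 0.52 | 647 | 41.4 | 0.73 | 1073 | 53.2 | 0.62 | 906 | 52.3 | 0.61 | 881 | 51.5
C5315 | 2077 | 0.49 | 1140 | 45.0 | 0.75 | 1777 | 52.1 | 0.62 | 1633 | 57.6 | 0.60 | 1602 | 57.8
C6288 | 2407 | 0.55 | 659 | 21.6 | 1.00 | 1877 | 23.8 | 0.77 | 1302 | 31.8 | 0.73 | 1189 | 47.3
C7288 | 2823 | 0.54 | 1560 | 44.1 | 0.71 | 2235 | 51.5 | 0.62 | 1998 | 51.9 | 0.61 | 1197 | 51.8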

34 Results: Comparison with reported data (ISCAS'85 circuits, without level converters)
Columns compare V_DDL = V_DDL1 with the fixed choices V_DDL = 0.7 V_DD = 0.84 V and V_DDL = 0.5 V_DD = 0.6 V.
Circuit | Total gates | V_DDL1 (V) | Gates in V_DDL | E_sav (%) | 0.84 V: Gates in V_DDL | E_sav (%) | 0.6 V: Gates in V_DDL | E_sav (%)
C432 | 154 | 0.80 | 8 | 2.9 | 8 | 2.7 | 8 | 3.9
C499 | 493 | 0.76 | 113 | 13.7 | 121 | 12.5 | 56 | 8.5
C880 | 360 | 0.49 | 213 | 49.3 | 229 | 32.4 | 229 | 47.7
C1355 | 469 | 0.77 | 76 | 9.5 | 76 | 8.3 | 64 | 10.2
C1908 | 584 | 0.60 | 221 | 28.4 | 221 | 19.3 | 221 | 28.4
C2670 | 901 | 0.48 | 570 | 53.1 | 570 | 32.3 | 570 | 47.5
C3540 | 1270 | 0.52 | 149 | 9.5 | 149 | 6.0 | 149 | 8.8
C5315 | 2077 | 0.49 | 1220 | 49.0 | 1240 | 30.5 | 1220 | 44.1
C6288 | 2407 | 0.55 | 75 | 2.5 | 77 | 1.6 | 75 | 2.3
C7288 | 2823 | 0.54 | 1582 | 44.7 | 2359 | 42.6 | 1672 | 43.9

35 Results: Comparison with reported data (ISCAS'85 circuits, with level converters)
Columns compare V_DDL = V_DDL1 (as labeled on the slide) with the fixed choices V_DDL = 0.7 V_DD = 0.84 V and V_DDL = 0.5 V_DD = 0.6 V.
Circuit | Total gates | V_DDL (V) | Gates in V_DDL | E_sav (%) | 0.84 V: Gates in V_DDL | E_sav (%) | 0.6 V: Gates in V_DDL | E_sav (%)
C432 | 154 | 0.84 | 81 | 26.8 | 81 | 26.8 | 43 | 20.9
C499 | 493 | 0.91 | 247 | 21.3 | 211 | 21.2 | 99 | 15.1
C880 | 360 | 0.58 | 286 | 60.9 | 323 | 45.8 | 290 | 60.4
C1355 | 469 | 0.92 | 193 | 17.0 | 154 | 16.8 | 44 | 7.0
C1908 | 584 | 0.77 | 385 | 38.8 | 415 | 36.2 | 263 | 33.8
C2670 | 901 | 0.62 | 633 | 51.5 | 813 | 46.0 | 606 | 50.5
C3540 | 1270 | 0.61 | 881 | 51.5 | 1093 | 43.9 | 864 | 51.0
C5315 | 2077 | 0.60 | 1602 | 57.8 | 1812 | 44.5 | 1602 | 56.9
C6288 | 2407 | 0.73 | 1189 | 47.3 | 1470 | 31.2 | 780 | 24.3
C7288 | 2823 | 0.61 | 1197 | 51.8 | 2347 | 42.4 | 1943 | 51.6

36 Outline: Motivation; Problem statement; Background; Contributions; Algorithm to find V_DDL; Algorithm to assign V_DDL; Results; Future work; References.

37 Algorithm to assign V_DDL Assume all gates are at V_DD initially. Calculate slacks of all gates. Assign V_DDL to gates whose slacks satisfy S_i ≥ S_u. Recalculate slacks.

38 Algorithm to assign V_DDL Assign V_DDL to a group of gates in P satisfying the condition ∑(dl_i - dh_i) ≤ min{S_i}. Recalculate slacks. Check whether there are any V_DDL gates at the inputs of any V_DD gates and whether there are any negative slacks.

39 Algorithm to assign V_DDL If any violations occur, put the corresponding gate back to V_DD. Recalculate slacks. Repeat the previous five steps until there are no V_DD gates left in groups P and G.
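The assignment steps on the last three slides can be sketched as a try, check, and revert loop. This is a simplified illustration and not the thesis implementation: it moves one gate at a time instead of whole groups, tries gates with S_i ≥ S_u first and the rest in order of decreasing slack, and recomputes all path delays after every move, so it does not reflect the complexity claimed in the talk. The netlist format, function names, gate names, and delay values are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

EPS = 1e-9  # tolerance for the timing comparison


def path_delays(fanin, delay):
    """Longest path delay through every gate, D_p,i = TPI(i) + TPO(i)."""
    order = list(TopologicalSorter(fanin).static_order())  # drivers first
    fanout = {g: [] for g in order}
    tpi, tpo = {}, {}
    for g in order:
        preds = fanin.get(g, ())
        tpi[g] = delay[g] + max((tpi[p] for p in preds), default=0.0)
        for p in preds:
            fanout[p].append(g)
    for g in reversed(order):
        tpo[g] = max((delay[s] + tpo[s] for s in fanout[g]), default=0.0)
    return {g: tpi[g] + tpo[g] for g in order}, fanout


def assign_vddl(fanin, dh, dl, s_u):
    """Return (set of gates moved to V_DDL, critical path delay T_c)."""
    delay = dict(dh)                      # all gates start at V_DD
    d_p, fanout = path_delays(fanin, delay)
    t_c = max(d_p.values())               # timing constraint, fixed from here on
    low = set()
    changed = True
    while changed:                        # repeat until a full pass changes nothing
        changed = False
        d_p, _ = path_delays(fanin, delay)
        slack = {g: t_c - d for g, d in d_p.items()}
        # Gates with slack >= S_u first, then the rest by decreasing slack
        candidates = sorted((g for g in dh if g not in low),
                            key=lambda g: (slack[g] < s_u, -slack[g]))
        for g in candidates:
            low.add(g)
            delay[g] = dl[g]              # tentatively move gate g to V_DDL
            d_p, _ = path_delays(fanin, delay)
            timing_ok = max(d_p.values()) <= t_c + EPS   # no negative slack
            topo_ok = all(s in low for s in fanout[g])   # no V_DDL gate drives a V_DD gate
            if timing_ok and topo_ok:
                changed = True
            else:
                low.discard(g)            # violation: put the gate back to V_DD
                delay[g] = dh[g]
    return low, t_c


# Hypothetical six-gate circuit; delays in ps at V_DD (dh) and V_DDL (dl)
fanin = {"a": [], "b": [], "c": ["a", "b"], "d": ["b"], "e": ["c"], "f": ["d"]}
dh = {"a": 10, "b": 20, "c": 30, "d": 15, "e": 10, "f": 15}
dl = {"a": 15, "b": 30, "c": 45, "d": 20, "e": 15, "f": 20}
low, t_c = assign_vddl(fanin, dh, dl, s_u=30)
print(sorted(low), t_c)
```

In this toy circuit gate f is accepted in the first pass and gate d only in the second, once its fanout f is already at V_DDL, which is why the loop has to repeat. Gate a has enough slack but stays at V_DD because it drives the V_DD gate c.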

40 c880 slack distribution [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 0.49 V, S_u = 336.9 ps.

41 Slack data after V_DDL assignment [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 0.49 V, S_u = 336.9 ps.

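42 Dual voltage design without level converter (ISCAS'85 circuits, V_DDL = V_DDL1)
The first group of columns is the V_DDL determination and assignment of this work, the second is the SPICE result (**), and the last two columns are the values reported by [Kim and Agrawal].
Circuit | Total gates | V_DDL (V) | Gates in V_DDL | E_sav (%) | CPU* (s) | E single V_DD (fJ) | E dual V_DD (fJ) | E_sav (%) | Kim and Agrawal E_sav (%) | Kim and Agrawal CPU (s)
C432 | 154 | 0.80 | 8 | 2.9 | 1.78 | 161.3 | 155.4 | 3.7 | 3.9 | 15.8
C499 | 493 | 0.76 | 113 | 13.7 | 9.41 | 463 | 427 | 7.8 | 5.9 | 194.4
C880 | 360 | 0.49 | 213 | 49.3 | 5.39 | 277.6 | 115.8 | 58.3 | 50.8 | 62.1
C1355 | 469 | 0.77 | 76 | 9.5 | 8.75 | 455.2 | 433.1 | 4.9 | 4.3 | 132
C1908 | 584 | 0.60 | 221 | 28.4 | 11.43 | 496.5 | 378.3 | 23.8 | 19.0 | 247.8
C2670 | 901 | 0.48 | 570 | 53.1 | 23.49 | 660.3 | 251.5 | 61.9 | 47.8 | 480.7
C3540 | 1270 | 0.52 | 149 | 9.5 | 45.44 | 1843 | 1620 | 12.2 | 9.6 | 1244
C5315 | 2077 | 0.49 | 1220 | 49.0 | 109.47 | 2320 | 1272 | 45.2 | N/R | N/R
C6288 | 2407 | 0.55 | 75 | 2.5 | 154.94 | 1932 | 1869 | 3.3 | 2.6 | 6128
C7288 | 2823 | 0.54 | 1582 | 44.7 | 191.04 | 2465 | 1562 | 36.6 | N/R | N/R
* Intel Core i5 2.30 GHz, 4 GB RAM. ** 90 nm PTM model.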
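The SPICE saving column in this table appears to follow from the two measured energies as a simple relative reduction. The short check below assumes that relation, E_sav = (E_single - E_dual) / E_single, and applies it to the C880 row; the function name is hypothetical.

```python
def measured_saving_percent(e_single_fj: float, e_dual_fj: float) -> float:
    """Relative energy reduction of the dual-V_DD design, in percent."""
    return 100.0 * (e_single_fj - e_dual_fj) / e_single_fj


# C880 row: 277.6 fJ with a single V_DD, 115.8 fJ with dual V_DD
c880_saving = measured_saving_percent(277.6, 115.8)
print(f"{c880_saving:.1f} %")
```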

43 CPU Time Vs. Number of Gates

44 c880 slacks with 5% increase in T_c [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 0.67 V, S_u = 293 ps.

45 c880 final slacks with 5% increase in T_c [Figure: delay increment versus slack plot with the 45° line and groups P and G.] V_DD = 1.2 V, V_DDL = 0.67 V, S_u = 293 ps.

46 Dual voltage design without level converter, with 5% increase in T_c (ISCAS'85 circuits, V_DDL = V_DDL1)
The first group of columns is the V_DDL determination and assignment, the second is the SPICE result (**).
Circuit | Total gates | V_DDL (V) | Gates in V_DDL | E_sav (%) | CPU* (s) | E single V_DD (fJ) | E dual V_DD (fJ) | E_sav (%)
C432 | 154 | 1.08 | 154 | 19.0 | 1.70 | 161.3 | 123.9 | 23.2
C499 | 493 | 1.03 | 493 | 26.3 | 9.18 | 463 | 321.9 | 30.5
C880 | 360 | 0.67 | 334 | 65.8 | 4.32 | 277.6 | 83.86 | 69.8
C1355 | 469 | 1.06 | 469 | 22.0 | 8.52 | 455.2 | 339.9 | 12.2
C1908 | 584 | 1.00 | 584 | 30.6 | 8.56 | 496.5 | 445 | 10.4
C2670 | 901 | 0.81 | 899 | 54.3 | 15.81 | 660.3 | 257.3 | 61.0
C3540 | 1270 | 0.90 | 1270 | 43.8 | 28.22 | 1843 | 949.5 | 48.5
C5315 | 2077 | 0.72 | 2077 | 64.0 | 61.77 | 2320 | 716.8 | 69.1
C6288 | 2407 | 1.07 | 2407 | 20.5 | 108.39 | 1932 | 1464 | 24.2
C7288 | 2823 | 0.68 | 2816 | 67.7 | 175.07 | 2465 | 677.2 | 72.3
* Intel Core i5 2.30 GHz, 4 GB RAM. ** 90 nm PTM model.

47 Future work Accommodate level converter energy overheads. Consider leakage energy reduction. Dual threshold designs. Simultaneous dual supply voltage and dual threshold voltage designs. Include the effects of process variations.

48 References
1. T. Kuroda and M. Hamada, “Low-Power CMOS Digital Design with Dual Embedded Adaptive Power Supplies,” IEEE Journal of Solid-State Circuits, vol. 35, no. 4, pp. 652-655, Apr. 2000.
2. M. Hamada, Y. Ootaguro, and T. Kuroda, “Utilizing Surplus Timing for Power Reduction,” in Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 89-92, 2001.
3. C. Chen, A. Srivastava, and M. Sarrafzadeh, “On Gate Level Power Optimization Using Dual-Supply Voltages,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 9, no. 5, pp. 616-629, Oct. 2001.
4. S. H. Kulkarni, A. N. Srivastava, and D. Sylvester, “A New Algorithm for Improved VDD Assignment in Low Power Dual VDD Systems,” in Proceedings of the International Symposium on Low Power Design, pp. 200-205, 2004.
5. A. Srivastava, D. Sylvester, and D. Blaauw, “Concurrent Sizing, Vdd and Vth Assignment for Low-Power Design,” in Proceedings of the Design, Automation and Test in Europe Conference, pp. 107-118, 2004.
6. K. Kim, Ultra Low Power CMOS Design. PhD thesis, Auburn University, ECE Dept., Auburn, AL, May 2011.

49 References
7. K. Kim and V. D. Agrawal, “Dual Voltage Design for Minimum Energy Using Gate Slack,” in Proceedings of the IEEE International Conference on Industrial Technology, pp. 419-424, Mar. 2011.
8. K. Usami and M. Horowitz, “Clustered Voltage Scaling Technique for Low-Power Design,” in Proceedings of the International Symposium on Low Power Design, pp. 23-26, 1995.
9. K. Usami, M. Igarashi, F. Minami, T. Ishikawa, M. Kanzawa, M. Ichida, and K. Nogami, “Automated Low-Power Technique Exploiting Multiple Supply Voltages Applied to a Media Processor,” IEEE Journal of Solid-State Circuits, vol. 33, no. 3, pp. 463-472, Mar. 1998.
10. V. Sundararajan and K. K. Parhi, “Synthesis of Low Power CMOS VLSI Circuits Using Dual Supply Voltages,” in Proceedings of the 36th Annual Design Automation Conference, pp. 72-75, 1999.
11. M. Allani and V. D. Agrawal, “Level-Converter Free Dual-Voltage Design of Energy Efficient Circuits Using Gate Slack,” submitted to the Design, Automation and Test in Europe Conference, March 12-16, 2012.

50 Thank you.

