CSV881: Low-Power Design Multicore Design for Low Power


1 CSV881: Low-Power Design Multicore Design for Low Power
Vishwani D. Agrawal, James J. Danaher Professor, Dept. of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849. Copyright Agrawal, 2011. Lecture 13: Multicore Design

2 Low-Power Datapath Architecture
- Lower the supply voltage.
- This slows down the circuit.
- Use parallel computing to gain the speed back.
- Works well when the threshold voltage is also lowered.
- About 60% reduction in power is obtainable.

Reference: A. P. Chandrakasan and R. W. Brodersen, Low Power Digital CMOS Design, Boston: Kluwer Academic Publishers (now Springer), 1995.

3 A Reference Datapath
[Figure: input register, combinational logic, and output register, clocked by CK; total switched capacitance Cref.]
- Supply voltage = $V_{ref}$
- Total capacitance switched per cycle = $C_{ref}$
- Clock frequency = $f$
- Power consumption: $P_{ref} = C_{ref} V_{ref}^2 f$

4 A Parallel Architecture
- Supply voltage: $V_N \le V_1 = V_{ref}$
- $N$ = degree of parallelism
- Each copy processes every $N$th input and operates at reduced voltage.

[Figure: the input feeds N copies of register plus combinational logic (Copy 1 to Copy N), each clocked at f/N; an N-to-1 multiplexer and the output register run at f; a multiphase clock generator and mux control are driven by CK.]

5 Level Converter: L to H
- Transistors with thicker oxide and longer channels

[Figure: low-to-high level converter circuit with input Vin_L, output Vout_H, and supplies VDDL and VDDH.]

N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Section , Addison-Wesley, 2005.

6 Level Converter: H to L
- Transistors with thicker oxide and longer channels

[Figure: high-to-low level converter circuit with input Vin_H, output Vout_L, and supply VDDL.]

N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Section , Addison-Wesley, 2005.

7 Control Signals, N = 4
[Figure: timing diagram for N = 4 showing the clock CK and the four control signals Phase 1 to Phase 4.]

8 Power
$$P_N = P_{proc} + P_{overhead}$$
$$P_{proc} = N (C_{inreg} + C_{comb}) V_N^2 \frac{f}{N} = (C_{inreg} + C_{comb}) V_N^2 f = C_{ref} V_N^2 f$$
$$P_{overhead} = C_{overhead} V_N^2 f \approx \delta C_{ref} (N - 1) V_N^2 f$$
$$P_N = [1 + \delta (N - 1)] C_{ref} V_N^2 f$$
$$\frac{P_N}{P_1} = [1 + \delta (N - 1)] \frac{V_N^2}{V_{ref}^2}$$
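The final ratio on this slide is easy to evaluate numerically. The following is a minimal sketch, not part of the lecture: the function name `power_ratio` and the overhead fraction 0.1 used in the example are assumptions chosen for illustration, while 2.9 V and 5 V are the V2 and Vref values quoted on the voltage-versus-speed slide.

```python
def power_ratio(n, delta, v_n, v_ref):
    """Return P_N / P_1 = [1 + delta*(n - 1)] * (v_n / v_ref)**2."""
    if n < 1:
        raise ValueError("n must be at least 1")
    overhead_factor = 1.0 + delta * (n - 1)
    return overhead_factor * (v_n / v_ref) ** 2


# Example: two copies at 2.9 V against a 5 V reference,
# with an assumed (hypothetical) overhead fraction of 0.1.
ratio = power_ratio(n=2, delta=0.1, v_n=2.9, v_ref=5.0)
print(f"P_2 / P_1 = {ratio:.3f}")
```

With `n=1` and `v_n == v_ref` the function returns 1, which matches the reference datapath.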

9 Voltage vs. Speed
Delay of a gate:
$$T \approx \frac{C_L V_{ref}}{I} = \frac{C_L V_{ref}}{k (W/L) (V_{ref} - V_t)^2}$$
where
- $I$ is the saturation current
- $k$ is a technology parameter
- $W/L$ is the width-to-length ratio of the transistor
- $V_t$ is the threshold voltage

[Figure: normalized gate delay T (0.0 to 4.0) versus supply voltage for 1.2μ CMOS, with operating points N = 1 at Vref = 5V, N = 2 at V2 = 2.9V, and N = 3 at V3. Voltage reduction slows down as we get closer to Vt.]
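To see how quickly delay grows near the threshold, the delay expression can be evaluated as a ratio, in which $C_L$, $k$, and $W/L$ cancel. This is a rough sketch under the simple model above; the function name `normalized_delay` is hypothetical, and Vt = 0.8 V is an assumption borrowed from one of the curves on the following slide. The values it produces need not coincide with the points read off the slide's plot.

```python
def normalized_delay(v, v_ref, v_t):
    """Gate delay at supply v relative to the delay at v_ref.

    Uses T proportional to V / (V - Vt)**2; C_L, k and W/L cancel in the ratio.
    """
    if v <= v_t or v_ref <= v_t:
        raise ValueError("supply voltages must exceed the threshold voltage")
    delay = v / (v - v_t) ** 2
    delay_ref = v_ref / (v_ref - v_t) ** 2
    return delay / delay_ref


# Sweep the supply from 5 V down toward an assumed Vt of 0.8 V.
for v in (5.0, 4.0, 3.0, 2.0, 1.5, 1.0):
    print(f"V = {v:.1f} V  ->  T/T_ref = {normalized_delay(v, 5.0, 0.8):.2f}")
```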

10 Increasing Multiprocessing
[Figure: power ratio PN/P1 (0.0 to 1.0) versus N for 1.2μ CMOS with Vref = 5V, plotted for Vt = 0.8V, Vt = 0.4V, and Vt = 0V (extreme case).]

11 Extreme Cases: Vt = 0
- Delay, $T \propto 1/V_{ref}$
- For $N$ processing elements, delay = $NT$, so $V_N = V_{ref}/N$

$$\frac{P_N}{P_1} = [1 + \delta (N - 1)] \frac{1}{N^2} \to 1/N$$

- For negligible overhead, $\delta \to 0$:

$$\frac{P_N}{P_1} \approx \frac{1}{N^2}$$

- For $V_t > 0$, power reduction is less and there will be an optimum value of $N$.
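The claim that an optimum N appears once Vt > 0 can be explored numerically. The sketch below is an illustration, not the lecture's method: it assumes the gate-delay model $T \propto V/(V - V_t)^2$ from the voltage-versus-speed slide, solves it for the supply at which delay is N times the reference delay, and scans N. The function names, the search limit `n_max`, and the example values (overhead fraction 0.1, Vt = 0.8 V, Vref = 5 V) are all assumptions.

```python
import math


def supply_for_n(n, v_ref, v_t):
    """Supply V_N at which gate delay is n times the delay at v_ref.

    Solves V / (V - Vt)^2 = K with K = n * v_ref / (v_ref - Vt)^2, i.e.
    K*V^2 - (2*K*Vt + 1)*V + K*Vt^2 = 0, and returns the root above Vt.
    """
    k = n * v_ref / (v_ref - v_t) ** 2
    return ((2 * k * v_t + 1) + math.sqrt(4 * k * v_t + 1)) / (2 * k)


def power_ratio(n, delta, v_ref, v_t):
    """P_N / P_1 = [1 + delta*(n - 1)] * (V_N / v_ref)^2."""
    v_n = supply_for_n(n, v_ref, v_t)
    return (1 + delta * (n - 1)) * (v_n / v_ref) ** 2


def best_n(delta, v_ref, v_t, n_max=64):
    """Return the N in 1..n_max with the smallest power ratio."""
    return min(range(1, n_max + 1),
               key=lambda n: power_ratio(n, delta, v_ref, v_t))


n_opt = best_n(delta=0.1, v_ref=5.0, v_t=0.8)
print("optimum N:", n_opt)
print("P_N / P_1 at optimum:", round(power_ratio(n_opt, 0.1, 5.0, 0.8), 3))
```

With `v_t=0` the solver reduces to $V_N = V_{ref}/N$, and with `delta=0` the ratio reduces to $1/N^2$, in agreement with the extreme case on this slide.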

12 Example: Multiplier Core
Specification:
- 200MHz clock
- 15W at 5V
- Low voltage operation, $V_{DD} \ge 1.5$ volts
- Relative clock rate $= \frac{(V_{DD} - 0.5)^2}{20.25}$

Problem:
- Integrate multiplier core on an SoC
- Power budget for multiplier ~ 5W

13 A Multicore Design
[Figure: the input feeds five multiplier cores (Multiplier Core 1 to Multiplier Core 5), each with a register clocked at 40MHz; a 5-to-1 mux and the output register run at 200MHz; a multiphase clock generator and mux control are driven by the 200MHz clock CK.]

Core clock frequency = 200/N MHz; N should divide 200.

14 How Many Cores?
For N cores:
- Clock frequency = 200/N MHz
- Supply voltage, $V_{DDN} = 0.5 + (20.25/N)^{1/2}$ volts, obtained by setting the relative clock rate $(V_{DD} - 0.5)^2 / 20.25$ equal to $1/N$
- Assuming 10% overhead per core, power dissipation $= 15\,[1 + 0.1 (N - 1)]\,(V_{DDN}/5)^2$ watts

15 Design Tradeoffs

| Number of cores, N | Clock (MHz) | Core supply VDDN (Volts) | Total Power (Watts) |
|---|---|---|---|
| 1 | 200 | 5.00 | 15.0 |
| 2 | 100 | 3.68 | 8.94 |
| 4 | 50 | 2.75 | 5.90 |
| 5 | 40 | 2.51 | 5.29 |
| 8 | 25 | 2.10 | 4.50 |
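The table can be regenerated from the multiplier specification. The sketch below is illustrative only: the constant and function names are hypothetical, and treating the budget as a hard 5.0 W limit is an assumption, since the slides state the budget only approximately (~ 5W) and the block diagram shows five cores. At full precision the N = 5 and N = 8 powers come out slightly different from the table, which appears to use supply voltages rounded to two decimals.

```python
import math

REF_POWER_W = 15.0   # single-core power at 5 V, from the specification
REF_VDD = 5.0        # reference supply in volts
OVERHEAD = 0.10      # 10% overhead per additional core
BUDGET_W = 5.0       # assumed hard limit; the slide says "~ 5W"
MIN_VDD = 1.5        # lowest supply allowed by the specification


def core_supply(n):
    """Supply at which the relative clock rate (VDD - 0.5)^2 / 20.25 equals 1/n."""
    return 0.5 + math.sqrt(20.25 / n)


def total_power(n):
    """Total power in watts for n cores, including the per-core overhead."""
    return REF_POWER_W * (1 + OVERHEAD * (n - 1)) * (core_supply(n) / REF_VDD) ** 2


def smallest_n_within_budget():
    """Smallest divisor of 200 meeting both the budget and the minimum supply."""
    for n in range(1, 201):
        if 200 % n == 0 and core_supply(n) >= MIN_VDD and total_power(n) <= BUDGET_W:
            return n
    return None


for n in (1, 2, 4, 5, 8):
    print(f"N={n}  clock={200 // n} MHz  "
          f"VDD={core_supply(n):.2f} V  P={total_power(n):.2f} W")
print("smallest N within budget:", smallest_n_within_budget())
```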

16 Power Reduction in Processors
- Just about everything is used.
- Hardware methods:
  - Voltage reduction for dynamic power
  - Dual-threshold devices for leakage reduction
  - Clock gating, frequency reduction
  - Sleep mode
- Architecture:
  - Instruction set
  - Hardware organization
- Software methods

17 Parallel Architecture
[Figure: a single processor clocked at f, compared with two processors in parallel, each clocked at f/2, sharing the input and output at rate f.]

Single processor:
- Capacitance = $C$
- Voltage = $V$
- Frequency = $f$
- Power = $CV^2f$

Two parallel processors:
- Capacitance = $2.2C$
- Voltage = $0.6V$
- Frequency = $0.5f$
- Power = $0.396\,CV^2f$

18 Pipeline Architecture
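[Figure: a single processor between input and output registers clocked at f, compared with the same processor split into two pipeline stages separated by a register, also clocked at f.]

Single processor:
- Capacitance = $C$
- Voltage = $V$
- Frequency = $f$
- Power = $CV^2f$

Pipelined processor:
- Capacitance = $1.2C$
- Voltage = $0.6V$
- Frequency = $f$
- Power = $0.432\,CV^2f$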
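Both power figures, for the parallel and the pipelined organization, follow directly from $P = CV^2f$ with the scaling factors quoted on the slides. A minimal check, with the helper name `relative_power` chosen here for illustration:

```python
def relative_power(cap_factor, volt_factor, freq_factor):
    """Power relative to the reference C*V^2*f for scaled C, V and f."""
    return cap_factor * volt_factor ** 2 * freq_factor


# Scaling factors taken from the parallel and pipeline slides.
parallel = relative_power(cap_factor=2.2, volt_factor=0.6, freq_factor=0.5)
pipeline = relative_power(cap_factor=1.2, volt_factor=0.6, freq_factor=1.0)

print(f"parallel: {parallel:.3f} x CV^2f")
print(f"pipeline: {pipeline:.3f} x CV^2f")
```

The parallel version pays for its lower power with roughly twice the capacitance, while the pipelined version keeps the full clock rate and adds only register overhead.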

19 Approximate Trend

| | n-parallel proc. | n-stage pipeline proc. |
|---|---|---|
| Capacitance | $nC$ | $C$ |
| Voltage | $V/n$ | $V/n$ |
| Frequency | $f/n$ | $f$ |
| Power | $CV^2f/n^2$ | $CV^2f/n^2$ |
| Chip area | n times | 10-20% increase |

G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Springer, 1998.
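The power entry can be checked by multiplying out the capacitance, voltage, and frequency entries with $P = CV^2f$. This short derivation assumes the voltage scales to $V/n$ in both organizations, which is the reading under which both columns give the same power:

```latex
\begin{aligned}
P_{\text{parallel}} &= (nC)\left(\frac{V}{n}\right)^{2}\frac{f}{n} = \frac{CV^{2}f}{n^{2}},\\
P_{\text{pipeline}} &= C\left(\frac{V}{n}\right)^{2} f = \frac{CV^{2}f}{n^{2}}.
\end{aligned}
```

The two approaches reach the same idealized power; they differ in chip area, as the last row of the trend shows.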

20 Multicore Processors
[Figure: performance of multicore processors relative to single core, based on SPECint2000 and SPECfp2000 benchmarks. Source: Computer, May 2005, p. 12.]

21 Multicore Processors
- D. Geer, “Chip Makers Turn to Multicore Processors,” Computer, vol. 38, no. 5, pp , May 2005.
- A. Jerraya, H. Tenhunen and W. Wolf, “Multiprocessor Systems-on-Chips,” Computer, vol. 38, no. 7, pp , July 2005; this special issue contains three more articles on multicore processors.
- S. K. Moore, “Winner: Multimedia Monster – Cell’s Nine Processors Make It a Supercomputer on a Chip,” IEEE Spectrum, vol. 43, no. 1, pp , January 2006.

22 Cell - Cell Broadband Engine Architecture
- Nine-processor chip: 192 Gflops

[Photo, left to right: Atsushi Kameyama, Toshiba; James Kahle, IBM; Masakazu Suzuoki, Sony. © IEEE Spectrum, January 2006]

23 Cell’s Nine-Processor Chip
- Eight identical processors
- f = 5.6GHz (max)
- 44.8 Gflops

[Figure © IEEE Spectrum, January 2006]

