Published by Cole Burris. Modified over 2 years ago.

1
Power Consumption by Integrated Circuits
Lin Zhong, ELEC518, Spring 2011

2
Power consumption of processing
Dynamic power

3
Busy power vs. delay vs. energy
Source: Analysis and Design of Digital ICs, Hodges et al.

4
Core 2 Duo for example
Intel® Core™2 Duo processor
– T7800 at 2.6GHz
– T7700 at 2.4GHz available on Thinkpad T61p
– 0.75-1.35V, 35 Watts
Intel® Core™2 Duo Low Voltage
– L7500 at 1.6GHz available on Thinkpad X61
– 0.75-1.3V, 17 Watts
Intel® Core™2 Duo Ultra Low Voltage
– U7500 at 1.06GHz available on Dell D430
– 0.75-0.975V, 10 Watts

5
Switching energy: $e = \frac{1}{2} \cdot C \cdot V^2$
Switching power: $P = b \cdot C \cdot V^2 = a \cdot C \cdot V^2 \cdot f$
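To make the switching-power formula concrete, here is a minimal Python sketch. The function name is hypothetical, `a` is read here as an activity factor (the slide does not define it), and the numeric values are illustrative assumptions rather than figures from the slides.

```python
def switching_power(a, c, v, f):
    """Dynamic (switching) power P = a * C * V^2 * f, in watts.

    a: assumed activity factor (dimensionless)
    c: switched capacitance in farads
    v: supply voltage in volts
    f: clock frequency in hertz
    """
    return a * c * v ** 2 * f


# Hypothetical operating points, chosen only for illustration.
p_full = switching_power(a=0.1, c=1e-9, v=1.0, f=1e9)
p_low_v = switching_power(a=0.1, c=1e-9, v=0.5, f=1e9)

# Halving V at the same f cuts switching power by a factor of four.
print(p_full, p_low_v, p_full / p_low_v)
```

The quadratic dependence on V is why the later slides focus on lowering the supply voltage rather than only the clock.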

6
Higher integration
Selling the chipset (or solution or platform)
– Intel Centrino: Centrino Duo includes Core 2 Duo processor, 9XX Express-series chipset, and Wi-Fi adapter
– TI TCS2600 chipset

7
System-on-a-chip (SoC)
TI OMAP

8
SiP: Multiple-chip product (MCP)
Siemens SX66 PDA Phone
Audiovox PPC6601KIT
32MB, 400MHz
Source: Intel.com

9
SiP: Stacked-die approach
Qualcomm 3G CDMA2000 chip
Seven power regimes
100 clock regimes
ISSCC 2004

10
Moore’s Law
Figure labels: known, exciting, unknown

11
MOSFET at nanoscale
Sunlin Chou, “Extending Moore’s Law in the Nanotechnology Era” (www.intel.com).

12
Given workload L and deadline T
L measured by # of CPU cycles
Clock speed: $f \ge L/T$
Time to finish: $t = L/f$
Energy to finish: $P \cdot t = a \cdot C \cdot V^2 \cdot f \cdot t = a \cdot C \cdot V^2 \cdot L$

13
Effect of lower clock speed (f)
Power consumption: $P = a \cdot C \cdot V^2 \cdot f$
Energy consumption: $E = P \cdot t = a \cdot C \cdot V^2 \cdot f \cdot t = a \cdot C \cdot V^2 \cdot L$
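The point of this slide is that, at a fixed supply voltage, slowing the clock lowers power but not energy, because the run takes proportionally longer. A small sketch with hypothetical values (the function name and numbers are assumptions for illustration):

```python
def energy_fixed_voltage(a, c, v, f, cycles):
    """Energy = P * t, with P = a*C*V^2*f and t = cycles / f."""
    power = a * c * v ** 2 * f
    time_to_finish = cycles / f
    return power * time_to_finish


# Same workload (1e9 cycles) and same voltage, two clock speeds.
fast = energy_fixed_voltage(a=0.1, c=1e-9, v=1.0, f=2e9, cycles=1e9)
slow = energy_fixed_voltage(a=0.1, c=1e-9, v=1.0, f=1e9, cycles=1e9)

# Both come out to a*C*V^2*L: the frequency cancels.
print(fast, slow)
```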

14
Effect of lower supply voltage (V)
Power consumption: $P = a \cdot C \cdot V^2 \cdot f = k \cdot V^3 = x \cdot f^3$
Energy consumption: $E = P \cdot t = a \cdot C \cdot V^2 \cdot f \cdot t = a \cdot C \cdot V^2 \cdot L$
Maximum clock speed: $f = b \cdot V$
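The constants k and x are not spelled out on the slide. Substituting the maximum-clock relation $f = b \cdot V$ into the power formula gives one consistent reading, sketched below; the closed forms for k and x are derived here under that substitution and are not stated in the original.

```latex
\begin{align*}
P &= a\,C\,V^2 f, \qquad f = b\,V \\
P &= a\,C\,b\,V^3 = k\,V^3, & k &= a\,C\,b \\
P &= \frac{a\,C}{b^2}\,f^3 = x\,f^3, & x &= \frac{a\,C}{b^2} \\
E &= a\,C\,V^2 L = \frac{a\,C}{b^2}\,f^2 L = x\,f^2 L
\end{align*}
```

Under this reading, energy for a fixed workload grows with the square of the clock speed once voltage is scaled along with frequency.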

15
Given workload L and deadline T: single processor
The processor can run at any frequency (voltage)
– $f = b \cdot V$
The processor can be completely off when work is done (zero power when idle)
To minimize energy consumption, at which frequency should the processor run?
– $f \ge L/T$ (in order to meet the deadline)
– $E = P \cdot t = a \cdot C \cdot V^2 \cdot f \cdot t = a \cdot C \cdot V^2 \cdot L$
– $f = ????$

16
Figure (f vs. time, deadline T): $f_1 = L/T$; $f_2 = L/(T/2) = 2 f_1$

17
Figure (P vs. time, deadline T): $P_1 = x \cdot f^3$; $P_2 = 2^3 P_1$
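The two schedules in these figures can be compared numerically. This is a minimal sketch assuming zero idle power and $P = x \cdot f^3$; the constant `X` and the workload numbers are hypothetical.

```python
def schedule_energy(x, f, cycles):
    """Energy to run `cycles` at frequency f when P = x * f^3 and idle power is zero."""
    busy_time = cycles / f
    return x * f ** 3 * busy_time


X = 1e-27   # hypothetical dynamic-power constant
L = 1e9     # workload in CPU cycles
T = 1.0     # deadline in seconds

f1 = L / T          # just meet the deadline
f2 = L / (T / 2)    # run twice as fast, finish at T/2

e1 = schedule_energy(X, f1, L)
e2 = schedule_energy(X, f2, L)

# Power is 8x higher at f2 but the run is half as long, so energy is 4x.
print(e1, e2, e2 / e1)
```

With no static power, running as slowly as the deadline allows is the cheaper choice.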

18
Given workload L and deadline T: M processors
The workload can be divided without overhead: $L = L_1 + L_2 + \dots + L_M$ ($L \ge L_i \ge 0$)
To minimize energy consumption, at which frequency should processor i run?
– $f_i = L_i / T$ and $V = u \cdot L_i$
– $E_i = a \cdot C \cdot V^2 \cdot L_i = w \cdot L_i^3$

19
Given workload L and deadline T: M processors
The workload can be divided without overhead: $L = L_1 + L_2 + \dots + L_M$ ($L \ge L_i \ge 0$)
To minimize the TOTAL energy consumption, how should the workload be allocated?
– $E = E_1 + E_2 + \dots + E_M = w \cdot L_1^3 + w \cdot L_2^3 + \dots + w \cdot L_M^3 = w (L_1^3 + L_2^3 + \dots + L_M^3)$

20
From high school
$[(a+b)/2]^2 \le (a^2 + b^2)/2$
Quadratic mean ≥ arithmetic mean ≥ geometric mean ≥ harmonic mean

21
From high school (Contd.)
$[(a+b)/2]^3 \le (a^3 + b^3)/2$ (for $a, b \ge 0$)
– $E = w (L_1^3 + L_2^3 + \dots + L_M^3)$ ??? $(L_1 + L_2 + \dots + L_M)^3$

22
From college: Convex (Concave)
By definition of “convex”

23
Jensen’s Inequality (finite form)
$\phi(x)$ is convex
– $\phi(t \cdot x_1 + (1-t) \cdot x_2) \le t \cdot \phi(x_1) + (1-t) \cdot \phi(x_2)$
http://en.wikipedia.org/wiki/Jensen%27s_inequality#Proof_1_.28finite_form.29
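A quick numerical spot check of the two-point inequality for $\phi(x) = x^3$ on non-negative inputs. This is only a sanity check on a few arbitrary sample points, not a proof; the helper names are hypothetical.

```python
def jensen_gap(phi, x1, x2, t):
    """Return t*phi(x1) + (1-t)*phi(x2) - phi(t*x1 + (1-t)*x2).

    For a convex phi and 0 <= t <= 1 this gap is non-negative.
    """
    return t * phi(x1) + (1 - t) * phi(x2) - phi(t * x1 + (1 - t) * x2)


def cube(x):
    return x ** 3


# Arbitrary non-negative sample points and weights.
points = [0.0, 0.5, 1.0, 2.0, 5.0]
weights = [0.0, 0.25, 0.5, 0.75, 1.0]
gaps = [jensen_gap(cube, x1, x2, t)
        for x1 in points for x2 in points for t in weights]

# Allow a tiny tolerance for floating-point rounding.
print(min(gaps) >= -1e-12)
```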

24
$a_i = 1/n$
$\phi(x) = x^2$ (convex)
$\phi(x) = x^3$ (convex for $x \ge 0$)
– $E = w (L_1^3 + L_2^3 + \dots + L_M^3) = w \cdot M \cdot (L_1^3 + L_2^3 + \dots + L_M^3)/M$
– $\ge w \cdot M \cdot [(L_1 + L_2 + \dots + L_M)/M]^3 = w \cdot L^3 / M^2$
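The bound above says an even split reaches the minimum $w \cdot L^3 / M^2$. The sketch below compares a few allocations of the same total workload under the slide's model $E_i = w \cdot L_i^3$; the workload, processor count, and `w = 1` are hypothetical values chosen so the arithmetic is easy to follow.

```python
def total_energy(loads, w=1.0):
    """Total energy E = w * sum(L_i^3) for per-processor workloads L_i."""
    return w * sum(load ** 3 for load in loads)


def equal_split(total, m):
    """Divide a workload evenly across m processors."""
    return [total / m] * m


L_TOTAL = 12.0   # hypothetical workload (arbitrary units)
M = 4            # hypothetical processor count

balanced = total_energy(equal_split(L_TOTAL, M))
skewed = total_energy([6.0, 3.0, 2.0, 1.0])     # same total, uneven split
single = total_energy([12.0, 0.0, 0.0, 0.0])    # everything on one processor
lower_bound = L_TOTAL ** 3 / M ** 2             # w * L^3 / M^2 with w = 1

print(balanced, skewed, single, lower_bound)
```

The balanced split matches the lower bound, and any imbalance costs extra energy because the cubic term penalizes the busiest processor most.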

25
More about Convexity
Figure axes: cost vs. return

Example | Cost | Return
Workload distribution | Energy | Workload finished within T
Eating | Price of apples | Pleasure from eating apples
Helicopter engine | Price of engine | Engine thrust
Law of diminishing marginal returns | Cost of production | Increase in production

26
More about Convexity
Greedy optimization works
Combine simpler/cheaper components
Figure axes: cost vs. return

27
Check the assumptions
Power consumption is zero when the processor is not active

28
Idle power (Static power)
When the IC is idle but not powered off, e.g. SRAM

29
Leakage power

30
Scaling down

31
Scaling down (Contd.)
– Thermodynamics: gas. Uniform (central limit theorem)
– Quantum dynamics: individual molecules. High variation and likely defective

32
Scaling: Not that simple (Contd.)
Tunneling effect

33
Figure (f vs. time, deadline T): $f_1 = L/T$; $f_2 = L/(T/2) = 2 f_1$

34
Figure (P vs. time, deadline T): $P_1 = x \cdot f^3$

35
Figure (P vs. time, deadline T): $P_1 = x \cdot f^3 + P_{static}$

36
Figure (P vs. time, deadline T): $P_1 = x \cdot f^3 + P_{static}$; $P_2 = 2^3 x \cdot f^3 + P_{static}$
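Once static power is added, the faster schedule can win because it pays $P_{static}$ for only half the time. The sketch below compares the two schedules from the figure, assuming the chip is powered off for free when the work finishes; the normalized constants are hypothetical.

```python
def energy_until_off(x, f, cycles, p_static):
    """Energy when the chip draws x*f^3 + p_static while busy and is off afterwards."""
    busy_time = cycles / f
    return (x * f ** 3 + p_static) * busy_time


X = 1.0      # hypothetical dynamic-power constant
L = 1.0      # workload in cycles (normalized)
T = 1.0      # deadline (normalized)
f1 = L / T

for p_static in (1.0, 6.0, 12.0):
    slow = energy_until_off(X, f1, L, p_static)        # run at f1 for all of T
    fast = energy_until_off(X, 2 * f1, L, p_static)    # run at 2*f1 for T/2, then off
    print(p_static, slow, fast)
```

In this model the two schedules break even at $P_{static} = 6 \cdot x \cdot f_1^3$ (a value derived here, not taken from the slides): below it the slow schedule uses less energy, above it the fast one does.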

37
Why is static power important? ITRS, 2009

38
Pentium II (Klamath) and III (Coppermine)
Pentium II (Klamath): 7.5M transistors
Pentium III (Coppermine): 28M transistors

39
Core 2 Duo (Conroe)
64KB L1 cache, 4MB L2 cache, 291M transistors
Figure labels: Core 1, Core 2

40
Solutions to “never-enough” challenge
234M transistors
– 24M go to L2 cache
– 8 SPE, each 20.9M transistors (167M transistors)
– Each has 4 64KB SRAM (12M transistors)
– SRAM takes 122M transistors (>50%)

41
Multiple power/clock domains
TI OMAP 2 architecture, ISSCC 2005
Multimedia phone: NTT DoCoMo 3G FOMA 902i to be released with OMAP2420

42
Given workload L and deadline T: single processor
One processor can run at any frequency (voltage)
– $f = b \cdot V$
The processor can be completely off when work is done (zero power when idle)
Given $P_{static}$
– Given energy overhead of shutting down the processor ($E_{overhead}$)
To minimize energy consumption, at which frequency should the processor run?
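One way to explore this question is to write down an explicit energy model and minimize it. The sketch below is an assumed model, not the lecture's answer: busy power is $x \cdot f^3 + P_{static}$, the processor is switched off as soon as the work finishes, and $E_{overhead}$ is paid only if it finishes before the deadline. The function and parameter names are hypothetical.

```python
def best_frequency(x, p_static, cycles, deadline, e_overhead):
    """Pick the energy-minimizing frequency under the assumed model.

    Assumptions (not from the slides): busy power is x*f^3 + p_static,
    the processor is powered off as soon as the work finishes, and
    e_overhead is paid only when it finishes before the deadline.
    """
    f_min = cycles / deadline   # slowest speed that still meets the deadline

    def busy_energy(f):
        return (x * f ** 3 + p_static) * cycles / f

    # Option A: run at f_min, finish exactly at the deadline, no shutdown.
    best_f, best_e = f_min, busy_energy(f_min)

    # Option B: run faster and shut down early. busy_energy(f) equals
    # x*f^2*L + p_static*L/f, which is minimized at f = (p_static / (2x))^(1/3).
    f_crit = (p_static / (2 * x)) ** (1 / 3)
    if f_crit > f_min:
        e_fast = busy_energy(f_crit) + e_overhead
        if e_fast < best_e:
            best_f, best_e = f_crit, e_fast
    return best_f, best_e


# Normalized, hypothetical numbers: cheap shutdown vs. expensive shutdown.
cheap = best_frequency(x=1.0, p_static=16.0, cycles=1.0, deadline=1.0, e_overhead=0.0)
costly = best_frequency(x=1.0, p_static=16.0, cycles=1.0, deadline=1.0, e_overhead=10.0)
print(cheap, costly)
```

With a free shutdown the model prefers running at twice the minimum speed. With a large enough shutdown overhead it falls back to just meeting the deadline.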

43
Figure (P vs. time, deadline T): $P_1 = x \cdot f^3 + P_{static}$; $P_2 = 2^3 x \cdot f^3 + P_{static}$

44
Why is there overhead to power off circuit?

45
Clock generator
Resonant circuit + amplifier
Resonant circuit (Oscillator)
– Crystal oscillator (>2×10^9 per year)
  ~10KHz to ~10MHz
  Quartz, ceramics (low cost, low accuracy), surface acoustic wave (SAW) quartz crystal (expensive, accurate)
  Real-time clocks
  – 32.768KHz (2^15 Hz), 4.194304MHz (2^22 Hz)
  Application-specific
  – 4.9152MHz (4 x 1.2288MHz, CDMA baseband frequency)……
Figure labels: Res, A
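The real-time-clock frequencies are exact powers of two in hertz, so a chain of divide-by-two stages brings them down to a 1 Hz tick. A small sketch of that arithmetic (the helper name is hypothetical):

```python
def divide_by_two_stages(crystal_hz, target_hz=1):
    """Count divide-by-two stages needed to bring crystal_hz down to target_hz.

    Raises ValueError if the ratio is not an exact power of two.
    """
    ratio, remainder = divmod(crystal_hz, target_hz)
    if remainder or ratio < 1 or ratio & (ratio - 1):
        raise ValueError("ratio is not a power of two")
    return ratio.bit_length() - 1


rtc_stages = divide_by_two_stages(32_768)          # 32.768 kHz crystal
fast_rtc_stages = divide_by_two_stages(4_194_304)  # 4.194304 MHz crystal
cdma_hz = 4 * 1_228_800                            # 4 x 1.2288 MHz, in Hz

print(rtc_stages, fast_rtc_stages, cdma_hz)
```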

46
Oscillator (Contd.)
LC/RLC circuit
Ring oscillator
– Application other than oscillator?
Voltage-controlled oscillator (VCO)
– Varicap: variable capacitance diode (tuning diode)
– Phase-locked loop for high-speed clock (next slide)
– Frequency scaling of IC for energy saving

47
Phase-locked loop (PLL)
High-speed clock from a master oscillator
Digital PLL
Clock generation, recovery, synchronization
– Digital computing, RF communication
Figure blocks: phase-frequency detector, master oscillator, VCO, frequency divider (N), voltage
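For a PLL with a divide-by-N block in the feedback path, the loop locks when the divided VCO frequency matches the master oscillator, so the VCO runs at N times the master frequency. The sketch below captures only that steady-state relation, not the loop dynamics; the function name and example numbers are hypothetical.

```python
def pll_output_hz(master_hz, n):
    """Locked output of an integer-N PLL: the VCO runs at n * master_hz.

    In lock, the divided VCO frequency (vco / n) matches the master oscillator.
    """
    if n < 1:
        raise ValueError("divider ratio must be a positive integer")
    return master_hz * n


# Hypothetical example: 10 MHz master oscillator, divide-by-100 feedback path.
vco_hz = pll_output_hz(10_000_000, 100)
print(vco_hz, vco_hz // 100)
```

Changing N is one way to retune the high-speed clock from a fixed master oscillator, which ties in with the frequency-scaling use mentioned on the previous slide.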

48
Given workload L and deadline T: single processor
The processor can run at any frequency (voltage)
– $f = b \cdot V$
The processor can be completely off when work is done (zero power when idle)
To minimize energy consumption, at which frequency should the processor run?
– $f \ge L/T$ (in order to meet the deadline)
– $E = P \cdot t = a \cdot C \cdot V^2 \cdot f \cdot t = a \cdot C \cdot V^2 \cdot L$
– $f = ????$

49
Threshold voltage

50
Vdd scales slowly & Vth scales more slowly
Vth is limited by the thermal voltage
Vdd needs to stay considerably higher than Vth to curb leakage current
Ends up destroying the scaling rules
– low channel mobility
Plummer and Griffin, 2001 (Data from ITRS/NTRS)

51
Check the assumptions (Contd.)
The workload can be divided without overhead: $L = L_1 + L_2 + \dots + L_M$ ($L \ge L_i \ge 0$)
Communication cost between processors!!!

52
Quadrotor vs. Helicopter

53
De Bothezat Quadrotor, 1923.

54
Quadrotor vs. Helicopter A.R. Drone, 2010

55
Wire power consumption

56
Wire power consumption

57
Inter-processor communication
