© 2000 Morgan Kaufmann. Overheads for Computers as Components. CPUs: CPU performance; CPU power consumption.


2 CPUs
- CPU performance.
- CPU power consumption.

3 Elements of CPU performance
- Cycle time.
- CPU pipeline.
- Memory system.

4 Pipelining
- Several instructions are executed simultaneously at different stages of completion.
- Various conditions can cause pipeline bubbles that reduce utilization:
  - branches;
  - memory system delays;
  - etc.

5 Pipeline structures
- Both ARM and SHARC have 3-stage pipes:
  - fetch instruction from memory;
  - decode opcode and operands;
  - execute.

6 ARM pipeline execution
    time -->
    add r0,r1,#5      fetch    decode   execute
    sub r2,r3,r6               fetch    decode   execute
    cmp r2,#3                           fetch    decode   execute

7 Performance measures
- Latency: time it takes for an instruction to get through the pipeline.
- Throughput: number of instructions executed per time period.
- Pipelining increases throughput without reducing latency.
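The two measures can be worked out for an ideal pipeline with a short sketch. It assumes no stalls and one cycle per stage; the function name and the 10 ns cycle time in the usage note are illustrative, not from the slides.

```python
def pipeline_timing(n_instructions, n_stages, cycle_time_ns):
    """Timing of an ideal (stall-free) pipeline.

    Returns (latency_ns, total_ns, throughput_per_ns).
    """
    # Each instruction still passes through every stage.
    latency_ns = n_stages * cycle_time_ns
    # The first instruction fills the pipe; after that one finishes per cycle.
    total_cycles = n_stages + n_instructions - 1
    total_ns = total_cycles * cycle_time_ns
    throughput = n_instructions / total_ns
    return latency_ns, total_ns, throughput


def unpipelined_total_ns(n_instructions, n_stages, cycle_time_ns):
    """Total time if each instruction had to finish before the next started."""
    return n_instructions * n_stages * cycle_time_ns
```

For three instructions in a 3-stage pipe with a 10 ns cycle, `pipeline_timing(3, 3, 10.0)` gives a latency of 30 ns per instruction and 50 ns in total, against 90 ns from `unpipelined_total_ns(3, 3, 10.0)`: throughput improves while latency stays the same.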

8 Pipeline stalls
- If some step cannot be completed in the same amount of time as the others, the pipeline stalls.
- Bubbles introduced by a stall increase latency and reduce throughput.

9 ARM multi-cycle LDMIA instruction
    time -->
    ldmia r0,{r2,r3}  fetch     decode    ex ld r2  ex ld r3
    sub r2,r3,r6                fetch     decode    (stall)   ex sub
    cmp r2,#3                             fetch     (stall)   decode    ex cmp
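The stall on the LDMIA slide can be reproduced with a small timing model. This is a sketch under stated assumptions, not a description of the real ARM core: an in-order 3-stage pipe, fetch and decode take one cycle each, and only one instruction can occupy the execute stage at a time. `finish_times` is a hypothetical helper name.

```python
def finish_times(execute_cycles, front_end_stages=2):
    """Cycle (1-based) in which each instruction leaves the execute stage.

    execute_cycles[i] is the number of execute cycles instruction i needs.
    Assumes in-order issue and a single execute stage (illustrative model).
    """
    done = []
    previous_done = 0
    for i, ex in enumerate(execute_cycles):
        # Earliest execute cycle if nothing ahead of us were blocking:
        # instruction i is fetched in cycle i + 1, then decoded, then executed.
        earliest_start = i + front_end_stages + 1
        # Must also wait until the previous instruction has left execute.
        start = max(earliest_start, previous_done + 1)
        previous_done = start + ex - 1
        done.append(previous_done)
    return done
```

With one execute cycle each, `finish_times([1, 1, 1])` yields `[3, 4, 5]`. Giving the first instruction two execute cycles, as for `ldmia r0,{r2,r3}`, yields `[4, 5, 6]`: every later instruction finishes one cycle late.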

10 Control stalls
- Branches often introduce stalls (branch penalty).
  - Stall time may depend on whether branch is taken.
- May have to squash instructions that already started executing.
- Don't know what to fetch until condition is evaluated.

11 ARM pipelined branch
    time -->
    bne foo             fetch     decode    ex bne    ex bne    ex bne
    sub r2,r3,r6                  fetch     decode
    foo add r0,r1,r2                                  fetch     decode    ex add
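A rough estimate of what the branch penalty costs on average, as a sketch. It assumes a taken branch occupies the execute stage for three cycles (the three "ex bne" slots on the ARM branch slide) and every other instruction, including an untaken branch, takes one. The function name and the example fractions are illustrative.

```python
def average_cpi(branch_fraction, taken_fraction, taken_cycles=3, base_cycles=1):
    """Average cycles per instruction under a simple branch-penalty model.

    branch_fraction: fraction of executed instructions that are branches.
    taken_fraction:  fraction of those branches that are taken.
    """
    taken = branch_fraction * taken_fraction
    # Taken branches pay taken_cycles; everything else pays base_cycles.
    return (1.0 - taken) * base_cycles + taken * taken_cycles
```

If 20% of instructions are branches and half of them are taken, `average_cpi(0.2, 0.5)` comes to about 1.2 cycles per instruction instead of 1.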

12 Delayed branch
- To increase pipeline efficiency, a delayed branch mechanism requires that the n instructions after a branch always be executed, whether the branch is taken or not.
- SHARC supports delayed and non-delayed branches.
  - Specified by bit in branch instruction.
  - 2-instruction branch delay slot.

13 Example: SHARC code scheduling
    L1=5;
    DM(I0,M1)=R1;
    L8=8;
    DM(I8,M9)=R2;
- CPU cannot use DAG on cycle just after loading DAG's register.
  - CPU performs NOP between register assign and DM.

14 Rescheduled SHARC code
    L1=5;
    L8=8;
    DM(I0,M1)=R1;
    DM(I8,M9)=R2;
- Avoids two NOP cycles.
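The effect of rescheduling can be checked with a small counter. It is a sketch under assumptions: registers numbered 0-7 are taken to belong to one DAG and 8-15 to the other, the stall rule is modelled as exactly one NOP when a DM access uses a DAG whose register was loaded by the immediately preceding instruction, and only the two instruction forms shown on the slides are parsed. `count_nops` and `dag_of` are hypothetical names.

```python
import re


def dag_of(register_index):
    """Assumed mapping: indices 0-7 -> DAG 1, indices 8-15 -> DAG 2."""
    return 1 if register_index < 8 else 2


def count_nops(program):
    """Count NOP cycles inserted under the simplified one-cycle stall rule."""
    nops = 0
    loaded_dag = None  # DAG written by the previous instruction, if any
    for instr in program:
        load = re.fullmatch(r"[LIMB](\d+)=\d+", instr)
        use = re.fullmatch(r"DM\(I(\d+),M(\d+)\)=R\d+", instr)
        if use and loaded_dag == dag_of(int(use.group(1))):
            nops += 1  # DAG used on the cycle right after its register load
        loaded_dag = dag_of(int(load.group(1))) if load else None
    return nops


original = ["L1=5", "DM(I0,M1)=R1", "L8=8", "DM(I8,M9)=R2"]
rescheduled = ["L1=5", "L8=8", "DM(I0,M1)=R1", "DM(I8,M9)=R2"]
```

Under this model `count_nops(original)` is 2 and `count_nops(rescheduled)` is 0, matching the two NOP cycles the rescheduled code avoids.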

15 Example: ARM execution time
- Determine execution time of FIR filter:
    for (i=0; i

16 Superscalar execution
- Superscalar processor can execute several instructions per cycle.
  - Uses multiple pipelined data paths.
- Programs execute faster, but it is harder to determine how much faster.

17 Data dependencies
- Execution time depends on operands, not just opcode.
- Superscalar CPU checks data dependencies dynamically:
    add r2,r0,r1
    add r3,r2,r5
  Data dependency: r0, r1 -> r2; r2, r5 -> r3 (the second add reads the r2 written by the first).
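The dependency check on the two add instructions can be sketched in software. This only illustrates the idea, not how the hardware does it: it assumes three-operand instructions whose first operand is the destination, and the helper names are hypothetical.

```python
def parse(instr):
    """Split 'add r2,r0,r1' into (destination, [source registers]).

    Assumes the first operand is the destination register; immediates
    such as '#5' are ignored as sources.
    """
    _opcode, operands = instr.split(None, 1)
    fields = [field.strip() for field in operands.split(",")]
    return fields[0], [f for f in fields[1:] if f.startswith("r")]


def has_raw_dependency(first, second):
    """True if `second` reads a register that `first` writes."""
    dest, _ = parse(first)
    _, sources = parse(second)
    return dest in sources
```

`has_raw_dependency("add r2,r0,r1", "add r3,r2,r5")` is true, so the pair cannot issue in the same cycle in this model; replacing `r2` in the second instruction with an unrelated register removes the dependency.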

18 Memory system performance
- Caches introduce indeterminacy in execution time.
  - Depends on order of execution.
- Cache miss penalty: added time due to a cache miss.
- Several reasons for a miss: compulsory, conflict, capacity.
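A toy direct-mapped cache shows how the miss count depends on the order of accesses. The cache geometry (4 lines of 16 bytes) and the addresses are made up for illustration; this is a sketch, not a model of any particular CPU.

```python
def count_misses(addresses, n_lines=4, line_size=16):
    """Count misses in a direct-mapped cache for a sequence of byte addresses."""
    lines = [None] * n_lines  # block number currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // line_size
        index = block % n_lines
        if lines[index] != block:
            misses += 1           # compulsory or conflict miss
            lines[index] = block  # replace whatever was there
    return misses


# Addresses 0 and 64 map to the same cache line in this geometry.
grouped = [0, 0, 64, 64]
interleaved = [0, 64, 0, 64]
```

The same four accesses cost 2 misses when grouped and 4 when interleaved, because the two addresses keep evicting each other (conflict misses).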

19 CPU power consumption
- Most modern CPUs are designed with power consumption in mind to some degree.
- Power vs. energy:
  - heat depends on power consumption;
  - battery life depends on energy consumption.

20 CMOS power consumption
- Voltage drops: power consumption proportional to V^2.
- Toggling: more activity means more power.
- Leakage: basic circuit characteristics; can be eliminated by disconnecting power.
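The first two bullets correspond to the usual first-order model of CMOS switching power, P = a * C * V^2 * f. The sketch below uses that model; the activity factor, capacitance, and frequency values are placeholders, and leakage is ignored.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order CMOS switching power in watts (leakage not included).

    activity:      fraction of the capacitance toggled per cycle (0..1)
    capacitance_f: total switched capacitance in farads
    """
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Halving the supply voltage with everything else fixed (illustrative values).
p_full = dynamic_power(0.5, 1e-9, 3.3, 100e6)
p_half = dynamic_power(0.5, 1e-9, 1.65, 100e6)
```

Here `p_half / p_full` is 0.25: half the voltage gives a quarter of the switching power, and doubling `activity` doubles it.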

21 CPU power-saving strategies
- Reduce power supply voltage.
- Run at lower clock frequency.
- Disable function units with control signals when not in use.
- Disconnect parts from power supply when not in use.
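The first two strategies affect power and energy differently. A minimal sketch, assuming the simple switching model P = C * V^2 * f, a fixed number of cycles per task, and no leakage; the capacitance default and the function name are illustrative.

```python
def task_energy_and_time(cycles, voltage_v, frequency_hz, switched_capacitance_f=1e-9):
    """Return (energy_joules, time_seconds) for a task of `cycles` clock cycles."""
    power_w = switched_capacitance_f * voltage_v ** 2 * frequency_hz
    time_s = cycles / frequency_hz
    return power_w * time_s, time_s


base_energy, base_time = task_energy_and_time(1_000_000, 3.3, 100e6)
slow_energy, slow_time = task_energy_and_time(1_000_000, 3.3, 50e6)
lowv_energy, lowv_time = task_energy_and_time(1_000_000, 1.65, 100e6)
```

In this model, halving the clock halves the power but doubles the run time, so the energy for the task is unchanged (`slow_energy` equals `base_energy`). Halving the voltage cuts the energy to a quarter. In practice a lower voltage usually also forces a lower clock frequency, which the sketch does not model.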

22 Power management styles
- Static power management: does not depend on CPU activity.
  - Example: user-activated power-down mode.
- Dynamic power management: based on CPU activity.
  - Example: turning off function units.

23 Application: PowerPC 603 energy features
- Provides doze, nap, sleep modes.
- Dynamic power management features:
  - Uses static logic.
  - Can shut down unused execution units.
  - Cache organized into subarrays to minimize amount of active circuitry.

24 PowerPC 603 activity
- Percentage of time units are idle for SPEC integer/floating-point:
    unit              SPECint92   SPECfp92
    D cache           29%         28%
    I cache           29%         17%
    load/store        35%         17%
    fixed-point       38%         76%
    floating-point    99%         30%
    system register   89%         97%

25 Power-down costs
- Going into a power-down mode costs:
  - time;
  - energy.
- Must determine if going into mode is worthwhile.
- Can model CPU power states with power state machine.
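Whether a power-down mode is worthwhile can be framed as a break-even calculation. The sketch below assumes the transition into and out of the mode is summarised by one total time and one total energy; the parameter names and the numbers in the usage note are illustrative.

```python
def break_even_time(p_on_w, p_down_w, t_transition_s, e_transition_j):
    """Shortest idle period (seconds) for which powering down saves energy.

    Staying on for an idle period T costs       p_on * T.
    Powering down for the same period costs     e_transition + p_down * (T - t_transition).
    """
    if p_down_w >= p_on_w:
        raise ValueError("power-down mode must draw less power than staying on")
    t_equal = (e_transition_j - p_down_w * t_transition_s) / (p_on_w - p_down_w)
    # The idle period must at least cover the transition itself.
    return max(t_equal, t_transition_s)


def worth_powering_down(idle_s, p_on_w, p_down_w, t_transition_s, e_transition_j):
    """True if the expected idle period exceeds the break-even time."""
    return idle_s > break_even_time(p_on_w, p_down_w, t_transition_s, e_transition_j)
```

With 1 W when on, 0 W when down, and a transition costing 0.1 s and 0.5 J, the break-even idle period is 0.5 s: a 1 s idle period justifies powering down, a 0.2 s one does not.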

26 Application: StrongARM SA-1100 power saving
- Processor takes two supplies:
  - VDD is main 3.3V supply.
  - VDDX is 1.5V.
- Three power modes:
  - Run: normal operation.
  - Idle: stops CPU clock, with logic still powered.
  - Sleep: shuts off most of chip activity; 3 steps, each about 30 µs; wakeup takes > 10 ms.

27 SA-1100 power state machine
- States: run (P_run = 400 mW), idle (P_idle = 50 mW), sleep (P_sleep = 0.16 mW).
- Transition times labeled on the diagram: 10 µs, 90 µs, 160 ms, 90 µs.
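A power state machine like this one can be turned into a small energy estimator. The state powers below are the ones on the slide. Which transition time belongs to which edge is an assumption here (run/idle 10 µs each way, entering sleep 90 µs, sleep to run 160 ms), as is charging transitions at run power. `trace_energy_mj` is a hypothetical helper.

```python
POWER_MW = {"run": 400.0, "idle": 50.0, "sleep": 0.16}

# Assumed assignment of the slide's transition times to edges (seconds).
TRANSITION_S = {
    ("run", "idle"): 10e-6,
    ("idle", "run"): 10e-6,
    ("run", "sleep"): 90e-6,
    ("idle", "sleep"): 90e-6,
    ("sleep", "run"): 160e-3,
}


def trace_energy_mj(trace, transition_power_mw=400.0):
    """Energy in millijoules for a list of (state, seconds) pairs.

    Transitions are charged at transition_power_mw (an assumption) for the
    duration given in TRANSITION_S; an edge missing from that table raises KeyError.
    """
    energy_mj = 0.0
    previous = None
    for state, seconds in trace:
        if previous is not None and previous != state:
            energy_mj += transition_power_mw * TRANSITION_S[(previous, state)]
        energy_mj += POWER_MW[state] * seconds  # mW * s = mJ
        previous = state
    return energy_mj
```

For example, one second in run followed by one second in idle costs about 450.004 mJ under these assumptions, against 800 mJ for two seconds in run.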

