Gordon Moore, cofounder of Intel. 1965: 2 x transistors per chip per year. After 1970: 2 x transistors per chip per 1.5 years (Moore's law).

1 Gordon Moore
Gordon Moore, cofounder of Intel
1965: the number of transistors per chip doubles every year
After 1970: the number of transistors per chip doubles every 1.5 years
This observation is known as Moore's law
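
The two doubling rates on this slide can be compared with a small sketch. The function name `growth_factor` and the 3-year horizon are illustrative assumptions, not part of the slides.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Factor by which transistors per chip grow over `years`,
    assuming one doubling every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


# Illustrative 3-year horizon (an assumption, not a figure from the slides).
print(growth_factor(3, 1.0))  # doubling every year
print(growth_factor(3, 1.5))  # doubling every 1.5 years
```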

2 Growth in CPU transistor count

3 Consequences of Moore's law
Cost of a chip remains roughly unchanged as density grows => cost per transistor goes down
Electrical path length is shortened => operating speed increases
Computers become smaller
Power consumption is reduced
More circuitry on each chip => fewer inter-chip connections => more reliable

4 Chap.4 The Role of Performance Jen-Chang Liu, Spring 2005

5 Hardware performance is often key to the effectiveness of an entire system of hardware and software. What do we mean by saying one computer has better performance than another?

6 Example: performance of airplanes

7 Performance of a hardware system
What do we mean by better performance? Higher speed?
Response time (execution time): the time between the start and completion of a task
Throughput: the total amount of work done in a given time, e.g. on a multi-user system

8 Performance measure
Quantitative relation between performance and execution time on machine X:
$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$
* Relative performance:
$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n$, i.e. machine A is n times faster than B
Ex. Machine A runs a program in 10 sec. and machine B runs the same program in 15 sec.:
$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{15}{10} = 1.5$
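
The relative-performance definition on this slide can be checked with a minimal sketch. The function name `relative_performance` is an assumption made for illustration.

```python
def relative_performance(exec_time_a: float, exec_time_b: float) -> float:
    """Return n such that machine A is n times faster than machine B.

    Since performance = 1 / execution time, the ratio
    Performance_A / Performance_B equals exec_time_b / exec_time_a.
    """
    return exec_time_b / exec_time_a


# Slide example: A takes 10 s, B takes 15 s.
n = relative_performance(10.0, 15.0)
print(f"A is {n:.1f} times faster than B")
```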

9 Problems with the previous definition of performance
How is execution time defined?
What if multiple tasks run concurrently?
Which programs should be used to evaluate the performance of a computer?

10 Execution time?
Definition of execution time from the user's point of view: the total time to complete a task (response time, elapsed time)
In a timeshared system, such as Unix, a processor works on several programs
Execution time includes disk access, memory access, I/O, OS overhead ...
[Figure: timeline of Program A, swap, Program B, I/O, Program A, with the response time for A marked]

11 CPU time
CPU execution time: does not include time spent waiting for I/O or running other programs
CPU execution time = user CPU time + system CPU time
User CPU time: CPU time spent in the program
System CPU time: CPU time spent in the OS on behalf of our program

12 Example: CPU time
Unix command: time
Output: 90.7u 12.9s 2:39 65%
90.7u = user CPU time (sec.), 12.9s = system CPU time (sec.), 2:39 = elapsed time (159 sec.)
$\frac{90.7 + 12.9}{159} = 0.65$
In the following discussion, CPU performance refers to user CPU time
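
The 65% figure in the `time` output above can be reproduced with a short sketch. The function name `cpu_utilization` is an assumption; the numbers are the ones on the slide.

```python
def cpu_utilization(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of elapsed time during which the CPU worked for this program."""
    return (user_s + system_s) / elapsed_s


# Slide values: 90.7 s user, 12.9 s system, 2:39 elapsed.
elapsed_s = 2 * 60 + 39  # 2:39 is 159 seconds
print(f"{cpu_utilization(90.7, 12.9, elapsed_s):.0%}")
```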

13 Unit of time
Seconds
Clock cycles
Ex. Clock cycle time = 2 ns => Clock rate = $\frac{1}{2 \times 10^{-9}\,\text{s}} = 500$ MHz
CPU time for a program = CPU clock cycles for a program x Clock cycle time
= Instructions for a program x Average clock cycles per instruction (CPI) x Clock cycle time

14 Example 1
Machines A and B have the same ISA and run the same program (instruction count I)
Machine A: clock cycle time = 1 ns, CPI = 2
Machine B: clock cycle time = 2 ns, CPI = 1.2
CPU time A = Inst. count x CPI x clock cycle time = I x 2 x 1 ns = 2I ns
CPU time B = I x 1.2 x 2 ns = 2.4I ns
$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{2.4I}{2I} = 1.2$
A is 1.2 times faster than B
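
Example 1 applies the formula CPU time = instruction count x CPI x clock cycle time. A minimal sketch follows, in which the function name `cpu_time` and the concrete instruction count of 10^9 are assumptions made for illustration (the slide keeps the count symbolic as I).

```python
def cpu_time(instruction_count: float, cpi: float, clock_cycle_time_s: float) -> float:
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


I = 1e9  # hypothetical instruction count; the ratio below does not depend on it

time_a = cpu_time(I, cpi=2.0, clock_cycle_time_s=1e-9)  # machine A
time_b = cpu_time(I, cpi=1.2, clock_cycle_time_s=2e-9)  # machine B

# Performance_A / Performance_B = time_b / time_a
print(f"A is {time_b / time_a:.1f} times faster than B")
```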

15 Quiz 5/9
Program P runs in 10 s on computer A, which has a 4 GHz clock
We want to run program P in 6 s. We design computer B with a faster clock rate, but it requires 1.2 times as many clock cycles as computer A.
What clock rate should we use in computer B?
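
One way to approach the quiz is to use CPU time = clock cycles / clock rate twice: first to get the cycle count on A, then to solve for the clock rate of B. The sketch below is a possible working, not an answer given on the slides; the function name `required_clock_rate_hz` is an assumption.

```python
def required_clock_rate_hz(time_a_s: float, clock_rate_a_hz: float,
                           cycle_ratio: float, target_time_s: float) -> float:
    """Clock rate B needs to finish in `target_time_s`.

    cycles_a = time_a x clock_rate_a
    cycles_b = cycle_ratio x cycles_a
    clock_rate_b = cycles_b / target_time
    """
    cycles_a = time_a_s * clock_rate_a_hz
    cycles_b = cycle_ratio * cycles_a
    return cycles_b / target_time_s


rate_b = required_clock_rate_hz(10.0, 4e9, 1.2, 6.0)
print(f"Clock rate for B: about {rate_b / 1e9:.1f} GHz")
```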

16 Example 2
A compiler generates 2 different code sequences

Instruction class    A   B   C
CPI                  1   2   3

Instruction counts   A   B   C   Total inst.
Code 1               2   1   2   5
Code 2               4   1   1   6

CPU clock cycles 1 = 2x1 + 1x2 + 2x3 = 10 cycles
CPU clock cycles 2 = 4x1 + 1x2 + 1x3 = 9 cycles
Which is faster? Code 2 is faster, although it executes more instructions
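
The cycle counts in Example 2 are a weighted sum of instruction counts by per-class CPI. A small sketch, with the names `CPI`, `clock_cycles`, and `average_cpi` chosen for illustration:

```python
# CPI per instruction class, as given on the slide.
CPI = {"A": 1, "B": 2, "C": 3}


def clock_cycles(counts: dict) -> int:
    """Total clock cycles = sum over classes of (instruction count x CPI)."""
    return sum(counts[cls] * CPI[cls] for cls in counts)


def average_cpi(counts: dict) -> float:
    """Average CPI = total clock cycles / total instruction count."""
    return clock_cycles(counts) / sum(counts.values())


code1 = {"A": 2, "B": 1, "C": 2}
code2 = {"A": 4, "B": 1, "C": 1}

for name, counts in (("Code 1", code1), ("Code 2", code2)):
    print(name, sum(counts.values()), "instructions,",
          clock_cycles(counts), "cycles, average CPI", average_cpi(counts))
```

Code 2 executes more instructions but needs fewer cycles, so instruction count alone does not determine which sequence is faster.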

17 Short conclusion
Computer performance depends on both software and hardware
Response time: CPU time, plus I/O and other programs
CPU time: instruction count, CPI, clock cycle length
How to optimize them in a hardware design?

18 Problems with the previous definition of performance
How is execution time defined?
What if multiple tasks run concurrently?
Which programs should be used to evaluate the performance of a computer?

19 Choose programs to evaluate performance
Benchmarks: programs chosen to measure performance
SPEC (System Performance Evaluation Cooperative) suite of benchmarks
Started in 1989
http://open.specbench.org/
SPEC95 in the textbook is retired ...
SPECx contains a set of benchmark programs

20 SPEC – money …

21 SPEC95 benchmarks
Integer benchmarks: written in C
Floating-point benchmarks: written in Fortran 77

22 Summarize performance
Which is faster?

                    Computer A   Computer B
Program 1 (sec)              1           10
Program 2 (sec)           1000          100
Total time (sec)          1001          110

$\frac{\text{Performance}_B}{\text{Performance}_A} = \frac{\text{Execution time}_A}{\text{Execution time}_B} = \frac{1001}{110} = 9.1$
* Assume the programs occur with equal probability.
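
The total-time summary on this slide can be sketched as follows. The dictionary layout and variable names are assumptions; the times are those in the table.

```python
# Execution times in seconds for [Program 1, Program 2], from the slide.
times = {"Computer A": [1, 1000], "Computer B": [10, 100]}

# Total execution time, assuming each program is run equally often.
totals = {name: sum(t) for name, t in times.items()}

# Performance_B / Performance_A = total time A / total time B
ratio = totals["Computer A"] / totals["Computer B"]
print(totals)
print(f"B is {ratio:.1f} times faster than A on the combined workload")
```

Taken one program at a time, A is 10 times faster on Program 1 and B is 10 times faster on Program 2, so the conclusion depends on how the programs are weighted.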

23 SPEC ratio
The execution time of a benchmark program is normalized (compared to a baseline system)
$\text{SPEC ratio} = \frac{\text{Exec. time on Sun SPARCstation 10/40}}{\text{Exec. time on the measured machine}}$
SPECint95, SPECfp95
SPECint95 = geometric mean of SPEC ratios
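
A minimal sketch of the SPEC ratio and its geometric mean follows. The function names and all execution times below are hypothetical placeholders for illustration, not measured SPEC95 data.

```python
import math


def spec_ratio(baseline_time_s: float, measured_time_s: float) -> float:
    """Baseline execution time divided by time on the measured machine."""
    return baseline_time_s / measured_time_s


def geometric_mean(values: list) -> float:
    """n-th root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (baseline, measured) times in seconds for three benchmarks.
hypothetical_times = [(100.0, 50.0), (200.0, 25.0), (90.0, 22.5)]

ratios = [spec_ratio(base, meas) for base, meas in hypothetical_times]
print("SPEC ratios:", ratios)
print("Geometric mean:", geometric_mean(ratios))
```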

24 Example: SPECint95 for Pentium and Pentium Pro
[Figure: SPECint95 for Pentium and Pentium Pro, with points 1 and 2 marked]
Performance improvement from point 1 to point 2: clock rate x 2 => SPECint x 1.7?

25 Amdahl's law in computing
$\text{CPU time for a program} = \text{CPU clock cycles for a program} \times \frac{1}{\text{Clock rate}}$
Clock rate x 2 => CPU time / 2? (as in the previous example)
* Improving one aspect of a machine (a partial improvement) does not increase performance by the same ratio
* Ex. The bottleneck in the memory system does not improve
$\text{Exec. time after improvement} = \frac{\text{Exec. time affected by improvement}}{\text{Amount of improvement}} + \text{Exec. time unaffected}$

26 Example: Amdahl's law
A program takes 100 s to run: 20% multiplication, 50% memory operations, 30% others
What is the speedup for each improvement?
Multiply speed x 4: $\frac{100}{20/4 + 50 + 30} = 1.18$
Memory access speed x 2: $\frac{100}{20 + 50/2 + 30} = 1.33$
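
The two speedups in this example follow directly from Amdahl's law: execution time after improvement = affected time / amount of improvement + unaffected time. A small sketch, with function names chosen for illustration:

```python
def time_after_improvement(affected_s: float, unaffected_s: float,
                           improvement: float) -> float:
    """Execution time once the affected part runs `improvement` times faster."""
    return affected_s / improvement + unaffected_s


def speedup(total_s: float, affected_s: float, improvement: float) -> float:
    """Overall speedup = old execution time / new execution time."""
    unaffected_s = total_s - affected_s
    return total_s / time_after_improvement(affected_s, unaffected_s, improvement)


# Slide example: 100 s total, 20 s multiplication, 50 s memory operations.
print(round(speedup(100.0, 20.0, 4.0), 2))  # multiply 4 times faster
print(round(speedup(100.0, 50.0, 2.0), 2))  # memory access 2 times faster
```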

27 MIPS as a measurement (not good ...)
MIPS = Million Instructions Per Second
High MIPS => faster?
$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6}$
Pitfalls:
MIPS cannot be used to compare computers with different instruction sets => instruction counts differ
MIPS varies between programs on the same computer => no single MIPS rating for a machine

28 Example: MIPS?
Example: 500 MHz machine, 2 compilers for the same source program

Instruction class    A   B   C
CPI                  1   2   3

Inst. count (x10^9) for each inst. class
                     A   B   C
Code 1               5   1   1
Code 2              10   1   1

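29 Example: MIPS?
$\text{Exec. time}_1 = \frac{(5 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 \text{ cycles}}{500 \times 10^6 \text{ cycles/sec}} = 20$ sec.
$\text{Exec. time}_2 = \frac{(10 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 \text{ cycles}}{500 \times 10^6 \text{ cycles/sec}} = 30$ sec.
$\text{MIPS}_1 = \frac{\text{Inst. count}}{\text{Exec. time} \times 10^6} = \frac{(5 + 1 + 1) \times 10^9}{20 \times 10^6} = 350$
$\text{MIPS}_2 = \frac{(10 + 1 + 1) \times 10^9}{30 \times 10^6} = 400$
Exec. time 1 < Exec. time 2, yet MIPS 1 < MIPS 2: the code with the higher MIPS rating is the slower one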
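
The MIPS example can be reproduced with a short sketch that computes execution time and MIPS for both code sequences. The names `CLOCK_RATE_HZ`, `exec_time_s`, and `mips` are assumptions; the counts, CPIs, and clock rate are those given on the slides.

```python
CLOCK_RATE_HZ = 500 * 10**6      # 500 MHz machine
CPI = {"A": 1, "B": 2, "C": 3}   # CPI per instruction class


def exec_time_s(counts: dict) -> float:
    """Execution time = total clock cycles / clock rate."""
    cycles = sum(counts[cls] * CPI[cls] for cls in counts)
    return cycles / CLOCK_RATE_HZ


def mips(counts: dict) -> float:
    """MIPS = instruction count / (execution time x 10^6)."""
    return sum(counts.values()) / (exec_time_s(counts) * 10**6)


code1 = {"A": 5 * 10**9, "B": 1 * 10**9, "C": 1 * 10**9}
code2 = {"A": 10 * 10**9, "B": 1 * 10**9, "C": 1 * 10**9}

for name, counts in (("Code 1", code1), ("Code 2", code2)):
    print(f"{name}: {exec_time_s(counts):.0f} s, {mips(counts):.0f} MIPS")
```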

