Chapter 4 Assessing and Understanding Performance Bo Cheng.


1 Chapter 4 Assessing and Understanding Performance Bo Cheng

2 Which One Is Good?
Airplane            Passengers   Range (mi)   Speed (mph)
Boeing 737-100      101          630          598
Boeing 747          470          4150         610
BAC/Sud Concorde    132          4000         1350
Douglas DC-8-50     146          8720         544
Depends on the measure of performance: cruising speed, longest range, or largest capacity

3 Measuring Performance
Elapsed time, wall-clock time, or response time
– Total time to complete a task, including disk and memory accesses, I/O, etc.
– A useful number, but often not good for comparison purposes
CPU (execution) time
– Doesn't count I/O or time spent running other programs
– Can be broken up into system CPU time and user CPU time
CPU time = user CPU time + system CPU time
Our focus: user CPU time
– Time spent executing the lines of code that are "in" our program
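As a rough illustration of the difference between elapsed time and CPU time, the sketch below times one function call in Python. It uses `time.perf_counter()` for wall-clock time and `os.times()` for the user and system CPU time of the current process. The `measure` helper and the `busy_work` workload are hypothetical names introduced here, not part of the slides.

```python
import os
import time


def busy_work(n):
    """Hypothetical CPU-bound workload: sum of squares below n."""
    return sum(i * i for i in range(n))


def measure(func, *args):
    """Return (result, elapsed, user_cpu, system_cpu) in seconds for one call."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    result = func(*args)
    cpu_end = os.times()
    wall_end = time.perf_counter()
    elapsed = wall_end - wall_start                  # wall-clock (response) time
    user_cpu = cpu_end.user - cpu_start.user         # time spent in our own code
    system_cpu = cpu_end.system - cpu_start.system   # time spent in the OS on our behalf
    return result, elapsed, user_cpu, system_cpu


if __name__ == "__main__":
    _, elapsed, user_cpu, system_cpu = measure(busy_work, 1_000_000)
    print(f"elapsed time    : {elapsed:.4f} s")
    print(f"user CPU time   : {user_cpu:.4f} s")
    print(f"system CPU time : {system_cpu:.4f} s")
    print(f"CPU time        : {user_cpu + system_cpu:.4f} s")
```

On an otherwise idle machine the elapsed time and the CPU time should be close for a workload like this. Waiting on I/O or competing with other programs widens the gap.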

4 CPU Performance Metrics Response time: the time between the start and the completion of a task (in time units) Throughput: the total amount of work done in a given time (in number of tasks per unit of time)

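5 Performance Problem: Machine A runs a program in 10 sec. Machine B runs the same program in 15 sec. How much faster is A than B? Performance is the inverse of execution time, so Performance_A / Performance_B = Time_B / Time_A = 15 / 10 = 1.5. A is 1.5 times faster than B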

6 Clock Rate Measurement
Name          Example        Measurement (sec)
Millisecond   1 msec (ms)    1.E-03
Microsecond   1 usec (us)    1.E-06
Nanosecond    1 nsec (ns)    1.E-09
Picosecond    1 psec (ps)    1.E-12
Femtosecond   1 fsec (fs)    1.E-15
10 nsec clock cycle => 100 MHz clock rate
1 nsec clock cycle => 1 GHz clock rate
500 psec clock cycle => 2 GHz clock rate
200 psec clock cycle => 5 GHz clock rate
Clock cycle: the time for one clock period, running at a constant rate
Clock rate is given in Hz (= 1/sec)
clock_cycle_time = 1/clock_rate (in sec)
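The four conversions on this slide all follow from clock_cycle_time = 1/clock_rate. A minimal sketch that reproduces them (the function name `clock_rate_hz` is a hypothetical helper):

```python
def clock_rate_hz(cycle_time_s):
    """Clock rate in Hz for a given clock cycle time in seconds."""
    return 1.0 / cycle_time_s


if __name__ == "__main__":
    # Cycle times taken from the slide, written in seconds.
    examples = [("10 ns", 10e-9), ("1 ns", 1e-9), ("500 ps", 500e-12), ("200 ps", 200e-12)]
    for label, cycle_time in examples:
        rate_mhz = clock_rate_hz(cycle_time) / 1e6
        print(f"{label} clock cycle => {rate_mhz:.0f} MHz clock rate")
```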

7 MHz One MHz represents one million cycles per second. The speed of microprocessors, called the clock speed, is measured in megahertz. – For example, a microprocessor that runs at 200 MHz executes 200 million cycles per second. One GHz represents 1 billion cycles per second. http://www.webopedia.com/TERM/M/MHz.html

8 CPU Time or CPU Execution Time The actual time the CPU spends computing for a specific task. This covers the time the CPU spends computing the given program, including operating system routines executed on the program’s behalf. It does not include time spent waiting for I/O or running other programs. Performance of processor/memory = 1 / CPU_time

9 CPU Execution Time Formula
E = N × T = N / R
E = CPU Execution time for a program
N = Number of CPU clock cycles for a program
T = clock cycle Time
R = clock Rate (R = 1 / T)

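10 Example
Computer A: 4 GHz clock, job takes 10 seconds
Computer B: X GHz clock, job should take 6 seconds
Answer: R = 8 GHz (this matches the textbook version of the example, in which B needs 1.2 times as many clock cycles as A; the transcript omits that condition)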
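The 8 GHz answer can be checked with the execution time formula E = N / R. The sketch below assumes that Computer B needs 1.2 times as many clock cycles as Computer A. That factor comes from the textbook version of this example and is not stated in the transcript; without it the result would be about 6.67 GHz. The function names are hypothetical.

```python
def clock_cycles(exec_time_s, clock_rate_hz):
    """N = E x R: clock cycles used by a program."""
    return exec_time_s * clock_rate_hz


def required_clock_rate(cycles, target_time_s):
    """R = N / E: clock rate needed to finish the cycles in the target time."""
    return cycles / target_time_s


# ASSUMPTION (not in the transcript): B needs 1.2x the clock cycles of A.
CYCLE_FACTOR_B = 1.2

cycles_a = clock_cycles(10.0, 4e9)       # Computer A: 10 s at 4 GHz
cycles_b = CYCLE_FACTOR_B * cycles_a     # cycles Computer B must execute
rate_b = required_clock_rate(cycles_b, 6.0)

if __name__ == "__main__":
    print(f"Computer A cycles: {cycles_a:.3e}")
    print(f"Computer B clock rate: {rate_b / 1e9:.2f} GHz")
```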

11 Clock cycles Per Instruction (CPI)
N = I × C, so E = I × C × T = I × C / R
N = Number of CPU clock cycles for a program
I = total Instructions for a program
C = CPI: the average number of clock cycles per instruction for a program or program fragment

12 The Big Picture Instruction count depends on the architecture, but not on the exact implementation Average CPI depends on design details and on the mix of types of instructions executed in an application

13 Understanding Program Performance
                       Instruction Count   CPI        Clock Rate
Algorithm              X                   Possibly
Programming Language   X                   X
Compiler               X                   X
ISA                    X                   X          X

14 Using Performance Equation
             Clock Cycle Time   CPI
Computer A   250 ps             2
Computer B   500 ps             1.2
Which computer is faster for this program, and by how much?
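One way to answer the question is to apply CPU time = instruction count × CPI × clock cycle time to both machines. The sketch below assumes both computers implement the same instruction set, so the program executes the same number of instructions on each. The instruction count used is an arbitrary placeholder that cancels in the ratio.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


# ASSUMPTION: same ISA, so both machines run the same instruction count.
INSTRUCTION_COUNT = 1_000_000  # placeholder value; it cancels in the ratio

time_a = cpu_time(INSTRUCTION_COUNT, 2.0, 250e-12)   # Computer A
time_b = cpu_time(INSTRUCTION_COUNT, 1.2, 500e-12)   # Computer B
ratio = time_b / time_a                              # > 1 means A is faster

if __name__ == "__main__":
    faster = "A" if ratio > 1 else "B"
    print(f"Computer {faster} is faster by a factor of {max(ratio, 1 / ratio):.2f}")
```

Under that assumption Computer A needs 500 ps per instruction on average and Computer B needs 600 ps, so A is 1.2 times faster for this program.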

15 Computing CPI
Done by looking at the different types of instructions and using their individual cycle counts
CPU clock cycles = sum over i = 1..n of (CPI_i × C_i)
C_i: the number of instructions of class i executed
CPI_i: the average number of cycles per instruction for instruction class i
n: the number of instruction classes

16 Example
CPI for each instruction class
                A   B   C
CPI             1   2   3
Instruction counts for each instruction class
Code Sequence   A   B   C
1               2   1   2
2               4   1   1
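The example can be worked through by summing CPI × count over the instruction classes. The sketch below reads the second table as the number of instructions of each class in each code sequence, which is an interpretation of the flattened slide. The names `CPI_BY_CLASS`, `total_cycles`, and `average_cpi` are hypothetical.

```python
# Cycles per instruction for each instruction class (from the slide).
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}

# ASSUMPTION: the second table holds per-class instruction counts.
SEQUENCE_1 = {"A": 2, "B": 1, "C": 2}
SEQUENCE_2 = {"A": 4, "B": 1, "C": 1}


def total_cycles(counts):
    """CPU clock cycles = sum over classes of CPI_i x C_i."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts):
    """Average CPI = total clock cycles / total instruction count."""
    return total_cycles(counts) / sum(counts.values())


if __name__ == "__main__":
    for name, seq in [("sequence 1", SEQUENCE_1), ("sequence 2", SEQUENCE_2)]:
        print(name, "instructions:", sum(seq.values()),
              "cycles:", total_cycles(seq), "CPI:", average_cpi(seq))
```

With these numbers sequence 2 executes more instructions (6 versus 5) but fewer clock cycles (9 versus 10), so it is the faster sequence even though its instruction count is higher.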

17 Workload A set of programs used for evaluating a computer or a system. Benchmarks: programs specifically chosen to measure performance. SPEC 2000 benchmarks (12 integer, 14 floating-point programs). Performance results given by benchmarks may be misleading if the system (or its compiler) is optimized specifically for the benchmarks

18 Benchmark
Programs specifically chosen to measure performance
Performance is best determined by running a real application
– Use programs typical of the expected workload
– e.g., compilers/editors, scientific applications, graphics...
Small benchmarks
– Nice for architects and designers
SPEC (System Performance Evaluation Cooperative)
– Companies have agreed on a set of real programs and inputs

19 Simplest Approach
                  Computer A   Computer B
Program 1 (sec)   1            10
Program 2 (sec)   1000         100
Total (sec)       1001         110
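The table shows why a single program can mislead: each computer is 10 times faster than the other on one of the two programs. A small sketch of the total-time comparison, using only the numbers from the slide (the variable names are hypothetical):

```python
# Execution times in seconds: [program 1, program 2] (from the slide).
TIMES = {
    "Computer A": [1, 1000],
    "Computer B": [10, 100],
}

# Total execution time of the workload on each computer.
totals = {name: sum(times) for name, times in TIMES.items()}

# Per-program ratios: how many times faster B is than A on each program.
per_program_ratio = [a / b for a, b in zip(TIMES["Computer A"], TIMES["Computer B"])]

# Overall ratio based on total time: > 1 means B is faster overall.
total_ratio = totals["Computer A"] / totals["Computer B"]

if __name__ == "__main__":
    print("totals:", totals)
    print("A time / B time per program:", per_program_ratio)
    print(f"B is {total_ratio:.1f} times faster on the whole workload")
```

On total execution time, Computer B comes out about 9.1 times faster, because program 2 dominates the workload.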

20 Evaluating Performance
Different classes and applications of computers require different types of benchmarks
Desktop
– CPU performance: SPEC CPU benchmark
– Response time focusing on a specific task: DVD playback or graphics performance of games
Server
– Depends on the nature of the intended application
– Throughput
– Requirements on response time to individual events: database query and web page request
– SPECweb99
Embedded computing
– EEMBC
Reproducibility: list everything another experimenter needs to duplicate the results

21 SPEC CPU2000 Benchmark

22 SPEC: CINT2000 and CFP2000

23 Relative Performance in Three Different Modes

24 Relative Energy Efficiency Comparison

25 Amdahl’s Law
Execution Time After Improvement = (Execution Time Affected / Amount of Improvement) + Execution Time Unaffected
Example: Suppose a program runs in 100 seconds on a machine, with multiply operations responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 5 times faster?
Principle: Make the common case fast
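The example can be checked by solving the Amdahl's Law equation for the amount of improvement. In the sketch below, `required_improvement` is a hypothetical helper that returns `None` when the target cannot be reached. The 4-times-faster case is an extra illustration added here for contrast.

```python
def time_after_improvement(affected_s, unaffected_s, improvement):
    """Amdahl's Law: affected time / improvement + unaffected time."""
    return affected_s / improvement + unaffected_s


def required_improvement(affected_s, unaffected_s, target_time_s):
    """Improvement factor needed to hit the target time, or None if impossible."""
    remaining = target_time_s - unaffected_s   # time budget left for the affected part
    if remaining <= 0:
        return None                            # unaffected part alone uses the budget
    return affected_s / remaining


TOTAL_S = 100.0
MULTIPLY_S = 80.0
OTHER_S = TOTAL_S - MULTIPLY_S   # 20 s not affected by a faster multiply

if __name__ == "__main__":
    for speedup in (4, 5):
        target = TOTAL_S / speedup
        factor = required_improvement(MULTIPLY_S, OTHER_S, target)
        if factor is None:
            print(f"{speedup}x faster: not achievable by improving multiply alone")
        else:
            print(f"{speedup}x faster: multiply must be {factor:.0f}x faster")
```

Running 5 times faster means finishing in 20 seconds, but the 20 seconds of non-multiply work remain no matter how fast multiplication becomes, so no finite improvement reaches the goal. Running 4 times faster (25 seconds) would require multiplication to be 16 times faster.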

26 MIPS (million instructions per second)
Instruction class   CPI
A                   1
B                   2
C                   3
Instruction counts (in billions) for each instruction class
Code from     A    B   C
Compiler 1    5    1   1
Compiler 2    10   1   1
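The tables can be turned into MIPS ratings and execution times using MIPS = instruction count / (execution time × 10^6). The transcript does not give a clock rate, so the 4 GHz value below is an assumption chosen for illustration. The ordering of the results does not depend on it. All names in the sketch are hypothetical.

```python
# Cycles per instruction for each class (from the slide).
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class (the slide gives them in billions).
COMPILER_1 = {"A": 5e9, "B": 1e9, "C": 1e9}
COMPILER_2 = {"A": 10e9, "B": 1e9, "C": 1e9}

# ASSUMPTION: clock rate is not stated in the transcript; 4 GHz is illustrative.
CLOCK_RATE_HZ = 4e9


def execution_time(counts, clock_rate_hz=CLOCK_RATE_HZ):
    """Execution time = total clock cycles / clock rate."""
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return cycles / clock_rate_hz


def mips(counts, clock_rate_hz=CLOCK_RATE_HZ):
    """MIPS = instruction count / (execution time x 10^6)."""
    return sum(counts.values()) / (execution_time(counts, clock_rate_hz) * 1e6)


if __name__ == "__main__":
    for name, counts in [("Compiler 1", COMPILER_1), ("Compiler 2", COMPILER_2)]:
        print(f"{name}: {mips(counts):.0f} MIPS, {execution_time(counts):.2f} s")
```

Under the assumed clock rate, the code from compiler 2 earns the higher MIPS rating (3200 versus 2800) yet takes longer to run (3.75 s versus 2.5 s).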

27 Always trust execution time metric! http://www.faculty.uaf.edu/ffdr/EE443/Handouts/Set5_Sp05_3pp.pdf

28 A Complete Example (I) http://www.faculty.uaf.edu/ffdr/EE443/Handouts/Set5_Sp05_3pp.pdf

29 A Complete Example (II)

30 A Complete Example (III)

31 Three problems with using MIPS
MIPS specifies the instruction execution rate but does not take into account the capabilities of the instructions
– We cannot compare computers with different instruction sets using MIPS, since the instruction counts will certainly differ
MIPS varies between programs on the same computer
– A computer cannot have a single MIPS rating for all programs
MIPS can vary inversely with performance

