Presentation is loading. Please wait.

Presentation is loading. Please wait.

Performance.

Similar presentations


Presentation on theme: "Performance."— Presentation transcript:

1 Performance

2 Performance What is performance? How to measure performance?
Performance metrics Performance evaluation Why does some hardware perform better than others for different programs? What factors in hardware are related to performance? How does the machine's instruction set affect performance?

3 Airplane Analogy Which of these airplanes has the best performance?
600 8400 656 Airbus A 3xx 544 8720 146 Douglas DC-8-50 1350 4000 132 Concorde 610 4150 470 Boeing 747 4630 375 Boeing 777 Speed (m.p.h) Range (miles) Passenger Capacity Airplane 393600 79424 178200 268700 228750 Passenger throughput (passenger x m.p.h)

4 Computer Performance Response time (latency)
How long does it take for my job to run? How long does it take to execute a program? How long must I wait for a database query? Throughput How many jobs can the machine run at once? What is the average execution rate? How much work is getting done? If we upgrade the processor of a machine which metric do we improve? If we add a new machine to a network which metric do we improve?

5 Which Time to Measure? Elapsed Time (Wall clock time, response time)
Counts everything (disk and memory access, I/O, operating system overhead, work on other processes) Useful but not always good for comparison purposes CPU (execution) time The time CPU spends computing for the user task Does not include time spent waiting for I/O, running other programs user CPU time CPU time spent within the program, system CPU time CPU time spent in the operating system performing tasks on behalf of the program

6 CPU Time Unix time command reflects this breakdown by returning the following when prompted: 90.7u 12.9s 2:39 65% Interpretation: User CPU time is 90.7 s System CPU time is 12.9s Elapsed time is 159 s ( ) CPU time is 65% of total elapsed time

7 A Definition of Performance
For some program running on machine X PerformanceX = 1/Execution_timeX The machine X is said to be “n times faster” than the machine Y if PerformanceX/PerformanceY = n Execution_timeY/Execution_timeX = n Example: Machine A runs a program in 10 seconds and machine B runs the same program in 15 seconds, how much faster is A than B?

8 Metrics of Performance
“Time to execute a program” is the ultimate metric in determining the performance However, it is convenient to inspect other metrics as well when we examine the details of a machine. Computers use a clock that runs at a constant rate and determines when an event takes place in hardware. These discrete time intervals are called clock cycles (or ticks, clock ticks, clock periods). Clock rate (frequency) is the inverse of clock period.

9 Clock Cycles Clock “ticks” indicate when to start activities
time Start of events often the rising edge of the clock Instead of reporting execution time in seconds, we often use cycles

10 Clock Cycle cycle time (CT) = time between ticks = seconds per cycle
Cycle Count (CC): the number of clock cycles to execute a program clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec) A 200 MHz clock has a 1/(200·106) = ? nanosecond cycle time A 4 GHz clock has a 1/(4· 109) = ? nanosecond cycle time

11 The CPI Metric CPI Clocks Per Instruction
Number of cycles spent on an instruction on average. CC = IC  CPI Hard to compute. It is useful when comparing the performances of two machines with the same ISA. (Why?) Example: two machines with the same ISA. For a certain program we have Machine A: CPI = 2.0 Machine B: CPI = 1.2 Which machine is faster? What if machine A uses 250 ps and machine B 500 ps cycle time

12 Improving Performance
So, to improve performance Increase the clock frequency (i.e. decrease the clock period) Reduce the number of the clock cycles per program (IC  CPI)

13 Instruction  Cycle ? No ! The number of cycles per instruction depends on the implementations of the instructions in hardware The number differs for each processor (even with the same ISA)

14 The Reason Operations take different number of cycles
Multiplication takes longer than addition Floating point operations take longer than integer operations The access time to a register is much shorter than access to the main memory.

15 Simple Formulae for CPU Time
CPU execution time = CPU clock cycles for a program  Clock cycle time (CC  CT) CPU execution time = CPU clock cycles for a program/Clock rate We can write CPU clock cycles for a program = IC  CPI Then CPU execution time = (IC  CPI)/Clock rate

16 Example Computer A of 800 MHz It runs our favorite program in 15 s
Our goal Design computer B with the same ISA It will run the same program in 8 s. We may use a new process technology (>Ghz) can increase the clock rate; however, it will also increase CPI by 1.25. What clock rate should we aim to use?

17 Performance Performance is determined by execution time (CPU time)
We have also other indicators # of cycles to execute program # of instructions in program (IC) # of cycles per second average # of cycles per instruction (CPI) average # of instructions per second Common pitfall: thinking one of the above is indicative of performance when it really isn’t.

18 Number of Instructions Example
A compiler designer has the following two alternatives to generate a certain piece of code with instructions A(1 cycle) , B (2 cycles), and C(3 cycles): 2106 of A, 106 of B, and 2106 of C (IC = 5106) 4106 of A, 106 of B, and 106 of C (IC = 6106) Which code sequence is faster?

19 The MIPS Metric A faster machine has a higher MIPS
Millions Instructions Per Second = MIPS = IC/(Execution_time  106) MIPS = IC/(CC  cycle time  106) MIPS = (IC  clock rate)/(IC  CPI  106) MIPS = clock rate/(CPI  106) A faster machine has a higher MIPS Execution_time = IC/(MIPS  106)

20 A MIPS Example A computer with 500 MHz clock
Three different classes of instructions: A (1 cycle), B (2 cycles), C (3 cycles) Two compilers used to produce code for a large piece of software. Compiler 1: 5 billion A, 1 billion B, and 1 billion C instructions. Compiler 2: 10 billion A, 1 billion B, and 1 billion C instructions. Which sequence will be faster according to MIPS? Which sequence will be faster according to execution time?

21 CPI example CPI Machine A: CPI = 10/7 = 1.43
Machine B: CPI = 15/12 = 1.25 CPU time CPU time = (IC  CPI) / clock rate CPI changes according to instruction mix and freq. When multiplied with clock cycle time gives accurate execution time.

22 Problems of MIPS MIPS specifies instruction execution rate
MIPS does not take into account the capabilities of the instructions Thus, it is impossible to compare computers with different ISA using MIPS. MIPS is not constant, even on a single machine, depends on the application. As we saw in the previous example, MIPS can vary inversely with performance.

23 Overview A given program will require Vocabulary
Some number of instructions Some number of clock cycles Some number of seconds Vocabulary Cycle time: (micro or nano) seconds per cycle Clock rate (frequency): cycles per second CPI: clock cycles per instruction MIPS: millions of instruction per second MFLOPS: millions of floating point operations per second

24 Performance Performance is ultimately determined by EXECUTION TIME
Is any of the following metrics good to measure performance by itself? Why? # of cycles to execute a program # of instructions in a program # of cycles per second Average # of cycles per instruction Average # number of instructions per second

25 Question Assuming two machines have the same ISA, which of the following quantities are identical? Clock rate CPI Execution time # of instructions MIPS

26 Program Performance Affects what? How? HW or SW component Algorithm
IC, possibly CPI Programming Language IC, CPI Compiler IC, CPI ISA IC, clock rate, CPI


Download ppt "Performance."

Similar presentations


Ads by Google