Lecture 7: 9/17/2002. CS170 Fall 2002. CS170 Computer Organization and Architecture I. Ayman Abdel-Hamid, Department of Computer Science, Old Dominion University.


1 CS170 Computer Organization and Architecture I
Ayman Abdel-Hamid, Department of Computer Science, Old Dominion University
Lecture 7: 9/17/2002

2 Outline
Benchmarks
Comparing and summarizing performance
SPEC95 benchmarks as an example
Should cover sections 2.4-2.6

3 Benchmarks 1/3
Concept of workload: informally, the set of programs that the user runs day in and day out.
Benchmarks: programs specifically chosen to measure performance. They form a workload that the user hopes will predict the performance of the actual workload.
The best benchmarks are real programs.
Benchmarks whose performance depends on small code segments encourage optimizations, in either the architecture or the compiler, that target those segments.
A problem: compilers with special-purpose optimizations targeted at specific benchmarks. Will such optimizations produce good, or even correct, code for a real application?

4 Benchmarks 2/3
(Figure: Copyright 1998 Morgan Kaufmann Publishers, Inc. All rights reserved.)
The figure shows the matrix300 benchmark from the SPEC suite in 1989. SPEC is the System Performance Evaluation Cooperative.
For matrix300, the enhanced compiler improves performance by a factor of more than 9, although the improvement on the other benchmarks is much smaller.
SPEC benchmark web site: http://www.specbench.org

5 Benchmarks 3/3
Why are real programs not always used to measure performance?
  Small benchmarks are easier to compile and simulate.
  Compilers might not be available for a new machine.
  Numerous published performance results are available for small benchmarks.
Small benchmarks are acceptable for the initial design phase, but a working computer system should be evaluated with real programs.
Writing performance reports
  Reproducibility: include everything needed to duplicate the experiment.
  Example of a performance report for a SPEC benchmark: Figure 2.4.

6 Comparing and Summarizing Performance 1/4
Assume the benchmarks have been selected and the metric (response time or throughput) has been agreed on.
How should the performance of a group of benchmarks be summarized?
Execution times of P1 and P2 on machines A and B:
Program   Machine A   Machine B
P1                1          10
P2             1000         100
Total          1001         110
A is 10 times faster than B for P1.
B is 10 times faster than A for P2.
What is the relative performance of A and B? Use total execution time.

7 Comparing and Summarizing Performance 2/4
B is 9.1 times faster than A for P1 and P2 together (total execution time 1001 versus 110).
A single-figure summary of performance should be directly proportional to execution time.
If the workload consists of running P1 and P2 an equal number of times, this statement predicts the relative execution times for the workload on each machine.
The average of execution times that is directly proportional to total execution time is the arithmetic mean (AM):
$\mathrm{AM} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Time}(i)$
Time(i): execution time of the i-th program. n: total number of programs in the workload.
A smaller mean indicates a smaller average execution time and thus better performance.
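The total-time and arithmetic-mean comparison can be checked with a few lines of Python. This is only an illustrative sketch: the execution times are the ones from the table of P1 and P2 on machines A and B, while the names `times`, `total_time`, and `arithmetic_mean` are made up for this example.

```python
# Execution times (in the table's time units) of P1 and P2 on machines A and B.
times = {
    "A": {"P1": 1, "P2": 1000},
    "B": {"P1": 10, "P2": 100},
}

def total_time(machine):
    """Total execution time when each program is run once."""
    return sum(times[machine].values())

def arithmetic_mean(machine):
    """AM = (1/n) * sum of Time(i) over the n programs."""
    runs = times[machine]
    return sum(runs.values()) / len(runs)

total_a, total_b = total_time("A"), total_time("B")
am_a, am_b = arithmetic_mean("A"), arithmetic_mean("B")

print(total_a, total_b)              # 1001 110
print(am_a, am_b)                    # 500.5 55.0
print(round(total_a / total_b, 1))   # 9.1
print(round(am_a / am_b, 1))         # 9.1
```

The ratio of the arithmetic means equals the ratio of the totals because both machines run the same number of programs, which is why the AM tracks total execution time.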

8 Comparing and Summarizing Performance 3/4
The arithmetic mean is proportional to execution time only if the programs in the workload are each run an equal number of times. What happens if that is not the case?
Assign a weighting factor w(i) to each program to indicate the frequency of that program in the workload.
Weighted arithmetic mean, with the weights summing to 1:
$\mathrm{WAM} = \sum_{i=1}^{n} w(i) \times \mathrm{Time}(i)$
The AM is the special case of the weighted AM in which all weights are equal, w(i) = 1/n.
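A minimal sketch of the weighted arithmetic mean, assuming the weights are expressed as fractions that sum to 1. The function name `weighted_arithmetic_mean` and the sample values (machine A's times for P1 and P2 from the earlier two-machine table) are chosen for illustration only. It also checks the special case in which equal weights give back the plain AM.

```python
def weighted_arithmetic_mean(times, weights):
    """Return the sum of w(i) * Time(i); the weights are assumed to sum to 1."""
    if len(times) != len(weights):
        raise ValueError("need exactly one weight per program")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * t for w, t in zip(weights, times))

# Machine A's execution times for P1 and P2.
times_a = [1, 1000]

equal_weights = weighted_arithmetic_mean(times_a, [0.5, 0.5])
plain_mean = sum(times_a) / len(times_a)

print(equal_weights, plain_mean)  # 500.5 500.5
```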

9 Comparing and Summarizing Performance 4/4
Run times of P1 and P2 on three machines A, B, and C:
Program   Machine A   Machine B   Machine C
P1                1          10          20
P2             1000         100          20
The workload consists of P1 and P2. P1 is run 10 times as often as P2.
Which machine is fastest for this workload, and by how much?
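One way to work the exercise is to turn the run frequencies into weights (10/11 for P1, 1/11 for P2) and compare weighted arithmetic means. The sketch below does that; the variable and function names are illustrative, and the result is derived here from the table rather than taken from the lecture.

```python
# Run times from the three-machine table, in the order (P1, P2).
run_times = {
    "A": (1, 1000),
    "B": (10, 100),
    "C": (20, 20),
}

# P1 is run 10 times as often as P2, so the relative frequencies are 10 and 1.
frequencies = (10, 1)

def weighted_mean(times, freqs):
    """Weighted AM with weights w(i) = freq(i) / sum(freqs)."""
    return sum(f * t for f, t in zip(freqs, times)) / sum(freqs)

means = {m: weighted_mean(t, frequencies) for m, t in run_times.items()}
fastest = min(means, key=means.get)

for machine in sorted(means):
    print(machine, round(means[machine], 2))
# A 91.82
# B 18.18
# C 20.0

print(fastest)                                 # B
print(round(means["C"] / means[fastest], 2))   # 1.1
print(round(means["A"] / means[fastest], 2))   # 5.05
```

Under this weighting, B has the smallest weighted mean: it comes out about 1.1 times faster than C and about 5 times faster than A for this workload.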

10 SPEC95 Benchmarks
SPEC: a CPU benchmark suite created by a set of computer companies in 1989.
SPEC95 consists of 8 integer and 10 floating-point programs (Figure 2.6).
SPEC95 web site: http://www.specbench.org/osg/cpu95/news/cpu95descr.html
SPEC ratio for xxx.benchmark = xxx.benchmark reference time / xxx.benchmark run time
The SPEC ratio is a normalized measure; higher results indicate faster performance.
The reference machine is a Sun SPARCstation 10/40.
The SPECint95 or SPECfp95 summary measurement is obtained by taking the geometric mean of the SPEC ratios:
$\mathrm{GM} = \sqrt[n]{a_1 \times a_2 \times \cdots \times a_n}$, where $a_i$ is the SPEC ratio of the i-th benchmark and n is the number of benchmarks.
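The SPEC ratio and its geometric-mean summary can be sketched as follows. The benchmark names and times below are hypothetical placeholders, not real SPEC95 measurements, and the function names are chosen for this example. `math.prod` requires Python 3.8 or later.

```python
import math

def spec_ratio(reference_time, run_time):
    """SPEC ratio = reference time / run time; higher means faster."""
    return reference_time / run_time

def geometric_mean(ratios):
    """n-th root of the product a_1 * a_2 * ... * a_n."""
    return math.prod(ratios) ** (1.0 / len(ratios))

# Hypothetical (reference_time, run_time) pairs in seconds, for illustration only.
measurements = {
    "bench_a": (1000.0, 250.0),
    "bench_b": (900.0, 100.0),
}

ratios = [spec_ratio(ref, run) for ref, run in measurements.values()]
summary = geometric_mean(ratios)

print(ratios)             # [4.0, 9.0]
print(round(summary, 2))  # 6.0
```

Because every ratio is normalized to the same reference machine, the geometric mean gives the same relative ranking of two machines regardless of which reference machine is used, which is one reason it is used for the summary figure.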

11 SPEC95 Benchmark Results for Pentium and Pentium Pro
On SPECint95, at the same clock rate, the Pentium Pro is 1.4 to 1.5 times faster than the Pentium.
When the clock rate is increased by a certain factor, processor performance increases by a lower factor.
When the Pentium clock rate doubles from 100 to 200 MHz, SPECint95 performance improves by a factor of only 1.7. (Why?)

12 SPEC95 Benchmark Results for Pentium and Pentium Pro
On SPECfp95, at the same clock rate, the Pentium Pro is 1.7 to 1.8 times faster than the Pentium.
When the clock rate doubles from 100 to 200 MHz, SPECfp95 performance improves by a factor of only 1.4. (Why?)
The memory system becomes the bottleneck as processor speed increases, and the effect is more evident on the floating-point benchmarks because of their size.
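The memory-bottleneck explanation can be illustrated with a deliberately simplified two-part model that is an assumption of this sketch, not something stated in the lecture: a fraction f of the original run time is spent waiting on memory and does not shrink when the clock rate rises, while the remaining 1 - f scales perfectly with the clock. The function names are hypothetical.

```python
def speedup_from_clock(clock_factor, memory_fraction):
    """Overall speedup when only the non-memory part of the run time scales with clock rate."""
    scaled_part = (1.0 - memory_fraction) / clock_factor
    return 1.0 / (scaled_part + memory_fraction)

def memory_fraction_for(speedup, clock_factor):
    """Solve 1/speedup = (1 - f)/clock_factor + f for the memory-bound fraction f."""
    return (1.0 / speedup - 1.0 / clock_factor) / (1.0 - 1.0 / clock_factor)

# Clock rate doubles (100 -> 200 MHz); observed improvements are 1.7 and 1.4.
f_int = memory_fraction_for(1.7, 2.0)
f_fp = memory_fraction_for(1.4, 2.0)

print(round(f_int, 3))  # 0.176
print(round(f_fp, 3))   # 0.429
print(round(speedup_from_clock(2.0, f_fp), 2))  # 1.4
```

Under this toy model, the smaller improvement on the floating-point benchmarks corresponds to a larger share of time that does not scale with clock rate, which is consistent with the memory system being the limiting factor.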

