Hier wird Wissen Wirklichkeit Computer Architecture – Part 6 – page 1 of 22 – Prof. Dr. Uwe Brinkschulte, Prof. Dr. Klaus Waldschmidt Part 6 Fundamentals.


1 Hier wird Wissen Wirklichkeit Computer Architecture – Part 6 – page 1 of 22 – Prof. Dr. Uwe Brinkschulte, Prof. Dr. Klaus Waldschmidt Part 6 Fundamentals in Performance Evaluation Computer Architecture Slide Sets WS 2011/2012 Prof. Dr. Uwe Brinkschulte Prof. Dr. Klaus Waldschmidt

2 Why performance evaluation?
- Comparison of computers
- Selection of a computer
- Changes in the configuration of an existing computer (tuning)
- Design of computers
- Verification or validation of design decisions
Methods for performance evaluation:
(1) analytical methods
(2) measurements

3 Aspects for evaluation

4 Analytical methods
Performance measures (hypothetical maximum performance!):
- MIPS (millions of instructions per second)
- MFLOPS (millions of floating-point operations per second)
Mix (likewise calculated, not measured):
- In a mix, the average execution time of each instruction is calculated and scaled by a characteristic weight.
Core programs:
- Typical application programs, written for the evaluated computer
- No measurements; the overall execution time is calculated from the execution times of the individual machine instructions
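A mix calculation can be sketched roughly as follows. The instruction classes, execution times and weights are invented for illustration and do not correspond to any published mix; only the weighting scheme follows the description above.

```python
# Hypothetical instruction mix: (class, average execution time in ns, weight).
# The weights are relative frequencies and are expected to sum to 1.
MIX = [
    ("load/store", 4.0, 0.30),
    ("integer ALU", 2.0, 0.50),
    ("branch", 3.0, 0.15),
    ("floating point", 10.0, 0.05),
]


def mean_instruction_time(mix):
    """Weighted average execution time per instruction (same unit as the inputs)."""
    total_weight = sum(weight for _, _, weight in mix)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(time * weight for _, time, weight in mix)


t_avg_ns = mean_instruction_time(MIX)
# One instruction every t_avg_ns nanoseconds equals 1000 / t_avg_ns million per second.
mips = 1000.0 / t_avg_ns
print(f"average instruction time: {t_avg_ns:.2f} ns, about {mips:.0f} MIPS")
```

Nothing is measured here: the result depends entirely on the assumed per-instruction times and weights.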

5 Performance measures

Runtime:
$$\text{runtime} = \#\text{clock cycles} \cdot \text{clock period}$$

MIPS (million instructions per second):
$$\text{MIPS} = \frac{\text{instruction count}}{\text{runtime} \cdot 10^{6}}$$
$$\text{MIPS} = \frac{\text{instruction count}}{\#\text{clock cycles} \cdot \text{clock period} \cdot 10^{6}} = \frac{\text{instruction count} \cdot \text{clock frequency}}{\#\text{clock cycles} \cdot 10^{6}}$$
$$\text{MIPS} = \frac{\text{clock frequency}}{\text{CPI} \cdot 10^{6}} = \frac{\text{clock frequency} \cdot \text{IPC}}{10^{6}}$$

MFLOPS (million floating-point operations per second):
$$\text{MFLOPS} = \frac{\#\text{executed floating-point instructions}}{\text{runtime} \cdot 10^{6}}$$

CPI (cycles per instruction):
$$\text{CPI} = \frac{\#\text{clock cycles}}{\text{instruction count}}$$

IPC (instructions per cycle):
$$\text{IPC} = \frac{1}{\text{CPI}}$$
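The definitions above can be checked with a small calculation. This is a minimal sketch; the function name and the sample numbers (instruction count, cycle count, clock frequency, floating-point operations) are assumptions chosen only to make the arithmetic easy to follow.

```python
def performance_measures(instruction_count, clock_cycles, clock_frequency_hz,
                         fp_operations=0):
    """Compute runtime, CPI, IPC, MIPS and MFLOPS from raw counts."""
    clock_period = 1.0 / clock_frequency_hz
    runtime = clock_cycles * clock_period          # seconds
    cpi = clock_cycles / instruction_count         # cycles per instruction
    ipc = 1.0 / cpi                                # instructions per cycle
    mips = instruction_count / (runtime * 1e6)
    mflops = fp_operations / (runtime * 1e6)
    return {"runtime": runtime, "CPI": cpi, "IPC": ipc,
            "MIPS": mips, "MFLOPS": mflops}


# Hypothetical program run: 2 million instructions in 3 million cycles at 1.5 GHz.
m = performance_measures(instruction_count=2_000_000,
                         clock_cycles=3_000_000,
                         clock_frequency_hz=1.5e9,
                         fp_operations=400_000)
for name, value in m.items():
    print(f"{name}: {value:g}")

# The alternative form MIPS = clock frequency / (CPI * 10^6) gives the same value.
mips_from_cpi = 1.5e9 / (m["CPI"] * 1e6)
```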

6 Drawbacks of performance measures
- CPI, IPC, MIPS and MFLOPS depend on the instruction set.
- CPI, IPC, MIPS and MFLOPS depend on the program.
- CPI, IPC, MIPS and MFLOPS depend on the microarchitecture.
Conclusions:
- A greater MIPS or MFLOPS rating does not automatically mean more performance!
- It is of vital importance to choose well-suited test applications (benchmarks)!
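A made-up comparison illustrates the first conclusion. Machines A and B below are hypothetical: B has a simpler instruction set, so the same program needs three times as many instructions, each completing in one cycle.

```python
def runtime_seconds(instruction_count, cpi, clock_frequency_hz):
    """runtime = instruction count * CPI * clock period"""
    return instruction_count * cpi / clock_frequency_hz


def mips(instruction_count, cpi, clock_frequency_hz):
    """MIPS = instruction count / (runtime * 10^6)"""
    return instruction_count / (runtime_seconds(instruction_count, cpi,
                                                clock_frequency_hz) * 1e6)


# Same program, same 2 GHz clock, two hypothetical instruction sets.
machine_a = dict(instruction_count=1_000_000_000, cpi=2.0, clock_frequency_hz=2e9)
machine_b = dict(instruction_count=3_000_000_000, cpi=1.0, clock_frequency_hz=2e9)

for name, machine in (("A", machine_a), ("B", machine_b)):
    print(f"machine {name}: {mips(**machine):.0f} MIPS, "
          f"runtime {runtime_seconds(**machine):.2f} s")
```

With these assumed numbers, B reports twice the MIPS rating of A yet needs 1.5 times as long to finish the program.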

7 Measurements
Benchmarks:
- Existing or synthetic programs are used to measure the performance
- These programs are compiled and executed on the evaluated computer
- Therefore, not only the computer hardware but also the compiler influences the outcome of a benchmark
Monitoring:
- Monitors are used to observe parts of the computer at run time
- Therefore, quantities of interest inside the computer can be measured in addition to the overall outcome of a benchmark (e.g. cache utilization, network traffic, …)
- Monitoring can be done in hardware or software

8 Benchmark terminology
- benchmark: A test program.
- benchmark suite: A set of benchmarks.
- synthetic benchmark: A test program only useful as a benchmark.
- kernel benchmark: A very small synthetic benchmark. Usually a time-intensive part of a real program is chosen. Kernel benchmarks are well suited for design and simulation but normally unsuitable for comparing complete systems.
- benchmark application: A complete program additionally used as a benchmark. The opposite of a synthetic benchmark.
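To make the notion of a kernel benchmark concrete, the sketch below times one small, time-intensive loop in isolation using Python's standard `timeit` module. The kernel (`saxpy_kernel`) and the problem size are hypothetical stand-ins for the hot loop of a real program.

```python
import timeit


def saxpy_kernel(a, x, y):
    """Hypothetical hot loop of a numeric program: a * x[i] + y[i] per element."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def run_kernel_benchmark(n=10_000, repeats=5, loops=20):
    """Return the best observed time in seconds for one kernel execution."""
    x = [float(i) for i in range(n)]
    y = [1.0] * n
    # Each entry of `timings` is the total time for `loops` kernel executions.
    timings = timeit.repeat(lambda: saxpy_kernel(2.0, x, y),
                            repeat=repeats, number=loops)
    return min(timings) / loops


best_seconds = run_kernel_benchmark()
print(f"best time per kernel run: {best_seconds * 1e6:.1f} microseconds")
```

The figure says something about this loop on this interpreter and machine only, which is why kernel results are a poor basis for comparing complete systems.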

9 SPEC benchmarks
SPEC: Standard Performance Evaluation Corporation
- Since 1989, a consortium of different manufacturers
- General-purpose computer applications, mainly to measure speed and throughput
- Several benchmark suites, e.g. SPEC95, SPECweb96, SPEC JVM98, SPEC JBB2000, SPEC CINT 2006, SPEC CFP 2006

10 SPECmarks
Goal: comparable values for different systems
But: single values do not always reflect the real relations, so they are only a first indication when selecting or judging a computer
CPU performance together with cache, memory and compiler is measured; the operating system and I/O are less relevant
- Integer test programs (ANSI C)
- Floating-point test programs (Fortran 77)
- SPECmark: the geometric mean of the individual program results contained in the suite
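The aggregation step can be sketched as follows. Each program's result is taken here as the ratio of a reference time to the measured time, and the suite value is the geometric mean of those ratios. The program names and all times are hypothetical; they are not real SPEC reference values.

```python
import math


def speed_ratio(reference_seconds, measured_seconds):
    """Per-program result: how many times faster than the reference machine."""
    return reference_seconds / measured_seconds


def geometric_mean(values):
    """n-th root of the product, computed via logarithms for numerical stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical suite: program -> (reference time, measured time) in seconds.
SUITE = {
    "prog_a": (100.0, 50.0),
    "prog_b": (800.0, 100.0),
    "prog_c": (400.0, 100.0),
}

ratios = [speed_ratio(ref, measured) for ref, measured in SUITE.values()]
suite_mark = geometric_mean(ratios)
arithmetic_mean = sum(ratios) / len(ratios)
print(f"ratios: {ratios}")
print(f"geometric mean: {suite_mark:.3f}, arithmetic mean: {arithmetic_mean:.3f}")
```

Unlike the arithmetic mean, the geometric mean is not dominated by a single large ratio, and the ranking of two machines does not depend on which machine is chosen as the reference.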

11 SPEC-CINT2006: 12 integer test programs (C, C++)
name         description
perlbench    Perl interpreter
bzip2        bzip compression program
gcc          GNU C compiler version 3.2
mcf          Simplex algorithm for traffic planning
gobmk        AI implementation of the game Go
hmmer        Protein sequence analysis based on a hidden Markov model
sjeng        Chess program
libquantum   Quantum computer simulator
h264ref      H.264 codec
omnetpp      OMNeT++ discrete event simulator
astar        Route planning
xalancbmk    XML translator

12 SPEC-CFP2006: 17 floating-point test programs (C, C++, Fortran)
name         description
bwaves       Fluid dynamics algorithm
gamess       Quantum chemistry algorithm
milc         Physics algorithm
zeusmp       Fluid dynamics algorithm
gromacs      Newton's equations of motion
cactusADM    Equation solver for Einstein's evolution equations
leslie3d     Fluid dynamics algorithm
namd         Biomolecular simulation
dealII       Finite elements
soplex       Simplex algorithm
povray       Image rendering
calculix     Finite elements
GemsFDTD     Maxwell equation solver
tonto        Quantum chemistry
lbm          Lattice-Boltzmann simulator
wrf          Weather modeling
sphinx3      Speech recognition

13 More popular benchmark suites
Basic Linear Algebra Subprograms (BLAS):
- For numerical applications
- Core of the LINPACK software package for solving linear equation systems
- Basis of the TOP 500 list of the fastest parallel computers
Whetstone benchmark:
- Developed in the seventies, a single program with many floating-point calculations
Dhrystone benchmark:
- Developed in the eighties as an integer-oriented counterpart to Whetstone
Powerstone benchmark suite:
- To compare the energy consumption of microprocessors and microcontrollers

14 Powerstone benchmark suite
name         description
auto         Vehicle control
bilv         Logical and shift operations
bilt         Graphical application
compress     UNIX compression program
crc          CRC error detection
des          Data encryption
dhry         Dhrystone
engine       Engine control
fir_int      Integer FIR filter
g3fax        FAX group 3
g721         Audio compression
jpeg         JPEG 24-bit compression
pocsag       Communication protocol for pagers
servo        Hard disk control
summin       Handwriting recognition
ucbqsort     Quick sort
v42bits      Modem operation
whet         Whetstone

15 Monitoring
Monitors are components that record the states of a system during its normal operation. They record the contents of registers, flags and buffers as well as the traffic on data paths. Monitors are used to observe and debug systems.

16 Monitoring
Generally, monitors can be classified into:
a) Hardware monitors
A hardware monitor is a separate component that is physically connected to the locations of the target system where measurements take place. Hardware monitors typically consist of comparators and counters to create data, memories to store it and buses to transport it. Thus, hardware monitors use their own resources.

17 Monitoring
b) Software monitors
A software monitor is a program that collects measurement data through interfaces provided by the operating system, the programming language or the application program. A software monitor uses the resources of the observed system to collect, transport and store data.
c) Hybrid monitors
A hybrid monitor combines hardware and software monitoring. Often simple elements like counters and memories are implemented in hardware, while more complex observation functions are implemented in software.

18 Monitoring constraints
1. Accessing information
Ideally, monitoring is integrated into the hardware and software components of a system during design. Software monitors are cheaper than hardware monitors, but they may influence the system's run-time behavior.
2. Non-intrusive monitoring
Hardware monitors and most hybrid monitors store the recorded data in their own memories. Software monitors have to use the memory of the observed system. Thus, hardware monitors disturb the observed system less than software monitors.

19 Monitoring constraints
3. Amount of recorded data and its further processing
Most purposes, especially debugging, require observations with high resolution. For the accurate analysis of a program error, the machine instruction that caused it has to be identified. For other purposes, e.g. a global performance analysis, a coarser resolution is sufficient. Although it often seems necessary to record observable data at the level of machine instruction execution, this would generate traces much larger than the memory used by the observed application. Thus, the cost of storing this large amount of data and the general difficulty of processing the trace data prohibit a complete recording of traces at machine instruction level.
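A back-of-the-envelope estimate shows why complete instruction-level traces are impractical. All figures below (instruction rate, record size, observation time, sampling interval) are assumptions picked for round numbers, not measurements.

```python
def trace_bytes(instructions_per_second, bytes_per_record, seconds,
                sample_every=1):
    """Trace size when one record is written for every `sample_every`-th instruction."""
    records = instructions_per_second * seconds // sample_every
    return records * bytes_per_record


# Assumed: 10^9 instructions per second, 8-byte records, 60 seconds observed.
full_trace = trace_bytes(1_000_000_000, 8, 60)
coarse_trace = trace_bytes(1_000_000_000, 8, 60, sample_every=1_000_000)

print(f"full instruction-level trace: {full_trace / 1e9:.0f} GB")
print(f"one record per million instructions: {coarse_trace / 1e3:.0f} kB")
```

Under these assumptions, one minute of complete tracing yields several hundred gigabytes, whereas a coarse resolution suitable for global performance analysis stays in the kilobyte range.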

20 Instrumentation
One way of software monitoring is to insert measuring commands, e.g. loop or time counters, into the program code. This is called instrumentation. Instrumentation can be performed by the user, the compiler, the class library or the operating system.
[Figure labels: instrumented program, computer, measure system, results, measure results]
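A minimal sketch of instrumentation done by hand, assuming a trivial function as the program under observation. The `Probe` class and the function name are hypothetical; the lines marked "inserted" are the measuring commands, everything else is the original program.

```python
import time


class Probe:
    """Hypothetical record that the inserted measuring commands fill in."""

    def __init__(self):
        self.loop_count = 0
        self.elapsed_seconds = 0.0


def sum_of_squares_instrumented(values, probe):
    start = time.perf_counter()                            # inserted: time counter
    total = 0
    for v in values:
        probe.loop_count += 1                              # inserted: loop counter
        total += v * v
    probe.elapsed_seconds += time.perf_counter() - start   # inserted: time counter
    return total


probe = Probe()
result = sum_of_squares_instrumented(range(1000), probe)
print(f"result: {result}")
print(f"loop iterations: {probe.loop_count}, elapsed: {probe.elapsed_seconds:.6f} s")
```

The program result is unchanged, but the inserted commands consume time and memory of the observed system, so the measurement itself alters the run-time behavior slightly.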

21 Monitoring overview
method            tools                                  accuracy       system state
direct            hardware monitor                       very high      hardware
instrumentation   instrumented program                   high           hardware
trace driven      simulation program + hardware trace    satisfactory   hardware and software
simulation        simulation program                     sufficient     software

22 Typical load-dependent parameters
- throughput: The average number of jobs completed per time unit. A job may be the execution of an instruction or a program, saving a data block or sending a message.
- utilization: The throughput (average number of jobs completed) divided by the maximum possible throughput.
- response time: The average time needed to complete a job.
- utilization ratio: The time spent working on the jobs divided by the whole operating time.
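The four parameters can be computed from a simple job log. The sketch below assumes each job is recorded as (arrival, start, finish) times in seconds and that response time is counted from arrival to completion; the log, the observation period and the maximum throughput are hypothetical values.

```python
# Hypothetical job log: (arrival, start, finish) in seconds.
JOBS = [
    (0.0, 0.0, 2.0),
    (1.0, 2.0, 3.0),
    (4.0, 4.0, 7.0),
    (8.0, 8.0, 9.0),
]
OBSERVATION_SECONDS = 10.0   # whole operating time (assumed)
MAX_THROUGHPUT = 0.5         # jobs per second the system could sustain (assumed)


def throughput(jobs, observation_seconds):
    """Average number of jobs completed per time unit."""
    return len(jobs) / observation_seconds


def utilization(jobs, observation_seconds, max_throughput):
    """Throughput divided by the maximum possible throughput."""
    return throughput(jobs, observation_seconds) / max_throughput


def response_time(jobs):
    """Average time from arrival to completion of a job."""
    return sum(finish - arrival for arrival, _, finish in jobs) / len(jobs)


def utilization_ratio(jobs, observation_seconds):
    """Time spent working on jobs divided by the whole operating time."""
    busy = sum(finish - start for _, start, finish in jobs)
    return busy / observation_seconds


print(f"throughput:        {throughput(JOBS, OBSERVATION_SECONDS):.2f} jobs/s")
print(f"utilization:       {utilization(JOBS, OBSERVATION_SECONDS, MAX_THROUGHPUT):.2f}")
print(f"response time:     {response_time(JOBS):.2f} s")
print(f"utilization ratio: {utilization_ratio(JOBS, OBSERVATION_SECONDS):.2f}")
```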

