Benchmarks Programs specifically chosen to measure performance Must reflect typical workload of the user Benchmark types Real applications Small benchmarks.

Slides:



Advertisements
Similar presentations
COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Yaohang Li.
Advertisements

CS2100 Computer Organisation Performance (AY2014/2015) Semester 2.
Computer Abstractions and Technology
Power calculation for transistor operation What will cause power consumption to increase? CS2710 Computer Organization1.
ECE-C355 Computer Structures Winter 2008 Chapter 04: Understanding Performance Slides are adapted from Professor Mary Jane Irwin (
TU/e Processor Design 5Z032 1 Processor Design 5Z032 The role of Performance Henk Corporaal Eindhoven University of Technology 2009.
Performance COE 308 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals.
Lecture 7: 9/17/2002CS170 Fall CS170 Computer Organization and Architecture I Ayman Abdel-Hamid Department of Computer Science Old Dominion University.
Chapter 1 CSF 2009 Computer Performance. Defining Performance Which airplane has the best performance? Chapter 1 — Computer Abstractions and Technology.
CSCE 212 Chapter 4: Assessing and Understanding Performance Instructor: Jason D. Bakos.
Performance ICS 233 Computer Architecture and Assembly Language Dr. Aiman El-Maleh College of Computer Sciences and Engineering King Fahd University of.
Chapter 4 Assessing and Understanding Performance Bo Cheng.
1 CSE SUNY New Paltz Chapter 2 Performance and Its Measurement.
Performance D. A. Patterson and J. L. Hennessey, Computer Organization & Design: The Hardware Software Interface, Morgan Kauffman, second edition 1998.
Copyright © 1998 Wanda Kunkle Computer Organization 1 Chapter 2.5 Comparing and Summarizing Performance.
Computer Performance Evaluation: Cycles Per Instruction (CPI)
Computer ArchitectureFall 2007 © September 17, 2007 Karem Sakallah CS-447– Computer Architecture.
CS/ECE 3330 Computer Architecture Chapter 1 Performance / Power.
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
Chapter 4 Assessing and Understanding Performance
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
1 ECE3055 Computer Architecture and Operating Systems Lecture 2 Performance Prof. Hsien-Hsin Sean Lee School of Electrical and Computer Engineering Georgia.
CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CMSC 611: Advanced Computer Architecture Benchmarking Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Chapter 1 Section 1.4 Dr. Iyad F. Jafar Evaluating Performance.
Computer Organization and Design Performance Montek Singh Mon, April 4, 2011 Lecture 13.
1 Computer Performance: Metrics, Measurement, & Evaluation.
CSE 340 Computer Architecture Summer 2014 Understanding Performance
Computer Performance Computer Engineering Department.
5/26/2016Erkay Savas1 Performance Computer Architecture – CS401 Erkay Savas Sabanci University.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
10/19/2015Erkay Savas1 Performance Computer Architecture – CS401 Erkay Savas Sabanci University.
1 CS/EE 362 Hardware Fundamentals Lecture 9 (Chapter 2: Hennessy and Patterson) Winter Quarter 1998 Chris Myers.
1 CS/COE0447 Computer Organization & Assembly Language CHAPTER 4 Assessing and Understanding Performance.
Computer Architecture
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
Lecture2: Performance Metrics Computer Architecture By Dr.Hadi Hassan 1/3/2016Dr. Hadi Hassan Computer Architecture 1.
1  1998 Morgan Kaufmann Publishers How to measure, report, and summarize performance (suorituskyky, tehokkuus)? What factors determine the performance.
Performance Performance
TEST 1 – Tuesday March 3 Lectures 1 - 8, Ch 1,2 HW Due Feb 24 –1.4.1 p.60 –1.4.4 p.60 –1.4.6 p.60 –1.5.2 p –1.5.4 p.61 –1.5.5 p.61.
September 10 Performance Read 3.1 through 3.4 for Wednesday Only 3 classes before 1 st Exam!
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /03/2013 Lecture 3: Computer Performance Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE.
Lec2.1 Computer Architecture Chapter 2 The Role of Performance.
L12 – Performance 1 Comp 411 Computer Performance He said, to speed things up we need to squeeze the clock Study
Performance COE 301 Computer Organization Dr. Muhamed Mudawar College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals.
ECE/CS 552: Benchmarks, Means and Amdahl’s Law © Prof. Mikko Lipasti Lecture notes based in part on slides created by Mark Hill, David Wood, Guri Sohi,
CMSC 611: Advanced Computer Architecture Performance & Benchmarks Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some.
Performance Computer Organization II 1 Computer Science Dept Va Tech January 2009 © McQuain & Ribbens Defining Performance Which airplane has.
Computer Architecture CSE 3322 Web Site crystal.uta.edu/~jpatters/cse3322 Send to Pramod Kumar, with the names and s.
BITS Pilani, Pilani Campus Today’s Agenda Role of Performance.
Chapter 1 Performance & Technology Trends. Outline What is computer architecture? Performance What is performance: latency (response time), throughput.
CSE 340 Computer Architecture Summer 2016 Understanding Performance.
June 20, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 1: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed.
Measuring Performance II and Logic Design
Lecture 2: Performance Evaluation
CS161 – Design and Architecture of Computer Systems
September 2 Performance Read 3.1 through 3.4 for Tuesday
Defining Performance Which airplane has the best performance?
Prof. Hsien-Hsin Sean Lee
CSCE 212 Chapter 4: Assessing and Understanding Performance
CS2100 Computer Organisation
Computer Performance He said, to speed things up we need to squeeze the clock.
CMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture
January 25 Did you get mail from Chun-Fa about assignment grades?
Benchmarks Programs specifically chosen to measure performance
CS2100 Computer Organisation
Presentation transcript:

Benchmarks Programs specifically chosen to measure performance Must reflect typical workload of the user Benchmark types Real applications Small benchmarks Benchmark suites Synthetic benchmarks

Real Applications Workload: Set of programs a typical user runs day in and day out. To use these real applications for metrics is a direct way of comparing the execution time of the workload on two machines. Using real applications for metrics has certain restrictions: They are usually big Takes time to port to different machines Takes considerable time to execute Hard to observe the outcome of a certain improvement technique

Comparing & Summarizing Performance A is 100 times faster than B for program 1 B is 10 times faster than A for program 2 For total performance, arithmetic mean is used: Computer AComputer B Program 11 s100 s Program s100 s Total time1001 s200 s

Arithmetic Mean If each program, in the workload, are not run equal # times, then we have to use weighted arithmetic mean: weight Computer AComputer B Program 1 (seconds) Program 2 (seconds) Weighted AM -?? Suppose that the program 1 runs 10 times as often as the program 2. Which machine is faster?

Small Benchmarks Small code segments which are common in many applications For example, loops with certain instruction mix for (j = 0; j<8; j++) S = S + A j  B i-j Good for architects and designers Since small code segments are easy to compile and simulate even by hand, designers use these kind of benchmarks while working on a novel machine Can be abused by compiler designers by introducing special-purpose optimizations targeted at specific benchmark.

Benchmark Suites SPEC (Standard Performance Evaluation Corporation) Non-profit organization that aims to produce "fair, impartial and meaningful benchmarks for computers” Began in SPEC89 (CPU intensive) Companies agreed on a set of real programs and inputs which they hope reflect a typical user’s workload best. Valuable indicator of performance Can still be abused Updates are required as the applications and their workload change by time

SPEC Benchmark Sets CPU Performance (SPEC CPU2006) Graphics (SPECviewperf) High-performance computing (HPC2002, MPI200 7, OMP2001) Java server applications (jAppServer2004) a multi-tier benchmark for measuring the performance of Java 2 Enterprise Edition (J2EE) technology-based application servers. Mail systems (MAIL2001, SPECimap2003) Network File systems (SFS97_R1 (3.0)) Web servers (SPEC WEB99, SPEC WEB99 SSL) More information:

SPECInt Integer Benchmarks NameDescription 400.perlbenchProgramming Language 401.bzip2Compression 403.gccC Compiler 429.mcfCombinatorial Optimization 445.gobmkArtificial Intelligence 456.hmmerSearch Gene Sequence 458.sjengArtificial Intelligence 462.libquantumPhysics / Quantum Computing 464.h264refVideo Compression 471.omnetppDiscrete Event Simulation 473.astarPath-finding Algorithms 483.xalancbmkXML Processing

SPECfp Floating Point Benchmarks NameType wupwiseQuantum chromodynamics swimShallow water model mgridmultigrid solver in 3D potential field appluParabolic/elliptic partial dif. equation mesaThree-dimensional graphics library galgelComputational fluid dynamics artImage recognition using neural nets equakeSeismic wave propagation simulation facerecImage recognition of faces ammpComputational chemistry lucasPrimality testing fma3dCrash simulation sixtrackHigh-energy nuclear physics acceleration design apsimeteorology; pollutant distribution

SPEC CPU2006 – Summarizing SPEC ratio: the execution time measurements are normalized by dividing the measured execution time by the execution time on a reference machine Sun Microsystems Fire V20z, which has an AMD Opteron 252 CPU, running at 2600 MHz. 164.gzip benchmark executes in 90.4 s. The reference time for this benchmark is 1400 s, benchmark is 1400/90.4 × 100 = 1548 (a unitless value) Performances of different programs in the suites are summarized using “geometric mean” of SPEC ratios.

Pentium III & Pentium 4

Comparing Pentium III and Pentium 4 RatioPentium IIIPentium 4 CINT2000/Clock rate in MHz CFP2000/Clock rate in MHz Implementation efficiency?

SPEC WEB99 SystemProcessor# of disk drivers # of CPUs # of networks Clock rate (GHz) Result 1550/1000Pentium III Pentium III Pentium III Pentium III Pentium 4 Xeon Pentium 4 Xeon /700Pentium III Xeon Pentium 4 Xeon MP /700Pentium III Xeon

Power Consumption Concerns Performance studied at different levels: 1. Maximum power 2. Intermediate level that conserves battery life 3. Minimum power that maximizes battery life Intel Mobile Pentium & Pentium M: two available clock rates 1. Maximum 2. Reduced clock rate Pentium 1.6/0.6 GHz Pentium 2.4/1.2 GHz Pentium 1.2/0.8 GHz

Three Intel Mobile Processors

Energy Efficiency

Synthetic Benchmarks Artificial programs constructed to try to match the characteristics of a large set of program. Goal: Create a single benchmark program where the execution frequency of instructions in the benchmark simulates the instruction frequency in a large set of benchmarks. Examples: Dhrystone, Whetstone They are not real programs Compiler and hardware optimizations can inflate the improvement far beyond what the same optimization would do with real programs

Amdahl’s Law in Computing Improving one aspect of a machine by a factor of n does not improve the overall performance by the same amount. Speedup = (Performance after imp.) / (Performance before imp.) Speedup = (Execution time before imp.)/ (Execution time after imp.) Execution Time After Improvement = Execution Time Unaffected + (Execution Time Affected/n)

Amdahl’s Law Example: Suppose a program runs in 100 s on a machine, with multiplication responsible for 80 s of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster? Can we improve the performance by a factor 5?

Amdahl’s Law The performance enhancement possible due to a given improvement is limited by the amount that the improved feature is used. In previous example, it makes sense to improve multiplication since it takes 80% of all execution time. But after certain improvement is done, the further effort to optimize the multiplication more will yield insignificant improvement. Law of Diminishing Returns A corollary to Amdahl’s Law is to make a common case faster.

Examples Suppose we enhance a machine making all floating-point instructions run five times faster. If the execution time of some benchmark before the floating-point enhancement is 10 seconds, what will the speedup be if half of the 10 seconds is spent executing floating-point instructions? We are looking for a benchmark to show off the new floating-point unit described above, and want the overall benchmark to show a speedup of 3. One benchmark we are considering runs for 90 seconds with the old floating-point hardware. How much of the execution time would floating- point instructions have to account for in this program in order to yield our desired speedup on this benchmark?

Remember Total execution time is a consistent summary of performance Execution Time = (IC  CPI)/f For a given architecture, performance increases come from: 1. increases in clock rate (without too much adverse CPI effects) 2. improvements in processor organization that lower CPI 3. compiler enhancements that lower CPI and/or IC