Performance Evaluation of Architectures Vittorio Zaccaria.

Presentation transcript:

Performance Evaluation
From the client perspective: response time (or latency), the time needed to run a task.
From the server perspective: throughput (or bandwidth), the number of tasks executed per second.

Speedup
X is n% faster than Y if:
$$\mathrm{Speedup}(x, y) = \frac{\mathrm{ExTime}(y)}{\mathrm{ExTime}(x)} = 1 + \frac{n}{100}$$

Performance and Speedup
$$\mathrm{Performance}(A) = \frac{1}{\mathrm{ExTime}(A)}$$
$$\mathrm{Speedup}(x, y) = \frac{\mathrm{Performance}(x)}{\mathrm{Performance}(y)}$$

Exercise
A executes a task in 10 s. B executes the same task in 15 s. Which statement is true?
1) A is 50% faster than B
2) A is 33% faster than B
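
The exercise can be checked numerically with the speedup definition given earlier, Speedup(x, y) = ExTime(y)/ExTime(x) = 1 + n/100. The following is a minimal Python sketch; the function names `speedup` and `percent_faster` are illustrative and do not come from the slides.

```python
def speedup(ex_time_x: float, ex_time_y: float) -> float:
    """Speedup of X over Y: ExTime(Y) / ExTime(X)."""
    return ex_time_y / ex_time_x


def percent_faster(ex_time_x: float, ex_time_y: float) -> float:
    """Return n such that X is n% faster than Y (Speedup = 1 + n/100)."""
    return (speedup(ex_time_x, ex_time_y) - 1.0) * 100.0


# Values from the exercise: A takes 10 s, B takes 15 s.
print(f"Speedup(A, B) = {speedup(10.0, 15.0):.2f}")
print(f"A is {percent_faster(10.0, 15.0):.0f}% faster than B")
```

Under this definition statement 1 holds. The 33% figure is the reduction in execution time, (15 - 10)/15, which is a different quantity.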

Exercise (15 min)
Linpack and Dhrystone benchmarks on several VAX models.
Table columns: Model, Year, Linpack ExTime, Dhrystone ExTime.
Table rows: VAX-11/, VAX, VAX.

Exercise
Calculate:
In the Linpack case: the total speedup and the average per-year speedup from VAX8600 to VAX780; the same for VAX8550 and VAX8600.
In the Dhrystone case: the total speedup and the average per-year speedup from VAX8600 to VAX780; the same for VAX8550 and VAX8600.

Exercise
Results table columns: Speedup, Average per-year speedup.
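
A possible way to compute the two quantities asked for is sketched below in Python. It assumes that "average per-year speedup" means the compound (geometric) yearly factor, i.e. the factor that reproduces the total speedup when multiplied over the number of years; the slides do not state this explicitly. The execution times and the year span in the example are hypothetical placeholders, not the VAX measurements.

```python
def total_speedup(ex_time_old: float, ex_time_new: float) -> float:
    """Total speedup of the newer machine over the older one."""
    return ex_time_old / ex_time_new


def per_year_speedup(ex_time_old: float, ex_time_new: float, years: float) -> float:
    """Assumption: compound yearly factor, total_speedup ** (1 / years)."""
    return total_speedup(ex_time_old, ex_time_new) ** (1.0 / years)


# Hypothetical numbers, for illustration only.
old_time, new_time, years = 100.0, 25.0, 2
print(f"total speedup    = {total_speedup(old_time, new_time):.2f}")
print(f"per-year speedup = {per_year_speedup(old_time, new_time, years):.2f}")
```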

Amdahl's Law

Amdahl's Law
$$\mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times \left[ (1 - \mathrm{Fraction}_{enhanced}) + \frac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}} \right]$$
$$\mathrm{Speedup}_{overall} = \frac{\mathrm{ExTime}_{old}}{\mathrm{ExTime}_{new}} = \frac{1}{(1 - \mathrm{Fraction}_{enhanced}) + \dfrac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}}}$$
As Speedup_enhanced goes to infinity, Speedup_overall approaches 1/(1 - Fraction_enhanced).
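
The law and its limiting behaviour can be explored with a short Python sketch. The function name `amdahl_speedup` and the 50% fraction used in the loop are illustrative choices, not values from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction of the execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# With half of the execution time enhanced, the overall speedup
# stays below the bound 1 / (1 - 0.5) = 2 however large the enhancement.
for s in (2, 10, 1000):
    print(f"Speedup_enhanced = {s:5d} -> Speedup_overall = {amdahl_speedup(0.5, s):.3f}")
```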

Exercise on Amdahl's Law
Floating-point (FP) instructions are improved to run 2X faster, but only 10% of the instructions actually executed are FP.
Speedup_overall = ?

Exercise on Amdahl's Law
Solution:
$$\mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times (0.9 + 0.1/2) = 0.95 \times \mathrm{ExTime}_{old}$$
$$\mathrm{Speedup}_{overall} = \frac{1}{0.95} = 1.053$$

2nd Exercise on Amdahl's Law
Suppose the CPU speed can be improved 5X (at 5X the CPU cost).
Suppose the CPU is used 50% of the time and the base CPU cost is 1/3 of the cost of the entire system.
Is it worth upgrading the CPU? Compare speedup and cost!

2nd Exercise on Amdahl's Law
Speedup = 1/(0.5 + 0.5/5) = 1.67
Cost increase = (2/3) + (1/3) × 5 = 2.33
The cost grows more than the performance, so it is not worth upgrading the CPU!
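
The cost/performance comparison can be reproduced with a small Python sketch. The helper names `amdahl_speedup` and `relative_cost`, and the decision rule "worth it only if the speedup is at least the cost increase", are assumptions made for illustration.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction of the execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def relative_cost(cpu_cost_share: float, cpu_cost_factor: float) -> float:
    """New system cost relative to the old one when only the CPU cost changes."""
    return (1.0 - cpu_cost_share) + cpu_cost_share * cpu_cost_factor


# Values from the exercise: CPU busy 50% of the time, 5X faster CPU,
# CPU is 1/3 of the system cost, new CPU costs 5X.
s = amdahl_speedup(0.5, 5.0)
c = relative_cost(1.0 / 3.0, 5.0)
worth_it = s >= c  # assumed criterion: performance must grow at least as much as cost
print(f"speedup = {s:.2f}, cost increase = {c:.2f}, worth upgrading: {worth_it}")
```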

Performance Indexes
Response time: the latency to complete a task, including disk accesses, memory accesses, I/O activity, and the time taken by other tasks running concurrently.
CPU time: excludes I/O wait time; it is the sum of the CPU user time and the CPU system (OS) time.

CPU time
$$\mathrm{CPUtime}(P) = \frac{\text{clock cycles needed to execute } P}{\text{clock frequency}}$$

Average CPI
The average number of clock cycles per instruction (CPI) can be defined as:
$$\mathrm{CPI}(P) = \frac{\text{clock cycles needed to execute } P}{\text{number of instructions}}$$
$$\mathrm{CPUtime} = T_{clock} \times \mathrm{CPI} \times N_{inst} = \frac{\mathrm{CPI} \times N_{inst}}{f}$$
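
The two equivalent forms of the CPU time formula can be sketched in Python as follows. The instruction count, CPI and clock frequency in the example are hypothetical, and the function names are illustrative.

```python
def cpu_time_from_frequency(n_inst: float, cpi: float, clock_hz: float) -> float:
    """CPUtime = (CPI * Ninst) / f, in seconds."""
    return cpi * n_inst / clock_hz


def cpu_time_from_period(n_inst: float, cpi: float, t_clock_s: float) -> float:
    """CPUtime = Tclock * CPI * Ninst, in seconds."""
    return t_clock_s * cpi * n_inst


# Hypothetical program: 1e9 instructions, average CPI of 2.2, 500 MHz clock.
n_inst, cpi, f = 1e9, 2.2, 500e6
print(f"CPU time = {cpu_time_from_frequency(n_inst, cpi, f):.2f} s")
print(f"CPU time = {cpu_time_from_period(n_inst, cpi, 1.0 / f):.2f} s")
```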

Aspects of CPU performance
$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

Aspects of CPU performance
The CPI can vary among instructions:
CPI_i is the number of clock cycles needed by instruction type i.
IC_i is the number of times that instruction type i is executed.
$$\text{CPU time} = \mathrm{CycleTime} \times \sum_{i=1}^{n} \mathrm{CPI}_i \times \mathrm{IC}_i$$

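Overall CPI
The overall CPI can be expressed as (CPU clock cycles)/Instructions:
$$\mathrm{CPI} = \sum_{i=1}^{n} \mathrm{CPI}_i \times \frac{\mathrm{IC}_i}{\text{Instructions}}$$
Here IC_i/Instructions is the fraction of executed instructions that are of type i.
Invest resources where time is spent!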

Exercise
A RISC processor shows the following statistics:
Base Machine (Reg / Reg)
Op      Freq  Cycles
ALU     50%   1
Load    20%   5
Store   10%   3
Branch  20%   2
Calculate the average CPI, and the speedup obtained with respect to the base machine by:
The same machine with an improved D$ (Load cycles = 2)
The same machine with a branch CPI = 1
The same machine with 2 ALUs working in parallel.

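Solution
Average CPI: 0.5×1 + 0.2×5 + 0.1×3 + 0.2×2 = 2.2
Use Amdahl's law to compute the overall speedup, taking as Fraction_enhanced the share of execution cycles (not of instructions) spent in the improved operation. Since the instruction count and the clock are unchanged, this is the same as Speedup = CPI_base / CPI_new:
Cache improved: CPI = 1.6, Speedup = 2.2/1.6 = 1.38
Branch improved: CPI = 2.0, Speedup = 2.2/2.0 = 1.10
ALU improved (effective ALU CPI halved to 0.5): CPI = 1.95, Speedup = 2.2/1.95 = 1.13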
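
A minimal Python sketch that recomputes the average CPI from the instruction mix and derives each speedup as CPI_base / CPI_new, which is valid here because the instruction count and the clock do not change. This is equivalent to applying Amdahl's law with the fraction of execution cycles, rather than the instruction frequency, as Fraction_enhanced. Modelling the two parallel ALUs as an effective ALU CPI of 0.5 is an idealising assumption, and the helper names are illustrative.

```python
# Instruction mix of the base machine: op -> (frequency, cycles).
BASE = {"ALU": (0.5, 1.0), "Load": (0.2, 5.0), "Store": (0.1, 3.0), "Branch": (0.2, 2.0)}


def average_cpi(mix: dict) -> float:
    """Weighted CPI: sum over instruction types of frequency * cycles."""
    return sum(freq * cycles for freq, cycles in mix.values())


def with_cycles(mix: dict, op: str, cycles: float) -> dict:
    """Copy of the mix in which one instruction type has a new cycle count."""
    new_mix = dict(mix)
    new_mix[op] = (mix[op][0], cycles)
    return new_mix


def cpi_speedup(mix: dict, op: str, cycles: float) -> float:
    """Speedup = CPI_base / CPI_new for an unchanged instruction count and clock."""
    return average_cpi(mix) / average_cpi(with_cycles(mix, op, cycles))


print(f"base CPI = {average_cpi(BASE):.2f}")
variants = [("improved D$", "Load", 2.0), ("branch CPI = 1", "Branch", 1.0), ("2 ALUs", "ALU", 0.5)]
for label, op, cycles in variants:
    print(f"{label}: speedup = {cpi_speedup(BASE, op, cycles):.3f}")
```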

Exercise
Procedure calls in architecture A are very expensive. Suppose a new architecture B, similar to A, is introduced such that:
A has a clock 5% faster than B.
The fraction of loads/stores in A is 30%.
B executes 30% fewer loads/stores than A.
Loads/stores require 1 clock cycle.
Compare the CPU times of A and B.

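Solution
Number of instructions of B: N_B = [1 - (0.3 × 0.3)] × N_A = 0.91 × N_A
Clock period of B: T_B = 1.05 × T_A
Taking CPI = 1 on both machines:
CPUtime_A = 1 × N_A × T_A
CPUtime_B = 1 × 0.91 × N_A × 1.05 × T_A = 0.9555 × CPUtime_A
B is therefore about 4.5% faster than A, despite its slower clock.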
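
The comparison can be recomputed directly from the exercise data with the short Python sketch below. It assumes, as the solution does, a CPI of 1 for every instruction on both machines; N_A and T_A are normalised to 1 because only the ratio matters, and the variable names are illustrative.

```python
def cpu_time(n_inst: float, cpi: float, t_clock: float) -> float:
    """CPUtime = Ninst * CPI * Tclock."""
    return n_inst * cpi * t_clock


# Normalised values for A; only the ratio B/A is of interest.
n_a, t_a, cpi = 1.0, 1.0, 1.0

load_store_fraction = 0.30   # share of A's instructions that are loads/stores
load_store_reduction = 0.30  # B executes 30% fewer loads/stores than A

n_b = n_a * (1.0 - load_store_fraction * load_store_reduction)
t_b = t_a * 1.05             # A's clock is 5% faster, so B's period is 5% longer

ratio = cpu_time(n_b, cpi, t_b) / cpu_time(n_a, cpi, t_a)
print(f"N_B / N_A = {n_b:.2f}")
print(f"CPUtime_B / CPUtime_A = {ratio:.4f}")
```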

MIPS
MIPS = millions of instructions per second.
$$\mathrm{MIPS} = \frac{\text{number of instructions}}{\text{execution time (in s)} \times 10^6} = \frac{\text{clock frequency}}{\mathrm{CPI} \times 10^6}$$
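
Both forms of the MIPS formula give the same value, as the following Python sketch illustrates. The instruction count, CPI and clock frequency are hypothetical, and the function names are illustrative.

```python
def mips_from_time(n_inst: float, ex_time_s: float) -> float:
    """MIPS = number of instructions / (execution time * 10^6)."""
    return n_inst / (ex_time_s * 1e6)


def mips_from_cpi(clock_hz: float, cpi: float) -> float:
    """MIPS = clock frequency / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)


# Hypothetical program: 1e9 instructions, CPI of 2.0, 500 MHz clock.
n_inst, cpi, f = 1e9, 2.0, 500e6
ex_time = n_inst * cpi / f  # CPU time in seconds
print(f"execution time = {ex_time:.1f} s")
print(f"MIPS (from time) = {mips_from_time(n_inst, ex_time):.1f}")
print(f"MIPS (from CPI)  = {mips_from_cpi(f, cpi):.1f}")
```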

MIPS (cont.)
Problems:
It depends heavily on the ISA, so it is difficult to compare different ISAs.
It depends on the program.
It can vary inversely with performance! A machine with a complex instruction set can have a lower MIPS rating than one with a simple instruction set and still execute programs in less time.
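
The last point can be illustrated with two made-up machines running the same program at the same clock frequency. All numbers and names below are hypothetical and chosen only to show the effect; this is a sketch, not a measurement.

```python
def cpu_time(n_inst: float, cpi: float, clock_hz: float) -> float:
    """CPU time in seconds: Ninst * CPI / f."""
    return n_inst * cpi / clock_hz


def mips(n_inst: float, ex_time_s: float) -> float:
    """MIPS = number of instructions / (execution time * 10^6)."""
    return n_inst / (ex_time_s * 1e6)


CLOCK_HZ = 100e6  # same hypothetical clock for both machines

# Simple ISA: many cheap instructions. Complex ISA: fewer, slower instructions.
simple_n, simple_cpi = 10e6, 1.0
complex_n, complex_cpi = 4e6, 2.0

simple_time = cpu_time(simple_n, simple_cpi, CLOCK_HZ)
complex_time = cpu_time(complex_n, complex_cpi, CLOCK_HZ)

print(f"simple ISA:  {simple_time:.2f} s, {mips(simple_n, simple_time):.0f} MIPS")
print(f"complex ISA: {complex_time:.2f} s, {mips(complex_n, complex_time):.0f} MIPS")
```

In this made-up case the complex-ISA machine has half the MIPS rating yet finishes the program sooner.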

Relative MIPS
Relative MIPS of an architecture A:
$$\mathrm{MIPS}_{relative}(A) = \frac{T_{CPU,\,reference\_arch}}{T_{CPU,\,A}} \times \mathrm{MIPS}_{reference\_arch}$$
In the 1980s the reference architecture was the VAX-11/780.
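
A minimal Python sketch of relative MIPS, using the usual textbook form in which the reference execution time is divided by the execution time on A, so that a faster machine gets a higher rating. The default rating of 1 MIPS for the reference machine follows the commonly cited convention for the VAX-11/780 and is an assumption here; the execution times are hypothetical.

```python
def relative_mips(t_cpu_reference: float, t_cpu_a: float, mips_reference: float = 1.0) -> float:
    """Relative MIPS of A: (T_reference / T_A) * MIPS_reference."""
    return (t_cpu_reference / t_cpu_a) * mips_reference


# Hypothetical timings of the same program: 20 s on the reference machine, 4 s on A.
print(f"relative MIPS of A = {relative_mips(20.0, 4.0):.1f}")
```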