Performance
Computer Organization II, Computer Science Dept, Va Tech, January 2009. ©2006-2009 McQuain & Ribbens


Defining Performance
Which airplane has the best performance?

Computer Performance: TIME, TIME, TIME
- Response time (latency): how long it takes to perform a task
- Throughput: total work done per unit time
For now, we will focus on response time.

Execution Time
- Elapsed time
  - counts everything (disk and memory accesses, I/O, etc.)
  - a useful number, but often not good for comparison purposes
- CPU time
  - doesn't count I/O or time spent running other programs
  - can be broken up into system time and user time
- Our focus: user CPU time
  - time spent executing the lines of code that are "in" our program
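The distinction between elapsed time and CPU time can be observed directly. The following is a minimal sketch, assuming Python's standard `time` and `os` modules; the helper names `measure` and `busy_loop` are made up for illustration and are not part of the slides.

```python
import os
import time


def measure(work):
    """Run work() once; return elapsed, user CPU, and system CPU seconds."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) timer
    cpu_start = os.times()             # process CPU times so far
    work()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "elapsed": wall_end - wall_start,
        "user": cpu_end.user - cpu_start.user,
        "system": cpu_end.system - cpu_start.system,
    }


def busy_loop(n=200_000):
    """CPU-bound work: most of its elapsed time should be user CPU time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print("compute:", measure(busy_loop))
    # Sleeping adds elapsed time but almost no CPU time.
    print("sleep:  ", measure(lambda: time.sleep(0.1)))
```

On most systems the sleeping case should report an elapsed time near 0.1 s with user and system times near zero, which is why elapsed time alone is a poor basis for comparing processors.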

Relative Performance
Example: time taken to run a program
- 10s on A, 15s on B
- Execution Time B / Execution Time A = 15s / 10s = 1.5
- So A is 1.5 times faster than B
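The ratio used in this example can be written as a small helper. This is only a sketch; the function name `relative_performance` is an assumption introduced here.

```python
def relative_performance(time_x, time_y):
    """Return n such that X is n times faster than Y.

    Both execution times must be in the same unit. Since performance is
    1 / execution time, Performance X / Performance Y = time_y / time_x.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


if __name__ == "__main__":
    # Values from the slide: 10s on A, 15s on B.
    print(relative_performance(10, 15))
```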

CPU Clocking
Operation of digital hardware is governed by a constant-rate clock.
- Clock period: duration of a clock cycle
  - e.g., 250ps = 0.25ns = 250×10^-12 s
- Clock frequency (rate): cycles per second
  - e.g., 4.0GHz = 4000MHz = 4.0×10^9 Hz
[Figure: clock waveform marking the clock period; in each cycle, data transfer and computation are followed by a state update]
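Clock period and clock frequency are reciprocals, which is how the two examples on this slide relate. A minimal sketch of the conversion follows; the function names are illustrative assumptions.

```python
def period_to_frequency_hz(period_s):
    """Clock frequency in Hz for a clock period given in seconds."""
    if period_s <= 0:
        raise ValueError("clock period must be positive")
    return 1.0 / period_s


def frequency_to_period_s(frequency_hz):
    """Clock period in seconds for a clock frequency given in Hz."""
    if frequency_hz <= 0:
        raise ValueError("clock frequency must be positive")
    return 1.0 / frequency_hz


if __name__ == "__main__":
    # 250 ps period from the slide, reported in GHz.
    print(f"{period_to_frequency_hz(250e-12) / 1e9:.1f} GHz")
    # 4.0 GHz clock from the slide, reported in ps.
    print(f"{frequency_to_period_s(4.0e9) * 1e12:.0f} ps")
```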

CPU Time
Performance improved by
- Reducing number of clock cycles
- Increasing clock rate
- Hardware designer must often trade off clock rate against cycle count
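The two levers listed here follow from the standard relation CPU time = clock cycles × clock cycle time = clock cycles / clock rate. A minimal sketch of that relation, with a function name and sample numbers chosen here purely for illustration:

```python
def cpu_time_from_cycles(clock_cycles, clock_rate_hz):
    """CPU time in seconds = clock cycles / clock rate (cycles per second)."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return clock_cycles / clock_rate_hz


if __name__ == "__main__":
    base = cpu_time_from_cycles(20e9, 2e9)           # hypothetical baseline
    fewer_cycles = cpu_time_from_cycles(10e9, 2e9)   # halve the cycle count
    faster_clock = cpu_time_from_cycles(20e9, 4e9)   # double the clock rate
    print(base, fewer_cycles, faster_clock)
```

Either change alone halves CPU time in this sketch; the trade-off arises because a faster clock often comes with a higher cycle count.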

CPU Time Example
Computer A: 2GHz clock, 10s CPU time
Designing Computer B
- Aim for 6s CPU time
- Can do faster clock, but causes 1.2 × clock cycles
How fast must Computer B clock be?
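One way to work this example is to recover A's cycle count from its CPU time and clock rate, scale it by 1.2, and divide by B's target time. The sketch below does that; the function and parameter names are assumptions made for illustration.

```python
def required_clock_rate_hz(rate_a_hz, time_a_s, target_time_s, cycle_factor):
    """Clock rate B needs in order to hit target_time_s.

    cycles_a = time_a_s * rate_a_hz
    cycles_b = cycle_factor * cycles_a
    rate_b   = cycles_b / target_time_s
    """
    cycles_a = time_a_s * rate_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / target_time_s


if __name__ == "__main__":
    rate_b = required_clock_rate_hz(2e9, 10, 6, 1.2)
    print(f"{rate_b / 1e9:.1f} GHz")
```

With the slide's numbers, A executes 20×10^9 cycles, B needs 24×10^9 cycles, and finishing those in 6s requires a 4GHz clock.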

How many cycles are required for a program?
Could assume that number of cycles equals number of instructions:
[Figure: timeline (time axis) marking the 1st, 2nd, 3rd, 4th, 5th, 6th, ... instructions]
This assumption is incorrect: different instructions take different amounts of time on different machines.
Why? Hint: remember that these are machine instructions, not lines of C code.

Now that we understand cycles…
A given program will require
- some number of instructions (machine instructions)
- some number of cycles
- some number of seconds
We have a vocabulary that relates these quantities:
- cycle time (seconds per cycle)
- clock rate (cycles per second)
- CPI (cycles per instruction)
  - a floating-point-intensive application might have a higher CPI
- MIPS (millions of instructions per second)
  - this would be higher for a program using simple instructions
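These quantities are tied together by the usual relations: seconds = instructions × CPI × cycle time, and MIPS = instructions / (seconds × 10^6). A minimal sketch follows; the function names and the sample program (10^9 instructions, CPI of 2.0, 250ps cycle) are hypothetical.

```python
def execution_seconds(instruction_count, cpi, cycle_time_s):
    """Seconds = instructions x (cycles / instruction) x (seconds / cycle)."""
    return instruction_count * cpi * cycle_time_s


def mips_rating(instruction_count, seconds):
    """Millions of instructions executed per second."""
    if seconds <= 0:
        raise ValueError("execution time must be positive")
    return instruction_count / (seconds * 1e6)


if __name__ == "__main__":
    ic, cpi, cycle_time = 1e9, 2.0, 250e-12   # hypothetical program and machine
    seconds = execution_seconds(ic, cpi, cycle_time)
    print(f"{seconds:.3f} s, {mips_rating(ic, seconds):.0f} MIPS")
```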

Performance
Performance is determined by execution time.
Do any of the other variables equal performance?
- # of cycles to execute program?
- # of instructions in program?
- # of cycles per second?
- average # of cycles per instruction?
- average # of instructions per second?
Common pitfall: thinking one of the variables is indicative of performance when it really isn't.

CPI Example
Computer A: Cycle Time = 250ps, CPI = 2.0
Computer B: Cycle Time = 500ps, CPI = 1.2
Same ISA
Which is faster, and by how much?
A is faster; the ratio CPU Time B / CPU Time A gives by how much.
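Because both computers run the same ISA, the instruction count is the same on each and cancels out, so it is enough to compare time per instruction (CPI × cycle time). A sketch of that comparison, with an illustrative helper name:

```python
def time_per_instruction_ps(cpi, cycle_time_ps):
    """Average picoseconds per instruction = CPI x cycle time."""
    return cpi * cycle_time_ps


if __name__ == "__main__":
    a = time_per_instruction_ps(2.0, 250)   # Computer A
    b = time_per_instruction_ps(1.2, 500)   # Computer B
    faster = "A" if a < b else "B"
    ratio = max(a, b) / min(a, b)
    print(f"A: {a:.0f} ps, B: {b:.0f} ps; {faster} is faster by {ratio:.2f}x")
```

A needs about 500ps per instruction and B about 600ps, so A is faster by a factor of about 1.2 despite its higher CPI.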

CPI in More Detail
If different instruction classes take different numbers of cycles, the total clock cycle count is the sum over classes of (CPI for the class × instruction count for the class).
Weighted average CPI: the overall CPI is the average of the per-class CPIs, each weighted by the relative frequency of its class.
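The equations on this slide did not survive extraction. The standard form of the weighted-average relation is sketched below, assuming n instruction classes, with CPI_i and IC_i the cycles per instruction and instruction count of class i, and IC the total instruction count.

```latex
\[
\text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{IC}_i \right)
\]
\[
\text{CPI} = \frac{\text{Clock Cycles}}{\text{IC}}
           = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{IC}_i}{\text{IC}} \right)
\]
```

The factor IC_i / IC is the relative frequency of class i.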

CPI Example
Alternative compiled code sequences using instructions in classes A, B, C

Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1

Sequence 1: IC = 5
- Clock Cycles = 2×1 + 1×2 + 2×3 = 10
- Avg. CPI = 10/5 = 2.0
Sequence 2: IC = 6
- Clock Cycles = 4×1 + 1×2 + 1×3 = 9
- Avg. CPI = 9/6 = 1.5
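The arithmetic in this example can be checked with a short script. This is a sketch only; the dictionary layout and the function name `cycles_and_cpi` are choices made here for illustration.

```python
CLASS_CPI = {"A": 1, "B": 2, "C": 3}   # cycles per instruction, by class


def cycles_and_cpi(instruction_counts):
    """Return (total clock cycles, average CPI) for per-class counts."""
    total_ic = sum(instruction_counts.values())
    cycles = sum(CLASS_CPI[cls] * n for cls, n in instruction_counts.items())
    return cycles, cycles / total_ic


if __name__ == "__main__":
    sequence_1 = {"A": 2, "B": 1, "C": 2}
    sequence_2 = {"A": 4, "B": 1, "C": 1}
    print(cycles_and_cpi(sequence_1))
    print(cycles_and_cpi(sequence_2))
```

Sequence 2 executes more instructions but fewer cycles, so on the same clock it is the faster of the two.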

# of Instructions Example
A compiler designer is trying to decide between two code sequences for a particular machine. Based on the hardware implementation, there are three different classes of instructions: Class A, Class B, and Class C, and they require one, two, and three cycles (respectively).
- The first code sequence has 5 instructions: 2 of A, 1 of B, and 2 of C.
- The second sequence has 6 instructions: 4 of A, 1 of B, and 1 of C.
Which sequence will be faster? By how much? What is the CPI for each sequence?

MIPS example
Two different compilers are being tested for a 4 GHz machine with three different classes of instructions: Class A, Class B, and Class C, which require one, two, and three cycles (respectively). Both compilers are used to produce code for a large piece of software.
- The first compiler's code uses 5 million Class A instructions, 1 million Class B instructions, and 2 million Class C instructions.
- The second compiler's code uses 10 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
Which sequence will be faster according to MIPS? Which sequence will be faster according to execution time?
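Both questions can be answered by computing cycles, execution time, and MIPS for each compiler's output. The sketch below does so under the slide's stated numbers; the names `evaluate`, `COMPILER_1`, and `COMPILER_2` are introduced here for illustration.

```python
CLASS_CPI = {"A": 1, "B": 2, "C": 3}   # cycles per instruction, by class
CLOCK_RATE_HZ = 4e9                    # 4 GHz machine

COMPILER_1 = {"A": 5_000_000, "B": 1_000_000, "C": 2_000_000}
COMPILER_2 = {"A": 10_000_000, "B": 1_000_000, "C": 1_000_000}


def evaluate(instruction_counts):
    """Return (execution time in seconds, MIPS) for per-class counts."""
    total_ic = sum(instruction_counts.values())
    cycles = sum(CLASS_CPI[cls] * n for cls, n in instruction_counts.items())
    seconds = cycles / CLOCK_RATE_HZ
    mips = total_ic / (seconds * 1e6)
    return seconds, mips


if __name__ == "__main__":
    for name, counts in (("compiler 1", COMPILER_1), ("compiler 2", COMPILER_2)):
        seconds, mips = evaluate(counts)
        print(f"{name}: {seconds * 1e3:.2f} ms, {mips:.0f} MIPS")
```

Compiler 2's code earns the higher MIPS rating (about 3200 versus about 2462), yet compiler 1's code finishes sooner (3.25 ms versus 3.75 ms), which illustrates why MIPS alone is a misleading measure.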

Amdahl's Law
Execution Time After Improvement = Execution Time Unaffected + (Execution Time Affected / Amount of Improvement)
Example: Suppose a program runs in 100 seconds on a machine, with multiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster? How about making it 5 times faster?
Principle: Make the common case fast
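The example can be solved by rearranging the equation above for the amount of improvement. A sketch, assuming a hypothetical helper `required_improvement` that returns `None` when no finite improvement can reach the target:

```python
def required_improvement(total_s, affected_s, target_speedup):
    """Improvement factor needed on the affected part, or None if impossible.

    target time = unaffected + affected / improvement, so
    improvement = affected / (target time - unaffected).
    """
    unaffected_s = total_s - affected_s
    target_time_s = total_s / target_speedup
    remaining_s = target_time_s - unaffected_s   # time left for the affected part
    if remaining_s <= 0:
        return None   # the unaffected part alone already uses the whole budget
    return affected_s / remaining_s


if __name__ == "__main__":
    print(required_improvement(100, 80, 4))
    print(required_improvement(100, 80, 5))
```

For a 4× overall speedup, multiplication must become 16 times faster. A 5× speedup would require the program to finish in 20 seconds, which is exactly the time spent outside multiplication, so no improvement to multiply alone can achieve it.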

Example
Suppose we enhance a machine making all floating-point instructions run five times faster. If the execution time of some benchmark before the floating-point enhancement is 10 seconds, what will the speedup be if half of the 10 seconds is spent executing floating-point instructions?
We are looking for a benchmark to show off the new floating-point unit described above, and want the overall benchmark to show a speedup of 3. One benchmark we are considering runs for 100 seconds with the old floating-point hardware. How much of the execution time would floating-point instructions have to account for in this program in order to yield our desired speedup on this benchmark?
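Both parts of this exercise follow from Amdahl's Law written in terms of the fraction of time affected. The sketch below is one way to work them; the function names are assumptions made for illustration.

```python
def overall_speedup(affected_fraction, improvement):
    """Overall speedup when affected_fraction of the time is sped up."""
    return 1.0 / ((1.0 - affected_fraction) + affected_fraction / improvement)


def required_fraction(target_speedup, improvement):
    """Fraction of original time that must be affected to reach the target.

    Solves 1 / target = (1 - f) + f / improvement for f.
    """
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / improvement)


if __name__ == "__main__":
    # Part 1: half of 10 seconds is floating point, made 5 times faster.
    print(f"speedup: {overall_speedup(0.5, 5):.2f}")
    # Part 2: floating-point seconds needed in a 100-second benchmark.
    print(f"floating-point time: {100 * required_fraction(3, 5):.1f} s")
```

In the first part the new time is 5 + 5/5 = 6 seconds, a speedup of about 1.67. In the second, floating point would have to account for about 83.3 of the 100 seconds.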

Remember
- Performance is specific to a particular program or set of programs
  - Total execution time is a consistent summary of performance
- For a given architecture, performance increases come from:
  - increases in clock rate (without adverse CPI effects)
  - improvements in processor organization that lower CPI
  - compiler enhancements that lower CPI and/or instruction count
  - algorithm/language choices that affect instruction count
- Pitfall: expecting an improvement in one aspect of a machine's performance to improve total performance by a proportional amount

Performance Summary
Performance depends on
- Algorithm: affects IC, possibly CPI
- Programming language: affects IC, CPI
- Compiler: affects IC, CPI
- Instruction set architecture: affects IC, CPI, Tc (clock cycle time)