BITS Pilani, Pilani Campus Today’s Agenda Role of Performance.

Presentation transcript:

BITS Pilani, Pilani Campus Today’s Agenda Role of Performance

Performance
Performance is the key to understanding the underlying motivation for the hardware and its organization.
Measure, report, and summarize performance to enable users to
– make intelligent choices
– see through the marketing hype!
Why is some hardware better than others for different programs?
What factors of system performance are hardware related? (e.g., do we need a new machine, or a new operating system?)
How does the machine's instruction set affect performance?

Table columns: Airplane, Passengers, Range (mi), Speed (mph)
Airplanes listed: Boeing, Boeing, BAC/Sud Concorde, Douglas DC
What do we measure? Define performance…
– How much faster is the Concorde compared to the 747?
– How much bigger is the Boeing 747 than the Douglas DC-8?
– So which of these airplanes has the best performance?

Defining Performance
Which airplane has the best performance?

Computer Performance: TIME, TIME, TIME!!!
Response time (elapsed time, latency), an individual user's concern:
– How long does it take for my job to run?
– How long does it take to execute (start to finish) my job?
– How long must I wait for the database query?
Throughput, a systems manager's concern:
– How many jobs can the machine run at once?
– What is the average execution rate?
– How much work is getting done?

Execution Time
Elapsed time
– counts everything (disk and memory accesses, waiting for I/O, running other programs, etc.) from start to finish
– a useful number, but often not good for comparison purposes
elapsed time = CPU time + wait time (I/O, other programs, etc.)
CPU time
– doesn't count waiting for I/O or time spent running other programs
– can be divided into user CPU time and system CPU time (OS calls)
CPU time = user CPU time + system CPU time
⇒ elapsed time = user CPU time + system CPU time + wait time
Our focus: user CPU time (CPU execution time or, simply, execution time)
– time spent executing the lines of code that are in our program
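
The split between elapsed, user CPU, and system CPU time can be observed directly. The following is a minimal sketch, assuming a Python interpreter whose `os.times()` reports per-process user and system CPU time; `measure` and `busy_loop` are hypothetical names introduced only for illustration, and the CPU timers may be coarse on some platforms.

```python
import os
import time


def measure(work):
    """Run work() and return (elapsed, user_cpu, system_cpu) in seconds."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    work()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    elapsed = wall_after - wall_before
    user_cpu = cpu_after.user - cpu_before.user
    system_cpu = cpu_after.system - cpu_before.system
    return elapsed, user_cpu, system_cpu


def busy_loop():
    """Hypothetical CPU-bound workload used only for illustration."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


elapsed, user_cpu, system_cpu = measure(busy_loop)
# wait time = elapsed time - (user CPU time + system CPU time);
# it can come out slightly negative when the CPU timers are coarse.
wait = elapsed - (user_cpu + system_cpu)
print(f"elapsed={elapsed:.4f}s user={user_cpu:.4f}s "
      f"system={system_cpu:.4f}s wait={wait:.4f}s")
```

For a CPU-bound workload like this one, most of the elapsed time should show up as user CPU time; a workload that sleeps or waits on I/O would show a larger wait component.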

Definition of Performance
For some program running on machine X:
$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$
"X is n times faster than Y" means:
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = n$$
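
As a small illustration of this definition, the sketch below computes n from two execution times. The function names and the 10 s and 15 s timings are hypothetical and not taken from the slides.

```python
def performance(execution_time_s):
    """Performance is the reciprocal of execution time."""
    return 1.0 / execution_time_s


def times_faster(time_x_s, time_y_s):
    """Return n such that X is n times faster than Y.

    Performance_X / Performance_Y simplifies to time_Y / time_X.
    """
    return time_y_s / time_x_s


# Hypothetical timings: the same program takes 10 s on X and 15 s on Y.
n = times_faster(10.0, 15.0)
print(f"X is {n:.2f} times faster than Y")
```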

Clock Cycles
Instead of reporting execution time in seconds, we often use cycles. In modern computers, hardware events progress cycle by cycle: each event (e.g., multiplication, addition) is a sequence of cycles.
Clock ticks indicate the start and end of cycles:
– cycle time = time between ticks = seconds per cycle
– clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec, 1 MHz = 10^6 cycles/sec)
Example: a 200 MHz clock has a cycle time of 1 / (200 × 10^6) s = 5 ns.

Performance Equation I
$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$
or, equivalently,
$$\text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$
So, to improve performance one can either:
– reduce the number of cycles for a program, or
– reduce the clock cycle time, or, equivalently,
– increase the clock rate

How many cycles are required for a program?
Could assume that # of cycles = # of instructions (timeline figure: 1st instruction, 2nd instruction, 3rd instruction, 4th, 5th, 6th, ... along the time axis).
This assumption is incorrect, because different instructions take different amounts of time (cycles).

How many cycles are required for a program?
– Multiplication takes more time than addition
– Floating point operations take longer than integer ones
– Accessing memory takes more time than accessing registers
Important point: changing the cycle time often changes the number of cycles required for various instructions, because it means changing the hardware design.

Example
Our favorite program runs in 10 seconds on computer A, which has a 4 GHz clock. We are trying to help a computer designer build a new machine B that will run this program in 6 seconds. The designer can use new (or perhaps more expensive) technology to substantially increase the clock rate, but has informed us that this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for the same program. What clock rate should we tell the designer to target?
Clock cycles(A) = 10 s × 4 × 10^9 cycles/s = 40 × 10^9
6 s = 1.2 × 40 × 10^9 / Clock rate(B)
⇒ Clock rate(B) = 8 GHz
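
The same calculation can be written as a short function based on Performance Equation I. This is only a sketch; `required_clock_rate` and its parameter names are hypothetical, while the numbers are the ones given in the example.

```python
def required_clock_rate(time_a_s, clock_rate_a_hz, cycle_ratio, target_time_b_s):
    """Clock rate machine B needs to finish in target_time_b_s seconds.

    cycle_ratio is (clock cycles on B) / (clock cycles on A).
    """
    cycles_a = time_a_s * clock_rate_a_hz   # cycles = time x clock rate
    cycles_b = cycle_ratio * cycles_a
    return cycles_b / target_time_b_s       # clock rate = cycles / time


rate_b = required_clock_rate(
    time_a_s=10.0,
    clock_rate_a_hz=4e9,
    cycle_ratio=1.2,
    target_time_b_s=6.0,
)
print(f"Target clock rate for B: {rate_b / 1e9:.1f} GHz")
```

Running it should report a target of 8.0 GHz, matching the hand calculation.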

Terminology
A given program will require:
– some number of instructions (machine instructions)
– some number of cycles
– some number of seconds
We have a vocabulary that relates these quantities:
– cycle time (seconds per cycle)
– clock rate (cycles per second)
– (average) CPI (cycles per instruction): a floating point intensive application might have a higher average CPI
– MIPS (millions of instructions per second): this would be higher for a program using simple instructions

Performance Measure
Performance is determined by execution time.
Do any of these other variables equal performance?
– # of cycles to execute program?
– # of instructions in program?
– # of cycles per second?
– average # of cycles per instruction?
– average # of instructions per second?
Common pitfall: thinking one of the variables is indicative of performance when it really isn't.

Performance Equation II
$$\text{CPU execution time for a program} = \text{Instruction count} \times \text{average CPI} \times \text{Clock cycle time}$$
Derive the above equation from Performance Equation I.
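
One way to see the link between the two equations is that clock cycles equal instruction count times average CPI. The sketch below expresses Equation II in terms of Equation I; the function names and the sample workload (10^9 instructions, CPI 2.0, 250 ps cycle time) are assumptions made up for illustration.

```python
def cpu_time_from_cycles(clock_cycles, cycle_time_s):
    """Performance Equation I: CPU time = clock cycles x clock cycle time."""
    return clock_cycles * cycle_time_s


def cpu_time_from_instructions(instruction_count, average_cpi, cycle_time_s):
    """Performance Equation II, obtained by substituting
    clock cycles = instruction count x average CPI into Equation I."""
    clock_cycles = instruction_count * average_cpi
    return cpu_time_from_cycles(clock_cycles, cycle_time_s)


# Hypothetical workload: 1e9 instructions, average CPI 2.0, 250 ps cycle time.
cpu_time = cpu_time_from_instructions(1_000_000_000, 2.0, 250e-12)
print(f"CPU time: {cpu_time:.3f} s")
```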

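CPI Example
Computer A: cycle time = 250 ps, CPI = 2.0
Computer B: cycle time = 500 ps, CPI = 1.2
Same ISA, so both execute the same instruction count I.
Which is faster, and by how much?
CPU time(A) = I × 2.0 × 250 ps = 500 × I ps
CPU time(B) = I × 1.2 × 500 ps = 600 × I ps
A is faster, by a factor of CPU time(B) / CPU time(A) = 600 / 500 = 1.2.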

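CPI in More Detail
If different instruction classes take different numbers of cycles:
$$\text{Clock cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{IC}_i \right)$$
Weighted average CPI:
$$\text{CPI} = \frac{\text{Clock cycles}}{\text{IC}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{IC}_i}{\text{IC}} \right)$$
where IC_i / IC is the relative frequency of instruction class i.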

CPI Example
Alternative compiled code sequences using instructions in classes A, B, C:
Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1
Sequence 1: IC = 5, Clock cycles = 2×1 + 1×2 + 2×3 = 10, Avg. CPI = 10/5 = 2.0
Sequence 2: IC = 6, Clock cycles = 4×1 + 1×2 + 1×3 = 9, Avg. CPI = 9/6 = 1.5
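
The two sequences can be checked with a few lines of code. This is a minimal sketch using the class CPIs and instruction counts from the example; the dictionary layout and function names are assumptions chosen for illustration.

```python
# Cycles per instruction for each class, from the example.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(instruction_counts):
    """Total cycles: sum over classes of CPI_i x IC_i."""
    return sum(CPI_BY_CLASS[cls] * count
               for cls, count in instruction_counts.items())


def average_cpi(instruction_counts):
    """Weighted average CPI: total cycles / total instruction count."""
    return clock_cycles(instruction_counts) / sum(instruction_counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("Sequence 1", sequence_1), ("Sequence 2", sequence_2)):
    print(f"{name}: IC={sum(seq.values())}, "
          f"cycles={clock_cycles(seq)}, avg CPI={average_cpi(seq):.1f}")
```

Sequence 2 executes more instructions but needs fewer cycles, so on the same clock it would finish sooner.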

MIPS Example
Two different compilers are being tested for a 4 GHz machine with three different classes of instructions: Class A, Class B, and Class C, which require 1, 2, and 3 cycles, respectively. Both compilers are used to produce code for a large piece of software.
– Compiler 1 generates code with 5 billion Class A instructions, 1 billion Class B instructions, and 1 billion Class C instructions.
– Compiler 2 generates code with 10 billion Class A instructions, 1 billion Class B instructions, and 1 billion Class C instructions.
Which sequence will be faster according to MIPS?
Which sequence will be faster according to execution time?
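
One way to work through this example is to compute both metrics for each compiler. The sketch below uses the clock rate, class CPIs, and instruction counts stated in the problem; the function names are hypothetical, and MIPS is taken as instruction count divided by (execution time × 10^6), following the definition in the Terminology slide.

```python
CLOCK_RATE_HZ = 4e9                      # 4 GHz machine from the example
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def execution_time_s(instruction_counts):
    """Execution time = total clock cycles / clock rate."""
    cycles = sum(CPI_BY_CLASS[cls] * count
                 for cls, count in instruction_counts.items())
    return cycles / CLOCK_RATE_HZ


def mips(instruction_counts):
    """MIPS = instruction count / (execution time x 10^6)."""
    total_instructions = sum(instruction_counts.values())
    return total_instructions / (execution_time_s(instruction_counts) * 1e6)


compiler_1 = {"A": 5_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000}
compiler_2 = {"A": 10_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000}

for name, code in (("Compiler 1", compiler_1), ("Compiler 2", compiler_2)):
    print(f"{name}: time={execution_time_s(code):.2f} s, MIPS={mips(code):.0f}")
```

With these inputs, Compiler 2 earns the higher MIPS rating (3200 versus 2800), yet Compiler 1's code finishes sooner (2.50 s versus 3.75 s), which illustrates the pitfall of treating MIPS as a measure of performance.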

Performance Summary
Performance depends on
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI, T_c (clock cycle time)