CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.

Slides:



Advertisements
Similar presentations
CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Advertisements

CS1104: Computer Organisation School of Computing National University of Singapore.
Performance What differences do we see in performance? Almost all computers operate correctly (within reason) Most computers implement useful operations.
TU/e Processor Design 5Z032 1 Processor Design 5Z032 The role of Performance Henk Corporaal Eindhoven University of Technology 2009.
100 Performance ENGR 3410 – Computer Architecture Mark L. Chang Fall 2006.
Computer Organization and Architecture 18 th March, 2008.
CSCE 212 Chapter 4: Assessing and Understanding Performance Instructor: Jason D. Bakos.
Performance D. A. Patterson and J. L. Hennessey, Computer Organization & Design: The Hardware Software Interface, Morgan Kauffman, second edition 1998.
Copyright © 1998 Wanda Kunkle Computer Organization 1 Chapter 2.1 Introduction.
EET 4250: Chapter 1 Performance Measurement, Instruction Count & CPI Acknowledgements: Some slides and lecture notes for this course adapted from Prof.
Chapter 4 Assessing and Understanding Performance
Fall 2001CS 4471 Chapter 2: Performance CS 447 Jason Bakos.
Lecture 3: Computer Performance
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
CPU Performance Assessment As-Bahiya Abu-Samra *Moore’s Law *Clock Speed *Instruction Execution Rate - MIPS - MFLOPS *SPEC Speed Metric *Amdahl’s.
CMSC 611: Advanced Computer Architecture Benchmarking Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Chapter 1 Section 1.4 Dr. Iyad F. Jafar Evaluating Performance.
1 Computer Performance: Metrics, Measurement, & Evaluation.
Gary MarsdenSlide 1University of Cape Town Computer Architecture – Introduction Andrew Hutchinson & Gary Marsden (me) ( ) 2005.
Copyright 1995 by Coherence LTD., all rights reserved (Revised: Oct 97 by Rafi Lohev, Oct 99 by Yair Wiseman, Sep 04 Oren Kapah) IBM י ב מ 7-1 Measuring.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
CMSC 611 Evaluating Cost Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from David Culler, UC Berkeley.
10/19/2015Erkay Savas1 Performance Computer Architecture – CS401 Erkay Savas Sabanci University.
1 CS/EE 362 Hardware Fundamentals Lecture 9 (Chapter 2: Hennessy and Patterson) Winter Quarter 1998 Chris Myers.
1 Acknowledgements Class notes based upon Patterson & Hennessy: Book & Lecture Notes Patterson’s 1997 course notes (U.C. Berkeley CS 152, 1997) Tom Fountain.
Performance.
1 CS/COE0447 Computer Organization & Assembly Language CHAPTER 4 Assessing and Understanding Performance.
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
CEN 316 Computer Organization and Design Assessing and Understanding Performance Mansour AL Zuair.
Morgan Kaufmann Publishers
CMSC 611: Advanced Computer Architecture Benchmarking Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
1  1998 Morgan Kaufmann Publishers How to measure, report, and summarize performance (suorituskyky, tehokkuus)? What factors determine the performance.
TEST 1 – Tuesday March 3 Lectures 1 - 8, Ch 1,2 HW Due Feb 24 –1.4.1 p.60 –1.4.4 p.60 –1.4.6 p.60 –1.5.2 p –1.5.4 p.61 –1.5.5 p.61.
September 10 Performance Read 3.1 through 3.4 for Wednesday Only 3 classes before 1 st Exam!
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
4. Performance 4.1 Introduction 4.2 CPU Performance and Its Factors
Lecture 5: 9/10/2002CS170 Fall CS170 Computer Organization and Architecture I Ayman Abdel-Hamid Department of Computer Science Old Dominion University.
Lec2.1 Computer Architecture Chapter 2 The Role of Performance.
L12 – Performance 1 Comp 411 Computer Performance He said, to speed things up we need to squeeze the clock Study
CMSC 611: Advanced Computer Architecture Performance & Benchmarks Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some.
Performance Computer Organization II 1 Computer Science Dept Va Tech January 2009 © McQuain & Ribbens Defining Performance Which airplane has.
Jan. 5, 2000Systems Architecture II1 Machine Organization (CS 570) Lecture 2: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed. Oct. 4,
COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Yaohang Li.
CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CSE 340 Computer Architecture Summer 2016 Understanding Performance.
Lecture 3. Performance Prof. Taeweon Suh Computer Science & Engineering Korea University COSE222, COMP212, CYDF210 Computer Architecture.
June 20, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 1: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed.
Additional Examples CSE420/598, Fall 2008.
Measuring Performance II and Logic Design
Computer Architecture & Operations I
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
September 2 Performance Read 3.1 through 3.4 for Tuesday
How do we evaluate computer architectures?
Defining Performance Which airplane has the best performance?
Computer Architecture & Operations I
Morgan Kaufmann Publishers
CSCE 212 Chapter 4: Assessing and Understanding Performance
CS2100 Computer Organisation
Defining Performance Section /14/2018 9:52 PM.
CMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture
Performance of computer systems
CMSC 611: Advanced Computer Architecture
Performance of computer systems
CMSC 611: Advanced Computer Architecture
Parameters that affect it How to improve it and by how much
Performance.
Chapter 2: Performance CS 447 Jason Bakos Fall 2001 CS 447.
CS2100 Computer Organisation
Presentation transcript:

CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / © 2003 Elsevier Science

Advances of the IC technology affect H/W and S/W design philosophy Integrated Circuits: Fueling Innovation Technology innovations over time 2

Moore’s Law Transistors double every year –Revised to every two years –Sometimes revised to performance instead of transistors (Wgsimon - Wikipedia) (Source: Intel) 3

Criteria of performance evaluation differs among users and designers Defining Performance Performance means different things to different people, therefore its assessment is subtle Analogy from the airlines industry: How to measure performance for an airplane? –Cruising speed (How fast it gets to the destination) –Flight range(How far it can reach) –Passenger capacity(How many passengers it can carry) 4

Performance Metrics Response (execution) time: –The time between the start and the completion of a task –Measures user perception of the system speed –Common in reactive and time critical systems –Single-user computer Throughput: –The total number of tasks done in a given time –Relevant to batch processing (billing, credit card processing) –Also many-user services (web servers) Power: –Power consumed or battery life –Especially relevant for mobile 5

Response-time Metric Maximizing performance means minimizing response (execution) time 6

Response-time Metric Performance of Processor P1 is better than P2 if –For a given work load L –P1 takes less time to execute L than P2 7

Response-time Metric Relative performance captures the performance ratio –For the same work load 8

Designer’s Performance Metrics Users and designers measure performance using different metrics –Users: quotable metrics (GHz) –Designers: program execution Designer focuses on reducing the clock cycle time and the number of cycles per program Many techniques to decrease the number of clock cycles also increase the clock cycle time or the average number of cycles per instruction (CPI) 9

A program runs in 10 seconds on a computer “A” with a 400 MHz clock. We desire a faster computer “B” that could run the program in 6 seconds. The designer has determined that a substantial increase in the clock speed is possible, however it would cause computer “B” to require 1.2 times as many clock cycles as computer “A”. What should be the clock rate of computer “B”? Example Why would this happen? –Maybe one operation takes one full clock cycle at 400 MHz (1/400 MHz = 2.5 ns) –All the rest take less than one cycle –Split this operation across multiple cycles –Can now increase clock rate, but also increase total cycles 10

A program runs in 10 seconds on a computer “A” with a 400 MHz clock. We desire a faster computer “B” that could run the program in 6 seconds. The designer has determined that a substantial increase in the clock speed is possible, however it would cause computer “B” to require 1.2 times as many clock cycles as computer “A”. What should be the clock rate of computer “B”? To get the clock rate of the faster computer, we use the same formula Example 11

Or Calculation of CPU Time 12

CPU Time (Cont.) CPU execution time can be measured by running the program The clock cycle is usually published by the manufacture Measuring the CPI and instruction count is not trivial –Instruction counts can be measured by: software profiling, using an architecture simulator, using hardware counters on some architecture –The CPI depends on many factors including: processor structure, memory system, the mix of instruction types and the implementation of these instructions 13

Where: C i is the count of number of instructions of class i executed CPI i is the average number of cycles per instruction for that instruction class n is the number of different instruction classes CPU Time (Cont.) Designers sometimes uses the following formula: 14

Suppose we have two implementation of the same instruction set architecture. Machine “A” has a clock cycle time of 1 ns and a CPI of 2.0 for some program, and machine “B” has a clock cycle time of 2 ns and a CPI of 1.2 for the same program. Which machine is faster for this program and by how much? Both machines execute the same instructions for the program. Assume the number of instructions is “I”, CPU clock cycles (A) = I  2.0CPU clock cycles (B) = I  1.2 The CPU time required for each machine is as follows: CPU time (A) = CPU clock cycles (A)  Clock cycle time (A) = I  2.0  1 ns = 2  I ns CPU time (B) = CPU clock cycles (B)  Clock cycle time (B) = I  1.2  2 ns = 2.4  I ns Therefore machine A will be faster by the following ratio: Example 15

A compiler designer is trying to decide between two code sequences for a particular machine. The hardware designers have supplied the following facts: For a particular high-level language statement, the compiler writer is considering two code sequences that require the following instruction counts: Which code sequence executes the most instructions? Which will be faster? What is the CPI for each sequence? Answer: Sequence 1:executes = 5 instructions Sequence 2:executes = 6 instructions  Comparing Code Segments 16

Sequence 1:CPU clock cycles = (2  1) + (1  2) + (2  3) = 10 cycles Sequence 2:CPU clock cycles = (4  1) + (1  2) + (1  3) = 9 cycles + Therefore Sequence 2 is faster although it executes more instructions Using the formula: Sequence 1:CPI = 10/5 = 2 Sequence 2:CPI = 9/6 = 1.5 Using the formula: + Since Sequence 2 takes fewer overall clock cycles but has more instructions it must have a lower CPI Comparing Code Segments 17

The Role of Performance Hardware performance is a key to the effectiveness of the entire system Performance has to be measured and compared to evaluate designs To optimize the performance, major affecting factors have to be known For different types of applications –different performance metrics may be appropriate –different aspects of a computer system may be most significant Instructions use and implementation, memory hierarchy and I/O handling are among the factors that affect the performance 18

Where: C i is the count of number of instructions of class i executed CPI i is the average number of cycles per instruction for that instruction class n is the number of different instruction classes Calculation of CPU Time 19

Important Equations (so far) 20

A common theme in Hardware design is to make the common case fast Increasing the clock rate would not affect memory access time Using a floating point processing unit does not speed integer ALU operations Example: Floating point instructions improved to run 2X; but only 34% of actual instructions are floating point Exec-Time new = Exec-Time old x ( /2) = 0.83 x Exec-Time old Speedup overall = Exec-Time old / Exec-Time new = 1/0.83 = The performance enhancement possible with a given improvement is limited by the amount that the improved feature is used Amdahl’s Law 21

Ahmdal’s Law Visually Time old Time new 22

Ahmdal’s Law for Speedup 23