
1 Lecture 2: System Metrics and Pipelining
Today's topics (Sections 1.6, 1.7, 1.9, A.1):
 • Quantitative principles of computer design
 • Measuring cost and dependability
 • Introduction to pipelining
The class web page and class mailing list are now functional. Assignment 1 will be posted later today; it is due in 12 days.

2 Amdahl's Law
Architecture design is very bottleneck-driven: make the common case fast, and do not waste resources on a component that has little impact on overall performance/power.
Amdahl's Law: the performance improvement gained from an enhancement is limited by the fraction of time the enhancement comes into play.
Example: a web server spends 40% of its time in the CPU and 60% doing I/O. A new processor that is ten times faster yields a 36% reduction in execution time (a speedup of 1.56). Amdahl's Law says the maximum possible reduction in execution time is 40% (a maximum speedup of 1.66), no matter how fast the new processor is; the arithmetic is worked out below.
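A compact restatement (not taken verbatim from the slide) makes the arithmetic explicit. Here f denotes the fraction of execution time the enhancement applies to and s the local speedup of that fraction; both symbols are introduced only for this illustration:

    \[
    \text{Speedup}_{\text{overall}} \;=\; \frac{1}{(1-f) + f/s}
    \]
    \[
    f = 0.4,\; s = 10:\quad \frac{1}{0.6 + 0.04} = \frac{1}{0.64} \approx 1.56
    \qquad\qquad
    s \to \infty:\quad \frac{1}{0.6} \approx 1.66
    \]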

3 Principle of Locality
Most programs are predictable in terms of the instructions they execute and the data they access.
The 90/10 rule of thumb: a program spends 90% of its execution time in only 10% of its code.
Temporal locality: if a program has just accessed location X, it will likely access X again soon.
Spatial locality: if a program has just accessed location X, it will likely access nearby locations (such as X+1) soon. The short C sketch below illustrates the difference.
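A minimal C sketch of spatial locality (the 1024x1024 array size and the two traversal orders are illustrative choices, not something from the slides). Both loops compute the same sum, but the row-major loop walks memory sequentially and uses every word of each cache line it fetches, while the column-major loop strides across rows and wastes most of each line:

    #include <stdio.h>

    #define N 1024

    static int a[N][N];        /* C stores 2-D arrays row-major */

    /* Row-major traversal: consecutive accesses touch adjacent
       addresses, so spatial locality is good. */
    long sum_row_major(void) {
        long sum = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += a[i][j];
        return sum;
    }

    /* Column-major traversal: consecutive accesses are N*sizeof(int)
       bytes apart, so most of each fetched cache line goes unused. */
    long sum_col_major(void) {
        long sum = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += a[i][j];
        return sum;
    }

    int main(void) {
        printf("%ld %ld\n", sum_row_major(), sum_col_major());
        return 0;
    }

On most machines the row-major version runs noticeably faster even though the two functions do identical arithmetic.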

4 Exploit Parallelism Most operations do not depend on each other – hence, execute them in parallel At the circuit level, simultaneously access multiple ways of a set-associative cache At the organization level, execute multiple instructions at the same time At the system level, execute a different program while one is waiting on I/O

5 Factors Determining Cost Cost: amount spent by manufacturer to produce a finished good High volume  faster learning curve, increased manufacturing efficiency (10% lower cost if volume doubles), lower R&D cost per produced item Commodities: identical products sold by many vendors in large volumes (keyboards, DRAMs) – low cost because of high volume and competition among suppliers

6 Wafers and Dies An entire wafer is produced and chopped into dies that undergo testing and packaging

7 Integrated Circuit Cost
Cost of an integrated circuit = (cost of die + cost of packaging and testing) / final test yield
Cost of die = cost of wafer / (dies per wafer × die yield)
Dies per wafer = π × (wafer diameter / 2)² / die area − π × wafer diameter / √(2 × die area), where √(2 × die area) is the die diagonal
Die yield = wafer yield × (1 + (defects per unit area × die area) / α)^(−α)
Thus, die yield depends on die area and on α, which captures the complexity arising from multiple manufacturing steps (α ≈ 4.0).

8 Integrated Circuit Cost Examples
A 30 cm diameter wafer cost $5-6K in 2001. Such a wafer yields about 366 good 1 cm² dies or, alternatively, 1014 good 0.49 cm² dies (note the effect of die area on yield).
Die sizes: Alpha, Itanium 3.0 cm², embedded processors between 0.1 and 0.25 cm².
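These counts follow from the formulas on the previous slide if one assumes a defect density of roughly 0.6 defects per cm², α = 4, and 100% wafer yield (these parameter values are assumptions made here to reproduce the slide's numbers; they are not stated on the slide). For the 1 cm² die on a 30 cm wafer:

    \[
    \text{Dies per wafer} = \frac{\pi (30/2)^2}{1} - \frac{\pi \cdot 30}{\sqrt{2 \cdot 1}} \approx 707 - 67 = 640
    \]
    \[
    \text{Die yield} = \left(1 + \frac{0.6 \times 1}{4}\right)^{-4} \approx 0.572
    \qquad\Rightarrow\qquad
    640 \times 0.572 \approx 366 \text{ good dies}
    \]

The same arithmetic for a 0.49 cm² die gives about 1347 candidate dies, a die yield of roughly 0.75, and about 1014 good dies.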

9 Contribution of IC Costs to Total System Cost

Subsystem                                                 Fraction of total cost
Cabinet (sheet metal, plastic, power supply, fans,
  cables, nuts, bolts, manuals, shipping box)                     6%
Processor                                                        22%
DRAM (128 MB)                                                     5%
Video card                                                        5%
Motherboard                                                       5%
  Processor board subtotal                                       37%
Keyboard and mouse                                                3%
Monitor                                                          19%
Hard disk (20 GB)                                                 9%
DVD drive                                                         6%
  I/O devices subtotal                                           37%
Software (OS + Office)                                           20%

10 Defining Fault, Error, and Failure
A fault produces a latent error; the error becomes effective when it is activated; a failure occurs when the observed behavior deviates from the ideal, specified behavior.
Example I: a programming mistake is a fault; the buggy code is the latent error; when the code runs, the error becomes effective; if the buggy code influences program output/behavior, a failure occurs.
Example II: an alpha particle strikes DRAM (a fault); if it flips a memory bit, it produces a latent error; when that value is read, the error becomes effective; if the program's output deviates as a result, a failure occurs.

11 Defining Reliability and Availability
A system toggles between:
 • Service accomplishment: service matches the specification
 • Service interruption: service deviates from the specification
The toggles are caused by failures and restorations.
Reliability measures continuous service accomplishment and is usually expressed as mean time to failure (MTTF).
Availability measures the fraction of time that service matches the specification, and is expressed as MTTF / (MTTF + MTTR), where MTTR is the mean time to repair. A numerical illustration follows.
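As a quick numerical illustration (the figures below are made up for this example, not taken from the slide):

    \[
    \text{Availability} = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}}
    = \frac{100{,}000 \text{ hours}}{100{,}000 \text{ hours} + 10 \text{ hours}} \approx 0.9999
    \]

that is, roughly "four nines" of availability, or on the order of 53 minutes of downtime per year.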

12 The Assembly Line
[Diagram: jobs plotted against time, unpipelined vs. pipelined]
Unpipelined: start and finish one job (all of its stages A, B, C) before moving on to the next job.
Pipelined: break each job into smaller stages A, B, and C; as soon as a job leaves stage A, the next job can enter it, so successive jobs overlap in time.

13 Performance Improvements? Does it take longer to finish each individual job? Does it take less time to finish a series of jobs? What assumptions were made while answering these questions? Is a 10-stage pipeline better than a 5-stage pipeline?

14 Quantitative Effects
As a result of pipelining:
 • The time (in ns) to execute an individual instruction goes up
 • Each instruction now takes several cycles to complete (but note the large increase in clock speed)
 • Total execution time goes down, resulting in lower average time per instruction
 • Average cycles per instruction increases slightly (for example, because of stalls)
 • Under ideal conditions, speedup = ratio of elapsed times between successive instruction completions = number of pipeline stages = increase in clock speed
Illustrative numbers are worked out below.
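To attach numbers to these bullets (the values are chosen for this illustration, not taken from the slides): suppose an unpipelined datapath completes one instruction every 10 ns and is split into 5 stages of 2 ns each, with 0.2 ns of pipeline-register (latch) overhead per stage:

    \[
    t_{\text{cycle}} = 2 + 0.2 = 2.2 \text{ ns}
    \qquad
    \text{per-instruction latency} = 5 \times 2.2 = 11 \text{ ns (up from 10 ns)}
    \]
    \[
    \text{steady-state throughput} = \text{one instruction every } 2.2 \text{ ns}
    \qquad
    \text{speedup} = \frac{10}{2.2} \approx 4.5 \;(\text{ideal} = 5 \text{ stages})
    \]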

15 A 5-Stage Pipeline

16 A 5-Stage Pipeline Use the PC to access the I-cache and increment PC by 4

17 A 5-Stage Pipeline Read registers, compare registers, compute the branch target; for now, assume branches take 2 cycles (there is enough work in this stage that branches could easily take more)

18 A 5-Stage Pipeline ALU computation, effective address computation for load/store

19 A 5-Stage Pipeline Memory access to/from data cache, stores finish in 4 cycles

20 A 5-Stage Pipeline Write result of ALU computation or load into register file
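The five slides above walk through the classic IF, ID, EX, MEM, and WB stages. The short C sketch below (a toy written for this transcript, not the course's simulator) prints which stage each of a few instructions occupies in each cycle, assuming no hazards or stalls:

    #include <stdio.h>

    #define NUM_STAGES 5
    #define NUM_INSTRS 4

    static const char *stage_names[NUM_STAGES] = { "IF", "ID", "EX", "MEM", "WB" };

    int main(void) {
        /* With no stalls, instruction i (0-based) occupies stage (cycle - i). */
        int total_cycles = NUM_INSTRS + NUM_STAGES - 1;

        printf("cycle");
        for (int i = 0; i < NUM_INSTRS; i++)
            printf("   I%d ", i + 1);
        printf("\n");

        for (int cycle = 0; cycle < total_cycles; cycle++) {
            printf("%5d", cycle + 1);
            for (int i = 0; i < NUM_INSTRS; i++) {
                int stage = cycle - i;              /* stage occupied this cycle */
                if (stage >= 0 && stage < NUM_STAGES)
                    printf(" %-5s", stage_names[stage]);
                else
                    printf("  -   ");
            }
            printf("\n");
        }
        return 0;
    }

With 4 instructions and 5 stages the loop runs for 8 cycles; once the pipeline fills, one instruction completes every cycle, which is exactly the throughput argument of the earlier slides.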
