Lecture 16: Basic Pipelining

Slides:



Advertisements
Similar presentations
Lecture 4: CPU Performance
Advertisements

1 Lecture: Pipelining Hazards Topics: Basic pipelining implementation, hazards, bypassing HW2 posted, due Wednesday.
Lecture Objectives: 1)Define pipelining 2)Calculate the speedup achieved by pipelining for a given number of instructions. 3)Define how pipelining improves.
Review: Pipelining. Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer.
Pipelining I (1) Fall 2005 Lecture 18: Pipelining I.
Lecture: Pipelining Basics
1 Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations.
1 Lecture: Pipeline Wrap-Up and Static ILP Topics: multi-cycle instructions, precise exceptions, deep pipelines, compiler scheduling, loop unrolling, software.
S. Barua – CPSC 440 CHAPTER 6 ENHANCING PERFORMANCE WITH PIPELINING This chapter presents pipelining.
1 Lecture 17: Basic Pipelining Today’s topics:  5-stage pipeline  Hazards and instruction scheduling Mid-term exam stats:  Highest: 90, Mean: 58.
1 Lecture 3: Pipelining Basics Biggest contributors to performance: clock speed, parallelism Today: basic pipelining implementation (Sections A.1-A.3)
Mary Jane Irwin ( ) [Adapted from Computer Organization and Design,
1 Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)
1 Lecture 7: Out-of-Order Processors Today: out-of-order pipeline, memory disambiguation, basic branch prediction (Sections 3.4, 3.5, 3.7)
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
1 Lecture 2: System Metrics and Pipelining Today’s topics: (Sections 1.6, 1.7, 1.9, A.1)  Quantitative principles of computer design  Measuring cost.
1 Lecture 18: Pipelining Today’s topics:  Hazards and instruction scheduling  Branch prediction  Out-of-order execution Reminder:  Assignment 7 will.
Lecture 16: Basic CPU Design
1 Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)
1 Lecture 14: FSM and Basic CPU Design Today’s topics:  Finite state machines  Single-cycle CPU Reminder: midterm on Tue 10/24  will cover Chapters.
Appendix A Pipelining: Basic and Intermediate Concepts
Lecture: Pipelining Basics
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Lecture 17 - Pipelined.
Lecture 24: CPU Design Today’s topic –Multi-Cycle ALU –Introduction to Pipelining 1.
Chapter 2 Summary Classification of architectures Features that are relatively independent of instruction sets “Different” Processors –DSP and media processors.
1 COMP541 Multicycle MIPS Montek Singh Apr 4, 2012.
Comp Sci pipelining 1 Ch. 13 Pipelining. Comp Sci pipelining 2 Pipelining.
Pipeline Hazards. CS5513 Fall Pipeline Hazards Situations that prevent the next instructions in the instruction stream from executing during its.
CSE 340 Computer Architecture Summer 2014 Basic MIPS Pipelining Review.
CS.305 Computer Architecture Enhancing Performance with Pipelining Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
CMPE 421 Parallel Computer Architecture
1 Designing a Pipelined Processor In this Chapter, we will study 1. Pipelined datapath 2. Pipelined control 3. Data Hazards 4. Forwarding 5. Branch Hazards.
1. Building A CPU  We’ve built a small ALU l Add, Subtract, SLT, And, Or l Could figure out Multiply and Divide  What about the rest l How do.
COMP541 Multicycle MIPS Montek Singh Mar 25, 2010.
CSIE30300 Computer Architecture Unit 04: Basic MIPS Pipelining Hsin-Chou Chi [Adapted from material by and
Oct. 18, 2000Machine Organization1 Machine Organization (CS 570) Lecture 4: Pipelining * Jeremy R. Johnson Wed. Oct. 18, 2000 *This lecture was derived.
Instructor: Senior Lecturer SOE Dan Garcia CS 61C: Great Ideas in Computer Architecture Pipelining Hazards 1.
LECTURE 7 Pipelining. DATAPATH AND CONTROL We started with the single-cycle implementation, in which a single instruction is executed over a single cycle.
11 Pipelining Kosarev Nikolay MIPT Oct, Pipelining Implementation technique whereby multiple instructions are overlapped in execution Each pipeline.
CSE431 L06 Basic MIPS Pipelining.1Irwin, PSU, 2005 MIPS Pipeline Datapath Modifications  What do we need to add/modify in our MIPS datapath? l State registers.
1 Lecture 3: Pipelining Basics Today: chapter 1 wrap-up, basic pipelining implementation (Sections C.1 - C.4) Reminders:  Sign up for the class mailing.
1 Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations.
1 Lecture 20: OOO, Memory Hierarchy Today’s topics:  Out-of-order execution  Cache basics.
Lecture 18: Pipelining I.
Stalling delays the entire pipeline
Lecture 15: Basic CPU Design
Lecture 16: Basic Pipelining
Lecture: Pipelining Basics
Lecture 17: Pipelining Today’s topics: 5-stage pipeline Hazards
ECE232: Hardware Organization and Design
Pipelining Lessons 6 PM T a s k O r d e B C D A 30
Lecture: Pipelining Basics
Lecture 16: Basic Pipelining
CS 704 Advanced Computer Architecture
Lecture: Pipelining Hazards
Lecture: Pipelining Hazards
Lecture 5: Pipelining Basics
Lecture 18: Pipelining Today’s topics:
Lecture: Pipelining Hazards
Pipelining Lessons 6 PM T a s k O r d e B C D A 30
Lecture 17: Pipelining Today’s topics: 5-stage pipeline Hazards.
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
Lecture: Pipelining Extensions
Lecture 20: OOO, Memory Hierarchy
Lecture 18: Pipelining Today’s topics:
Lecture 17: Pipelining Today’s topics: 5-stage pipeline Hazards.
Lecture 4: Advanced Pipelines
Lecture: Pipelining Hazards
Lecture: Pipelining Basics
Presentation transcript:

Lecture 16: Basic Pipelining Today’s topics: 1-stage design 5-stage design 5-stage pipeline Hazards Mid-term exam stats:

View from 5,000 Feet Source: H&P textbook

Latches and Clocks in a Single-Cycle Design PC Instr Mem Reg File ALU Addr Data Memory The entire instruction executes in a single cycle Green blocks are latches At the rising edge, a new PC is recorded At the rising edge, the result of the previous cycle is recorded At the falling edge, the address of LW/SW is recorded so we can access the data memory in the 2nd half of the cycle

Multi-Stage Circuit Instead of executing the entire instruction in a single cycle (a single stage), let’s break up the execution into multiple stages, each separated by a latch PC Instr Mem L2 Reg File L3 ALU L4 Data Memory L5 Reg File

The Assembly Line Unpipelined Pipelined Start and finish a job before moving to the next Jobs Time A B C Break the job into smaller stages A B C A B C A B C Pipelined

Performance Improvements? Does it take longer to finish each individual job? Does it take shorter to finish a series of jobs? What assumptions were made while answering these questions? Is a 10-stage pipeline better than a 5-stage pipeline?

Quantitative Effects As a result of pipelining: Time in ns per instruction goes up Each instruction takes more cycles to execute But… average CPI remains roughly the same Clock speed goes up Total execution time goes down, resulting in lower average time per instruction Under ideal conditions, speedup = ratio of elapsed times between successive instruction completions = number of pipeline stages = increase in clock speed

A 5-Stage Pipeline Source: H&P textbook

A 5-Stage Pipeline Use the PC to access the I-cache and increment PC by 4

A 5-Stage Pipeline Read registers, compare registers, compute branch target; for now, assume branches take 2 cyc (there is enough work that branches can easily take more)

A 5-Stage Pipeline ALU computation, effective address computation for load/store

A 5-Stage Pipeline Memory access to/from data cache, stores finish in 4 cycles

A 5-Stage Pipeline Write result of ALU computation or load into register file

Pipeline Summary RR ALU DM RW ADD R1, R2,  R3 Rd R1,R2 R1+R2 -- Wr R3 BEQ R1, R2, 100 Rd R1, R2 -- -- -- Compare, Set PC LD 8[R3]  R6 Rd R3 R3+8 Get data Wr R6 ST 8[R3]  R6 Rd R3,R6 R3+8 Wr data --

Conflicts/Problems I-cache and D-cache are accessed in the same cycle – it helps to implement them separately Registers are read and written in the same cycle – easy to deal with if register read/write time equals cycle time/2 Branch target changes only at the end of the second stage -- what do you do in the meantime?

Hazards Structural hazards: different instructions in different stages (or the same stage) conflicting for the same resource Data hazards: an instruction cannot continue because it needs a value that has not yet been generated by an earlier instruction Control hazard: fetch cannot continue because it does not know the outcome of an earlier branch – special case of a data hazard – separate category because they are treated in different ways

Structural Hazards Example: a unified instruction and data cache  stage 4 (MEM) and stage 1 (IF) can never coincide The later instruction and all its successors are delayed until a cycle is found when the resource is free  these are pipeline bubbles Structural hazards are easy to eliminate – increase the number of resources (for example, implement a separate instruction and data cache)

Example 1 add R1, R2, R3 lw R4, 8(R1) Source: H&P textbook

Example 2 lw R1, 8(R2) lw R4, 8(R1) Source: H&P textbook

Example 3 lw R1, 8(R2) sw R1, 8(R3) Source: H&P textbook

Title Bullet