EE/CS/CPE 3760 - Computer Organization, Seattle Pacific University (Ch6a)

Ch6a-2: Automobile Manufacturing (6.1)
1. Build frame: 60 min.
2. Add engine: 50 min.
3. Build body: 80 min.
4. Paint: 40 min.
5. Finish: 80 min.
Total: 310 min.
Latency: time from start to finish for one car (smaller is better). Here, 310 minutes per car.
Throughput: number of finished cars per time unit (larger is better). Here, 1 car/310 min = 0.19 cars/hour.
Issue: How can we make the process better by adding more workers?
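The latency and throughput figures on this slide can be checked with a few lines of Python. This is only a sketch: the stage times come from the slide, while the function names are made up for illustration.

```python
# Stage times in minutes, taken from the slide.
STAGE_MINUTES = {
    "Build frame": 60,
    "Add engine": 50,
    "Build body": 80,
    "Paint": 40,
    "Finish": 80,
}


def latency_minutes(stages):
    """Time from start to finish for one car when the stages run back to back."""
    return sum(stages.values())


def throughput_cars_per_hour(stages):
    """Finished cars per hour when only one car is in the shop at a time."""
    return 60 / latency_minutes(stages)


if __name__ == "__main__":
    print(latency_minutes(STAGE_MINUTES))                     # 310
    print(round(throughput_cars_per_hour(STAGE_MINUTES), 2))  # 0.19
```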

Ch6a-3: An Assembly Line
Every stage gets an 80-minute slot, the time of the longest stage. Short stages can't produce faster than one car per 80 min, or a backlog will occur at the longer stages.
Latency: 5 stages x 80 min = 400 min/car.
Throughput: 4 cars/640 min (1 car/160 min) for the first four cars. It will approach 1 car/80 min as time goes on.
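The assembly-line numbers follow from a simple timing model, sketched below. It assumes every stage is stretched to the slowest stage's time and that a new car starts every period. The function names are hypothetical.

```python
# Stage times in minutes, taken from the automobile example.
STAGE_MINUTES = [60, 50, 80, 40, 80]


def line_period(stages):
    """The line advances at the pace of its slowest stage."""
    return max(stages)


def finish_time(car_index, stages):
    """Minute at which car number car_index (0-based) leaves the line."""
    return (len(stages) + car_index) * line_period(stages)


def average_minutes_per_car(n_cars, stages):
    """Total elapsed time divided by the number of finished cars."""
    return finish_time(n_cars - 1, stages) / n_cars


if __name__ == "__main__":
    print(finish_time(0, STAGE_MINUTES))                 # 400 (latency of one car)
    print(finish_time(3, STAGE_MINUTES))                 # 640 (fourth car done)
    print(average_minutes_per_car(4, STAGE_MINUTES))     # 160.0
    print(average_minutes_per_car(1000, STAGE_MINUTES))  # 80.32
```

As the number of cars grows, the average time per car approaches the 80-minute period, which is the "will approach 1 car/80 min" remark on the slide.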

Ch6a-4: Applying Assembly Lines to CPUs (6.1)
The single-cycle design did everything "at once". Can we break the single-cycle design up into stages? Use the multi-cycle design to help us decide what can go together.
Issues: Why not base the design on multi-cycle? Car assembly works well. Will it be as easy to apply the same technique to a CPU?

Ch6a-5: Breaking up the Single-Cycle Datapath (6.2)
[Figure: the single-cycle datapath (PC, instruction memory, registers, sign extend, shift left, adders, data memory) divided into stages.]
Stages from the multi-cycle design: Instruction Fetch (PC = PC + 4); Instruction Decode / Register Fetch; Execute / Address Calculation; Memory; Register Write-back.

Ch6a-6: The Key - Pipeline Registers (6.2)
[Figure: the same five-stage datapath with pipeline registers between the stages; labels: clock, PC+4.]
If only one instruction is processed at a time, this is similar to multi-cycle.
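One way to picture what the pipeline registers do is a list that shifts by one position on every clock edge. The sketch below is a toy model, not the slide's hardware: it tracks only which instruction sits in which stage, it ignores hazards, and the stage abbreviations and sample program are assumptions.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def run_pipeline(program):
    """Return, for each clock cycle, the instruction held in each stage."""
    in_flight = [None] * len(STAGES)  # in_flight[i] is the instruction in STAGES[i]
    pending = list(program)
    history = []
    while pending or any(instr is not None for instr in in_flight[:-1]):
        # Clock edge: every instruction moves one stage to the right,
        # and a new one (if any is left) is fetched into IF.
        fetched = pending.pop(0) if pending else None
        in_flight = [fetched] + in_flight[:-1]
        history.append(dict(zip(STAGES, in_flight)))
    return history


if __name__ == "__main__":
    # Illustrative three-instruction program.
    for cycle, stages in enumerate(run_pipeline(["lw", "add", "sub"]), start=1):
        print(cycle, stages)
```

With a one-instruction program the model behaves like the multi-cycle design: the single instruction simply visits the five stages in five cycles.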

Ch6a-7: Example: ADD Instruction (6.2)
[Figure: ADD $Rd, $Rs, $Rt moving through the pipelined datapath; PC+4 is labeled.]
A new instruction enters the IF stage each cycle. The ADD writes the correct data to the wrong register: by the time its result reaches write-back, the write register number still comes from the instruction field of a later instruction. In general, arrows that go backwards across pipeline stages may be bad news...

Ch6a-8: Correcting the Write Register Problem (6.2)
[Figure: the pipelined datapath with the Rt [20-16] and Rd [15-11] fields passed along through the pipeline registers (next to PC+4), so that the write register number reaches write-back together with its data.]
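The write register problem and its fix can be illustrated with a simplified model. It assumes an ideal five-stage pipeline with no stalls, so that when instruction i is in write-back, instruction i + 3 is in decode. The function name and register names are hypothetical.

```python
ID_STAGE, WB_STAGE = 1, 4  # 0-based stage positions: IF, ID, EX, MEM, WB


def written_registers(dests, carry_dest_in_pipeline):
    """dests[i] is the intended destination register of instruction i.

    Returns the register that each instruction actually writes at WB.
    """
    written = []
    for i, dest in enumerate(dests):
        if carry_dest_in_pipeline:
            # Fixed design: the register number travels with the instruction.
            written.append(dest)
        else:
            # Broken design: the register number is read from whatever
            # instruction currently sits in ID, three instructions later.
            later = i + (WB_STAGE - ID_STAGE)
            written.append(dests[later] if later < len(dests) else None)
    return written


if __name__ == "__main__":
    dests = ["$t0", "$t1", "$t2", "$t3", "$t4"]
    print(written_registers(dests, carry_dest_in_pipeline=False))
    print(written_registers(dests, carry_dest_in_pipeline=True))
```

In the broken case the first instruction's result lands in `$t3`. `None` marks cycles where no instruction is in decode, so the register number is undefined in this model.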

Ch6a-9: Assembly-line Control Signals (6.1)
In an assembly line, the manufacturing instructions can be attached to the car. The instructions then move along with the car.
[Figure: cars on the line carrying instruction tags, e.g. "F: Standard, E: 135 HP, B: 2-door, P: Green" and "F: Leather, E: 190 HP, B: 4-door, P: Blue"; cars farther along carry shorter tags, e.g. "F: Leather, P: Green" and "F: Vinyl".]
By separating the control signals by stage, only the signals needed for the current stage must be decoded. All signals for later stages must be passed along.

Ch6a-10: The Pipelined Control Logic (6.3)
[Figure: the pipelined datapath with a Control unit decoding Op [31-26] and an ALU control block driven by ALUOp. Control signals shown: RegDest, ALUSrc, ALUOp, Branch, PCSrc, MemRead, MemWrite, MemToReg, RegWrite. The signals are grouped as E, M, and W and passed along the pipeline registers: W, M, E after decode, then W, M, then W.]
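The idea of decoding once and passing the remaining signal groups along can be sketched as a lookup table. The assignment of signals to the E, M, and W groups below follows the usual textbook MIPS pipeline and is an assumption, since the slide names the groups but not their contents. Only R-type and lw are included, and the pipeline register names (ID/EX, EX/MEM, MEM/WB) and function name are likewise assumed.

```python
# Control values per opcode, split into the groups used in each later stage.
# Group membership is assumed from the standard textbook MIPS pipeline.
CONTROL_TABLE = {
    0x00: {  # R-type
        "E": {"RegDest": 1, "ALUOp": 0b10, "ALUSrc": 0},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "W": {"RegWrite": 1, "MemToReg": 0},
    },
    0x23: {  # lw
        "E": {"RegDest": 0, "ALUOp": 0b00, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "W": {"RegWrite": 1, "MemToReg": 1},
    },
}

# Each pipeline register keeps only the groups that later stages still need.
PIPELINE_REGISTER_GROUPS = {
    "ID/EX": ("E", "M", "W"),
    "EX/MEM": ("M", "W"),
    "MEM/WB": ("W",),
}


def control_in_register(opcode, register_name):
    """Control signals still carried for an instruction in a pipeline register."""
    decoded = CONTROL_TABLE[opcode]
    return {group: decoded[group] for group in PIPELINE_REGISTER_GROUPS[register_name]}


if __name__ == "__main__":
    for name in PIPELINE_REGISTER_GROUPS:
        print(name, control_in_register(0x23, name))
```

PCSrc is left out of the table because, in the usual design, it is not decoded from the opcode alone: it combines Branch with the ALU's Zero output.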

Ch6a-11: How'd we do?
Compared to single-cycle:
5 stages --> potentially a 5x speedup. Not likely: stages won't all be equally long, and pipeline registers will cause some delays.
Latency --> greater than in the single-cycle design.
More complexity, but nicely divided up.
Compared to multi-cycle:
Smaller speedup, since some multi-cycle instructions are shorter.
The design may be simpler (but wait…)
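A rough calculation shows why the speedup falls short of 5x and why latency grows. The stage delays and the pipeline-register overhead below are hypothetical numbers chosen for illustration. They are not from the slides.

```python
# Hypothetical stage delays in picoseconds (illustrative, not from the slides).
STAGE_DELAY_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}
REGISTER_OVERHEAD_PS = 20  # hypothetical delay added by each pipeline register


def single_cycle_clock_ps(stages):
    """One long clock cycle that covers every stage."""
    return sum(stages.values())


def pipelined_clock_ps(stages, overhead):
    """The clock must fit the slowest stage plus the pipeline register delay."""
    return max(stages.values()) + overhead


def pipelined_latency_ps(stages, overhead):
    """One instruction still passes through every stage, one clock each."""
    return len(stages) * pipelined_clock_ps(stages, overhead)


def steady_state_speedup(stages, overhead):
    """Throughput ratio once the pipeline is full (one instruction per clock)."""
    return single_cycle_clock_ps(stages) / pipelined_clock_ps(stages, overhead)


if __name__ == "__main__":
    print(single_cycle_clock_ps(STAGE_DELAY_PS))                       # 800
    print(pipelined_clock_ps(STAGE_DELAY_PS, REGISTER_OVERHEAD_PS))    # 220
    print(pipelined_latency_ps(STAGE_DELAY_PS, REGISTER_OVERHEAD_PS))  # 1100
    print(round(steady_state_speedup(STAGE_DELAY_PS, REGISTER_OVERHEAD_PS), 2))  # 3.64
```

With these assumed numbers the pipelined design runs about 3.6x faster in steady state rather than 5x, and a single instruction takes 1100 ps instead of 800 ps.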