Morgan Kaufmann Publishers Dr. Zhao Zhang Iowa State University


CprE 381 Computer Organization and Assembly Level Programming, Fall 2013
Midterm Review 2
Dr. Zhao Zhang, Iowa State University
Morgan Kaufmann Publishers, April 17, 2017

Announcement
- No quiz today
- No homework this Friday
- Exam on Monday, 9:00-9:50
- HW9 deadline extended to next Friday
- HW8 solutions will be posted today

Exam 2 Coverage
- Coverage: Ch. 4, The Processor
  - Datapath and control
  - Simple MIPS pipeline
  - Data hazards and forwarding
  - Load-use hazard and pipeline stall
  - Control hazards
- Arithmetic will NOT be covered
  - It will be covered in the final exam
  - The final exam is comprehensive

Question Styles and Coverage
- Short answer
- True/False or multiple choice
- Design and analysis
  - Signal values in the datapath and control
  - Identify critical path
  - Support a new MIPS instruction
- Performance analysis and optimization
  - Identify pipeline bubbles in program execution
  - Reorder instructions to improve performance
- And others

Nine-Instruction MIPS
- These nine instructions are enough to illustrate most aspects of CPU design, particularly datapath and control design
- Some questions will use this subset as the baseline design
- Memory reference: LW and SW
- Arithmetic/logic: ADD, SUB, AND, OR, SLT
- Branch: BEQ, J

Datapath With Jumps Added
(Figure: single-cycle datapath with jumps added. Chapter 4: The Processor)

The Control
Control signals for the nine-instruction implementation
- Columns: Inst, RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0, Jump
- Rows: R-, lw, sw, beq, j
- Note: “R-” means R-format
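As a sketch of how the main control unit behaves, the table can be written as a lookup keyed by instruction class. The values below assume the standard single-cycle settings from the Patterson & Hennessy textbook; the don't-care entries in the `j` row are an assumption, and the names `SIGNALS`, `CONTROL`, and `control_for` are illustrative only.

```python
# Column order follows the slide's table header.
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemRead",
           "MemWrite", "Branch", "ALUOp1", "ALUOp0", "Jump")

# One character per signal; "X" marks a don't-care value.
# Assumed values: standard textbook settings (the j row is an assumption).
CONTROL = {
    "R-format": "1001000100",
    "lw":       "0111100000",
    "sw":       "X1X0010000",
    "beq":      "X0X0001010",
    "j":        "XXX000XXX1",
}

def control_for(inst):
    """Return a {signal name: value} mapping for one instruction class."""
    return dict(zip(SIGNALS, CONTROL[inst]))

if __name__ == "__main__":
    for inst in CONTROL:
        print(inst, control_for(inst))
```

Keeping the table as data rather than nested conditionals mirrors the "extend the truth table" approach: supporting a new instruction means adding a row and, if needed, a column.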

ALU Control
Truth table for ALU Control
Extend it as a secondary control unit in projects B & C, with more control signal outputs

opcode  ALUOp  Operation         funct   ALU function      ALU control
lw      00     load word         XXXXXX  add               0010
sw      00     store word        XXXXXX  add               0010
beq     01     branch equal      XXXXXX  subtract          0110
R-type  10     add               100000  add               0010
R-type  10     subtract          100010  subtract          0110
R-type  10     AND               100100  AND               0000
R-type  10     OR                100101  OR                0001
R-type  10     set-on-less-than  101010  set-on-less-than  0111
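The ALU control truth table can be sketched as a small function. This is a minimal illustration that takes ALUOp and funct as bit strings; the function name `alu_control` and the string encoding are assumptions made for readability, not part of the course projects.

```python
# funct field -> 4-bit ALU control, used only when ALUOp is "10" (R-type)
R_TYPE_FUNCT = {
    "100000": "0010",  # add
    "100010": "0110",  # subtract
    "100100": "0000",  # AND
    "100101": "0001",  # OR
    "101010": "0111",  # set-on-less-than
}

def alu_control(aluop, funct="XXXXXX"):
    """Return the 4-bit ALU control code for a 2-bit ALUOp and 6-bit funct."""
    if aluop == "00":   # lw / sw: add to form the memory address
        return "0010"
    if aluop == "01":   # beq: subtract to compare the two registers
        return "0110"
    if aluop == "10":   # R-type: the funct field decides
        return R_TYPE_FUNCT[funct]
    raise ValueError("unsupported ALUOp: " + aluop)

if __name__ == "__main__":
    print(alu_control("00"))            # lw / sw
    print(alu_control("10", "101010"))  # slt
```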

Extend the Single-Cycle Processor
For each instruction, do we need
- Any new or revised datapath element(s)?
- Any new control signal(s)?
Then revise, if necessary,
- Datapath: Add new elements or revise existing ones, add new connections
- Control Unit: Add/extend control signals, extend the truth table
- ALU Control: Extend the truth table

Support JAL
    jal target
    PC = JumpAddr
    R[31] = PC_plus_4
    PC_plus_4 = PC+4
    JumpAddr = PC_plus_4[31:28] & Inst[25:0] & “00”    (& denotes bit concatenation)
Instruction format: opcode 000011 in bits 31:26, address in bits 25:0
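The register-transfer description of jal can be checked with a few lines of bit arithmetic. This is a sketch under the definitions on the slide; the function name `jal` and the choice to return the new PC together with the link value are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF

def jal(pc, inst):
    """Return (new PC, value written to R[31]) for a jal instruction word."""
    pc_plus_4 = (pc + 4) & MASK32
    upper = pc_plus_4 & 0xF0000000       # PC_plus_4[31:28]
    target = (inst & 0x03FFFFFF) << 2    # Inst[25:0] followed by "00"
    jump_addr = upper | target
    return jump_addr, pc_plus_4

if __name__ == "__main__":
    # opcode 000011 in bits 31:26, hypothetical address field 0x100000
    word = (0b000011 << 26) | 0x100000
    new_pc, ra = jal(0x00400010, word)
    print(hex(new_pc), hex(ra))
```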

Support JAL
What changes must be made to the datapath? (figure)

Support JAL
- Analyze the instruction execution
  - Writes register $ra ($31)
  - Updates PC with the jump target
    - This part is already done for supporting J
- Analyze the datapath
  - Needs another input, fixed at 31, to the “Write register” port of the register file
  - Needs another input, PC+4, to the “Write data” port of the register file
- Revise control
  - Add a “link” signal
  - The (main) control unit can generate it by reading the opcode

SCPv1 + JAL
- Revise the two muxes
  - Add another input
  - Extend the select signals
- Alternatively, use an extra mux

Control Signals
Control signals for the nine-instruction implementation
- Columns: Inst, RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0, Jump, Link
- Rows: R-, lw, sw, beq, j, jal
- Add a new row for jal
- Extend RegDst
- Add a control line, Link

Control Signals
Control signals for the nine-instruction implementation
- Columns: Inst, RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0, Jump, Link
- Rows: R-, lw, sw, beq, j, jal
- Extend control input to RegDst Mux: RegDst & Link
- Extend control input to MemtoReg Mux: MemtoReg & Link
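One way to picture the two extended muxes is as selection functions in which Link takes priority. This is a sketch only: the exact select encoding used in the course datapath is not given here, so the priority rule and the function names `write_register` and `write_data` are assumptions.

```python
def write_register(rt, rd, reg_dst, link):
    """Destination register number fed to the 'Write register' port."""
    if link:                 # jal always writes $ra
        return 31
    return rd if reg_dst else rt

def write_data(alu_result, mem_data, pc_plus_4, mem_to_reg, link):
    """Value fed to the 'Write data' port of the register file."""
    if link:                 # jal saves the return address
        return pc_plus_4
    return mem_data if mem_to_reg else alu_result

if __name__ == "__main__":
    print(write_register(rt=9, rd=10, reg_dst=1, link=0))   # R-format writes rd
    print(write_register(rt=9, rd=10, reg_dst=0, link=1))   # jal writes 31
```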

Simple Pipeline
Add pipeline registers to hold information produced in each cycle

Pipelined Control (figure)

Hazards
- Situations that prevent starting the next instruction safely in the next cycle
  - The simple pipeline won’t work correctly
- Structural hazard
  - A required resource is busy
- Data hazard
  - Need to wait for a previous instruction to complete its data read/write
- Control hazard
  - Deciding on control action depends on a previous instruction

Data Hazards
Program with data dependence
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
Program with control dependence
    beq  $1, $3, +4
    addi $2, $2, 1
    addi $4, $4, 1

Data Forwarding
    sub $2, $1, $3
    and $12, $2, $5     # MEM=>EX forwarding
    or  $13, $6, $2     # WB =>EX forwarding
    add $14, $2, $2
    sw  $15, 100($2)
Stage occupancy in three consecutive cycles:
    IF    ID    EX    MEM   WB
    or    and   sub   …     …
    add   or    and   sub   …       AND gets forwarded new $2 value (from MEM)
    sw    add   or    and   sub     OR gets forwarded new $2 value (from WB)

Data Forwarding Paths (figure)

Detecting the Need to Forward
- Input
  - rs and rt from EX
  - rd and RegWrite from MEM
  - rd and RegWrite from WB
- Output: FwdA, FwdB
- Caveats
  - Check RegWrite
  - Check if rd = 0 (never forward a write to $zero)
  - Forwarding from MEM wins over WB
- Review slides and textbook for details
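The inputs, outputs, and caveats above can be put together in a short sketch of the forwarding unit. It assumes the textbook select encoding (10 = forward from MEM, 01 = forward from WB, 00 = use the register file value); the function names are illustrative.

```python
def forward_select(src, mem_rd, mem_regwrite, wb_rd, wb_regwrite):
    """Select code for one ALU operand whose source register is src."""
    # The MEM check comes first so the most recent result wins over WB.
    if mem_regwrite and mem_rd != 0 and mem_rd == src:
        return 0b10
    if wb_regwrite and wb_rd != 0 and wb_rd == src:
        return 0b01
    return 0b00

def forwarding_unit(ex_rs, ex_rt, mem_rd, mem_regwrite, wb_rd, wb_regwrite):
    """Return (FwdA, FwdB) for the instruction currently in EX."""
    fwd_a = forward_select(ex_rs, mem_rd, mem_regwrite, wb_rd, wb_regwrite)
    fwd_b = forward_select(ex_rt, mem_rd, mem_regwrite, wb_rd, wb_regwrite)
    return fwd_a, fwd_b

if __name__ == "__main__":
    # and $12,$2,$5 in EX while sub $2,$1,$3 is in MEM
    print(forwarding_unit(2, 5, 2, True, 0, False))
    # or $13,$6,$2 in EX, and $12,... in MEM, sub $2,... in WB
    print(forwarding_unit(6, 2, 12, True, 2, True))
```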

Load-Use Data Hazard
    lw  $s0, 20($t1)
    sub $t2, $s0, $t3
- Can’t always avoid stalls by forwarding
- Must stall pipeline by one cycle

Datapath with Hazard Detection (figure)

Hazard Detection Unit
- Input
  - rs and rt from ID
  - rt and MemRead from EX
- Output
  - PCWrite, IF/IDWrite (0 for holding instructions)
  - Select signal to a MUX to insert bubble in EX
- Read slides/textbook for details
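A minimal sketch of the load-use check, using the inputs and outputs listed above. The dictionary keys (in particular `BubbleSelect` for the mux select signal) and the function name are hypothetical labels chosen for this example.

```python
def hazard_detection(id_rs, id_rt, ex_rt, ex_mem_read):
    """Detect a load-use hazard between the load in EX and the instruction in ID."""
    # Stall when EX holds a load whose destination (rt) is a source in ID.
    stall = bool(ex_mem_read) and ex_rt in (id_rs, id_rt)
    return {
        "PCWrite": 0 if stall else 1,       # 0 holds the PC
        "IFIDWrite": 0 if stall else 1,     # 0 holds the IF/ID register
        "BubbleSelect": 1 if stall else 0,  # 1 zeroes the control signals entering EX
    }

if __name__ == "__main__":
    # lw $s0, 20($t1) in EX (rt = 16); sub $t2, $s0, $t3 in ID (rs = 16, rt = 11)
    print(hazard_detection(id_rs=16, id_rt=11, ex_rt=16, ex_mem_read=True))
```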

Pipeline Stall
- The nop has all control signals set to zero
  - It does nothing at EX, MEM and WB
- Prevent update of PC and IF/ID register
  - Using instruction is decoded again (OK)
  - Following instruction is fetched again (OK)
  - 1-cycle stall allows MEM to read data for lw
  - Can subsequently forward from WB to EX

Code Scheduling to Avoid Stalls
Reorder code to avoid use of load result in the next instruction
C code for A = B + E; C = B + F;

Original order (13 cycles):
    lw  $t1, 0($t0)
    lw  $t2, 4($t0)
    (stall)
    add $t3, $t1, $t2
    sw  $t3, 12($t0)
    lw  $t4, 8($t0)
    (stall)
    add $t5, $t1, $t4
    sw  $t5, 16($t0)

Reordered (11 cycles):
    lw  $t1, 0($t0)
    lw  $t2, 4($t0)
    lw  $t4, 8($t0)
    add $t3, $t1, $t2
    sw  $t3, 12($t0)
    add $t5, $t1, $t4
    sw  $t5, 16($t0)
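The 13-cycle and 11-cycle figures can be reproduced with a small counter. This sketch assumes a five-stage pipeline with full forwarding, so the only stalls are one-cycle load-use stalls; the tuple layout `(op, dest, sources)` and the function name `count_cycles` are illustrative.

```python
def count_cycles(program, stages=5):
    """Cycles to run program: one per instruction, plus pipeline fill, plus load-use stalls."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_srcs = cur
        # A load followed immediately by a user of its result costs one bubble.
        if prev_op == "lw" and prev_dest in cur_srcs:
            stalls += 1
    return len(program) + (stages - 1) + stalls

original = [
    ("lw", "$t1", ("$t0",)),
    ("lw", "$t2", ("$t0",)),
    ("add", "$t3", ("$t1", "$t2")),
    ("sw", None, ("$t3", "$t0")),
    ("lw", "$t4", ("$t0",)),
    ("add", "$t5", ("$t1", "$t4")),
    ("sw", None, ("$t5", "$t0")),
]
# Same instructions with the third load hoisted above the first add.
reordered = [original[0], original[1], original[4],
             original[2], original[3], original[5], original[6]]

if __name__ == "__main__":
    print(count_cycles(original), count_cycles(reordered))
```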

Control Hazards
- Branch determines flow of control
- Two branch outcomes: Taken or Not-Taken
- The CPU doesn’t recognize a branch until it reaches the end of the ID stage
- Every cycle, the CPU has to fetch one instruction

Control Hazards
- The MIPS pipeline in the textbook always predicts “not-taken”
  - Pipeline flush on every taken branch
  - OK to flush because mis-fetched instructions don’t write to register/memory
  - But this incurs pipeline bubbles (performance penalty)
- The revised MIPS pipeline moves branch comparison to the ID stage
  - Doable for BEQ and BNE
  - Reduces pipeline bubbles from 3 to 1 per taken branch
  - Complicates data forwarding and hazard detection
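To see what the reduction from 3 bubbles to 1 means in cycles, here is a small worked calculation. The instruction and branch counts are hypothetical numbers chosen for illustration, data-hazard stalls are ignored, and `total_cycles` is an illustrative name.

```python
def total_cycles(instructions, taken_branches, bubbles_per_taken, stages=5):
    """Cycle count for a predict-not-taken pipeline, ignoring data-hazard stalls."""
    fill = stages - 1   # cycles needed to fill the pipeline
    return instructions + fill + taken_branches * bubbles_per_taken

# Hypothetical workload: 1000 instructions, 100 of them taken branches.
late = total_cycles(1000, 100, 3)    # branch outcome known late: 3 bubbles each
early = total_cycles(1000, 100, 1)   # comparison moved to ID: 1 bubble each

if __name__ == "__main__":
    print(late, early, late - early)
```

Under these assumed numbers the early comparison saves two cycles per taken branch, which is the whole difference between the two totals.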

Revised MIPS Pipeline (figure)

Revised MIPS Pipeline
Note: Branch does nothing in EX, MEM and WB

Performance Penalty
Any pipeline bubbles?

First sequence:
    loop: addi $1, $1, -1
          add  $4, $5, $6
          beq  $1, $zero, loop

Second sequence:
    lw  $1, addr
    add $4, $5, $6
    beq $1, $4, target

Delayed Branch
- Delayed branch may remove the one-cycle stall
  - The instruction right after the beq is executed whether or not the branch is taken (the sub instruction in the example)
  - Put another way, the execution of beq is delayed by one cycle

Before:
    sub $10, $4, $8
    beq $1, $3, 7
    and $12, $2, $5
After (sub moved into the delay slot):
    beq $1, $3, 7
    sub $10, $4, $8
    and $12, $2, $5

- Must find an independent instruction; otherwise
  - May have to fill in a nop instruction, or
  - Need two variants of beq, delayed and not delayed
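The delay-slot rule can be sketched as a tiny scheduling step: move the instruction before the branch into the slot if the branch does not read its result, otherwise pad with a nop. This is a simplified illustration that only considers the instruction immediately before the branch; the tuple layout `(text, dest, sources)` and the name `fill_delay_slot` are assumptions.

```python
NOP = ("nop", None, ())

def fill_delay_slot(code):
    """code ends with [..., candidate, branch]; return the code with the delay slot filled."""
    *head, candidate, branch = code
    _, cand_dest, _ = candidate
    _, _, branch_srcs = branch
    # The candidate is independent if the branch does not read what it writes.
    if cand_dest is None or cand_dest not in branch_srcs:
        return head + [branch, candidate]
    return head + [candidate, branch, NOP]

sub = ("sub $10, $4, $8", "$10", ("$4", "$8"))
beq = ("beq $1, $3, 7", None, ("$1", "$3"))

if __name__ == "__main__":
    for text, _, _ in fill_delay_slot([sub, beq]):
        print(text)
```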

Other Topics
- Exception handling
- Multi-issue pipeline
These topics will be covered in the final exam; Exam 2 will NOT cover them.