CSL718 : Superscalar Processors


CSL718 : Superscalar Processors
Renaming and Reordering
30th Jan, 2006
Anshul Kumar, CSE IITD

Why Renaming and Reordering?
- Register Renaming
  - Removes false dependencies (WAR and WAW)
- Reordering Buffer (ROB)
  - Ensures sequential consistency of interrupts (precise vs imprecise interrupts)
  - Facilitates speculative execution

RAW, WAR and WAW (in Static Pipeline)
[Figure: pairs of instructions in a static pipeline with stages IF, D, RF, EX, WB, one pair each for the RAW, WAR and WAW cases; in the WAW pair one instruction spends three cycles in EX.]

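RAW, WAR and WAW (in Superscalar)
[Figure: a write, read, write sequence on the same register through stages IF, IS, DP, EX, WB; RAW links the first write to the read, WAR links the read to the second write, WAW links the two writes.]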
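
As a hedged illustration of the three dependence types named on these slides, the sketch below classifies the hazards between an earlier and a later instruction from their destination and source registers. The `Instr` record, the `hazards` function and the register names other than R5 are hypothetical, introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one destination, several sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()


def hazards(earlier: Instr, later: Instr) -> Set[str]:
    """Return the dependences of `later` on `earlier` (program order)."""
    found = set()
    if earlier.dest is not None and earlier.dest in later.srcs:
        found.add("RAW")   # later reads what earlier writes (true dependence)
    if later.dest is not None and later.dest in earlier.srcs:
        found.add("WAR")   # later overwrites what earlier still has to read
    if earlier.dest is not None and earlier.dest == later.dest:
        found.add("WAW")   # both write the same register
    return found


write1 = Instr(dest="R5", srcs=("R1", "R2"))
read = Instr(dest="R6", srcs=("R5",))
write2 = Instr(dest="R5", srcs=("R3", "R4"))

print(hazards(write1, read))    # {'RAW'}
print(hazards(read, write2))    # {'WAR'}
print(hazards(write1, write2))  # {'WAW'}
```

Only RAW reflects a real flow of data; WAR and WAW arise because the same register name is reused.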

Implementation using scoreboard bit
[Figure: the write, read, write sequence through stages IF, IS, DP, EX, WB with the RAW, WAR and WAW dependences marked; the scoreboard bit update b ← 0 is annotated.]
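
The slide only shows where the bit b is updated, so the following is a minimal sketch under assumed semantics: one bit per register, set to 1 when an instruction that writes the register is issued and reset to 0 at write-back, with later readers and writers of that register waiting while the bit is set. The `Scoreboard` class and its method names are hypothetical.

```python
from typing import Sequence


class Scoreboard:
    """Hypothetical one-bit-per-register scoreboard (assumed semantics)."""

    def __init__(self, num_regs: int) -> None:
        self.busy = [0] * num_regs  # b = 1 while a write to the register is pending

    def can_issue(self, dest: int, srcs: Sequence[int]) -> bool:
        # Sources must not have a pending write (RAW); the destination must
        # not have one either (WAW).
        return all(self.busy[r] == 0 for r in srcs) and self.busy[dest] == 0

    def issue(self, dest: int, srcs: Sequence[int]) -> bool:
        if not self.can_issue(dest, srcs):
            return False        # instruction stalls
        self.busy[dest] = 1     # b <- 1: result not yet available
        return True

    def write_back(self, dest: int) -> None:
        self.busy[dest] = 0     # b <- 0: register holds the new value


sb = Scoreboard(num_regs=8)
print(sb.issue(dest=5, srcs=[1, 2]))  # True: first write to R5 issues
print(sb.issue(dest=6, srcs=[5]))     # False: read of R5 waits (RAW)
print(sb.issue(dest=5, srcs=[3, 4]))  # False: second write to R5 waits (WAW)
sb.write_back(5)
print(sb.issue(dest=6, srcs=[5]))     # True: R5 is available again
```

The bit does not track WAR; this sketch assumes instructions issue in program order and read their operands at issue, so a later write cannot overtake an earlier read.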

CDC 6600 like Implementation
[Figure: the write, read, write sequence through stages IF, IS, DP, EX, WB with the RAW, WAR and WAW dependences marked; scoreboard bit updates b ← 0, b ← 1 and b ← 0 are annotated.]

IBM 360 like Implementation
[Figure: the write, read, write sequence through stages IF, IS, DP, EX, WB with the RAW, WAR and WAW dependences marked; the scoreboard bit update b ← 0 is annotated.]

Use of Renaming
[Figure: the write, read, write sequence through stages IF, IS, DP, EX, WB with the RAW, WAR and WAW dependences marked.]

Register renaming
  write R5
    RAW
  read R5
    WAR
  write R5
    RAW
  read R5
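
To illustrate how renaming could treat this four-operation sequence on R5, here is a minimal sketch assuming every write is given a fresh physical register from a free list and every read uses the most recent mapping. The `rename` function, the `P0`, `P1`, ... names and the registers other than R5 are hypothetical; freeing of physical registers is not modelled.

```python
from typing import Dict, List, Optional, Tuple

# (destination or None, sources), in program order.
Instr = Tuple[Optional[str], Tuple[str, ...]]


def rename(program: List[Instr], num_physical: int) -> List[Instr]:
    """Rename architectural registers to physical ones (assumed scheme)."""
    free = [f"P{i}" for i in range(num_physical)]  # free physical registers
    mapping: Dict[str, str] = {}                    # architectural -> physical
    renamed = []
    for dest, srcs in program:
        # Sources are looked up first, so they see the previous writer.
        # Unmapped sources keep their architectural name.
        new_srcs = tuple(mapping.get(s, s) for s in srcs)
        new_dest = None
        if dest is not None:
            new_dest = free.pop(0)   # every write gets a fresh register
            mapping[dest] = new_dest
        renamed.append((new_dest, new_srcs))
    return renamed


program = [
    ("R5", ("R1",)),   # write R5
    ("R2", ("R5",)),   # read R5   (RAW on the first write)
    ("R5", ("R3",)),   # write R5  (WAR with the read, WAW with the first write)
    ("R4", ("R5",)),   # read R5   (RAW on the second write)
]
for instr in rename(program, num_physical=8):
    print(instr)
# ('P0', ('R1',))
# ('P1', ('P0',))
# ('P2', ('R3',))
# ('P3', ('P2',))
```

After renaming, the second write targets P2 instead of the register the first read uses (P0), so the WAR and WAW dependences disappear while both RAW dependences remain.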

Who does renaming?
- Compiler
  - Done statically
  - Limited by registers visible to compiler
- Hardware
  - Done dynamically
  - Limited by registers available to hardware

Types of renaming buffers
- Separate renaming register file and architectural register file
- Combined renaming and architectural register file
- Renaming combined with reordering
- Renaming combined with shelving and reordering

How does renaming work? (in context of combined reg file)
[Figure: the register address from the instruction passes through a mapping to select a register in the physical register file (larger than architectural register file).]

Types of mapping
- Indexed
  - Inexpensive
  - Two steps required: look up index, read value
- Associative
  - Expensive
  - Single step associative access

Renaming with indexed access
[Figure: the register number selects an entry of the mapping table (fields: entry valid, index); the index selects an entry of the physical register file (fields: value, value valid).]
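
The two-step indexed access can be sketched as below, using the field names from the slide (entry valid, index, value, value valid). The `IndexedRenamer` class, its methods and the naive allocator without recycling are assumptions made for illustration, not the design of any specific processor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MapEntry:
    entry_valid: bool = False   # is the register currently renamed?
    index: int = 0              # position in the physical register file


@dataclass
class PhysReg:
    value: int = 0
    value_valid: bool = False   # has the result been produced yet?


class IndexedRenamer:
    def __init__(self, num_arch: int, num_phys: int) -> None:
        self.table = [MapEntry() for _ in range(num_arch)]
        self.regs = [PhysReg() for _ in range(num_phys)]
        self.next_free = 0      # naive allocator, no recycling

    def allocate(self, arch_reg: int) -> int:
        idx = self.next_free
        self.next_free += 1
        self.regs[idx] = PhysReg()               # value not yet valid
        self.table[arch_reg] = MapEntry(True, idx)
        return idx

    def write(self, idx: int, value: int) -> None:
        self.regs[idx] = PhysReg(value, True)

    def read(self, arch_reg: int) -> Optional[int]:
        entry = self.table[arch_reg]             # step 1: look up index
        if not entry.entry_valid:
            return None                          # no renamed copy exists
        reg = self.regs[entry.index]             # step 2: read value
        return reg.value if reg.value_valid else None


r = IndexedRenamer(num_arch=32, num_phys=64)
p = r.allocate(5)
print(r.read(5))   # None: the value of R5 is still pending
r.write(p, 42)
print(r.read(5))   # 42
```

In this sketch `read` returns `None` both when no mapping exists and when the value is pending; a real design would distinguish the two cases.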

Renaming with associative access
[Figure: the register number is matched against all entries of the associative physical register file (fields: entry valid, reg num, value, value valid, latest).]
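
A hedged sketch of the single-step associative lookup follows, with the fields named on the slide. The role given to the `latest` bit here (marking the most recent rename of a register number) is an assumption drawn from its name, and the `AssociativeRenamer` class is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    entry_valid: bool = False
    reg_num: int = 0
    value: int = 0
    value_valid: bool = False
    latest: bool = False   # marks the most recent rename of reg_num


class AssociativeRenamer:
    def __init__(self, size: int) -> None:
        self.entries = [Entry() for _ in range(size)]

    def allocate(self, reg_num: int) -> int:
        for e in self.entries:               # older renames are no longer latest
            if e.entry_valid and e.reg_num == reg_num:
                e.latest = False
        for i, e in enumerate(self.entries):
            if not e.entry_valid:
                self.entries[i] = Entry(True, reg_num, 0, False, True)
                return i
        raise RuntimeError("no free rename entry")

    def write(self, i: int, value: int) -> None:
        self.entries[i].value = value
        self.entries[i].value_valid = True

    def lookup(self, reg_num: int) -> Optional[Entry]:
        # Hardware compares all entries in parallel; the loop models that match.
        for e in self.entries:
            if e.entry_valid and e.latest and e.reg_num == reg_num:
                return e
        return None


r = AssociativeRenamer(size=4)
a = r.allocate(5)
r.write(a, 10)
b = r.allocate(5)
print(r.lookup(5).value_valid)  # False: the latest rename of R5 is still pending
r.write(b, 20)
print(r.lookup(5).value)        # 20
```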

Handling interrupts
[Figure: instructions listed in program order with their status of execution at the time of interrupt (completed, under execution, not started); a label marks the instructions that can “commit”.]

Speculative execution
[Figure: instruction stream with a predicted branch followed by a region of speculative execution.]
- Don’t commit till correctness of prediction is determined

Reordering
[Figure: reorder buffer; instructions enter at one end and commit/retire at the other; entries are marked with their state.]
- i: issued
- x: in execution
- f: finished
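
The behaviour suggested by this slide can be sketched as a queue in which instructions enter in program order, finish in any order, and commit only from the head. The `ReorderBuffer` class and its method names are hypothetical; the state letters follow the slide's legend.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class RobEntry:
    name: str
    state: str = "i"    # "i": issued, "x": in execution, "f": finished


class ReorderBuffer:
    def __init__(self) -> None:
        self.entries: Deque[RobEntry] = deque()   # head = oldest instruction

    def enter(self, name: str) -> RobEntry:
        entry = RobEntry(name)
        self.entries.append(entry)                # enter in program order
        return entry

    def commit(self) -> List[str]:
        """Retire finished instructions from the head, in program order."""
        retired = []
        while self.entries and self.entries[0].state == "f":
            retired.append(self.entries.popleft().name)
        return retired


rob = ReorderBuffer()
a, b, c = rob.enter("A"), rob.enter("B"), rob.enter("C")
b.state = "f"                 # B finishes out of order
print(rob.commit())           # []: A is older and has not finished
a.state = "f"
print(rob.commit())           # ['A', 'B']: C stays in the buffer
```

Because a finished instruction waits behind every older unfinished one, the committed state always corresponds to a prefix of the program, which is what precise interrupts require.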

Using ROB with RF
[Figure: block diagram with a Register File and paths labelled "to reservation stations/FUs" and "from FUs".]

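Future file and history file
[Figure, future file: blocks labelled Register File, ROB and Future File; results arrive from FUs, the Future File feeds the reservation stations/FUs, and the Register File is marked "use in case of interrupts".]
[Figure, history file: blocks labelled History File and Future File; results arrive from FUs, the Future File feeds the reservation stations/FUs, displaced values go to the History File, which is marked "update in case of interrupts".]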
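
A minimal sketch of the history file idea, under stated assumptions: results update the future file as soon as they arrive, the displaced value is saved in the history file, and on an interrupt the saved values are restored for the interrupting instruction and everything after it in program order. The `HistoryFileMachine` class and the `seq` program-order numbers are hypothetical, the sketch assumes writes to the same register are recorded in program order, and removal of history entries at commit is not modelled.

```python
from typing import Dict, List, Tuple


class HistoryFileMachine:
    def __init__(self, regs: Dict[str, int]) -> None:
        self.future = dict(regs)                        # updated as results arrive
        self.history: List[Tuple[int, str, int]] = []   # (seq, reg, displaced value)

    def write(self, seq: int, reg: str, value: int) -> None:
        # Save the displaced value before the future file is overwritten.
        self.history.append((seq, reg, self.future[reg]))
        self.future[reg] = value

    def interrupt(self, seq: int) -> None:
        """Undo writes of instructions at or after program position `seq`."""
        kept = []
        for entry_seq, reg, old in reversed(self.history):
            if entry_seq >= seq:
                self.future[reg] = old      # restore the displaced value
            else:
                kept.append((entry_seq, reg, old))
        self.history = kept[::-1]


m = HistoryFileMachine({"R1": 1, "R2": 2})
m.write(seq=10, reg="R1", value=100)
m.write(seq=12, reg="R2", value=200)   # finishes before instruction 11
m.write(seq=11, reg="R1", value=111)
m.interrupt(seq=11)                    # instruction 11 raises an interrupt
print(m.future)                        # {'R1': 100, 'R2': 2}
```

After the rollback the future file holds exactly the effects of instructions before position 11, even though instruction 12 had already written its result.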

Combining renaming and reordering
- Use physical register file as ROB as well
- Maintain status about committed and uncommitted values

How much to speculate?
- Handle exceptions in speculated instructions?
  - handle only low cost exception events such as first level cache miss
  - wait if expensive exceptional event occurs such as second level cache miss or TLB miss
- Speculating through multiple branches
  - needed when branches are frequent or clustered
  - even handling multiple branches in a cycle may be required