Tomasulo With Reorder buffer:

Presentation transcript:

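Tomasulo With Reorder buffer:
[Figure: datapath with FP Op Queue, Registers, Reorder Buffer (ROB1 oldest to ROB7 newest, each entry with Dest and Done? fields), Reservation Stations feeding the FP adders and FP multipliers, and memory buffers with paths "To Memory" and "from Memory". Annotations: "Resolve RAW memory conflict? (address in memory buffers)" and "Integer unit executes in parallel".]
Reorder buffer:
  ROB1 (oldest): LD F0,10(R2)     Dest F0    Done? N
Reservation stations: none in use
Load buffers:
  Dest 1: 10+R2
4/14/2017 CS252 S06 Lec8 ILPB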

Tomasulo With Reorder buffer:
Reorder buffer (newest to oldest):
  ROB2: ADDD F10,F4,F0            Dest F10   Done? N
  ROB1 (oldest): LD F0,10(R2)     Dest F0    Done? N
Reservation stations:
  Dest 2: ADDD R(F4),ROB1
Load buffers:
  Dest 1: 10+R2
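
The operand tags in the reservation station (R(F4) read from the register file, ROB1 standing in for the pending load) come from renaming at issue. A minimal sketch of that step, assuming a simple rename table from register name to ROB tag; the class and method names are hypothetical and not taken from the slides:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    instr: str
    dest: Optional[str]   # None for instructions with no register result
    done: bool = False


class Renamer:
    """Hypothetical issue-stage model: allocates ROB entries and renames operands."""

    def __init__(self):
        self.rob = []      # index 0 is ROB1 (oldest)
        self.rename = {}   # register -> ROB tag of its newest pending writer

    def read_operand(self, reg):
        # A pending producer hands out its ROB tag; otherwise read the register file.
        return self.rename.get(reg, f"R({reg})")

    def issue(self, instr, dest, srcs):
        # Sources are looked up before the destination is renamed.
        operands = [self.read_operand(r) for r in srcs]
        self.rob.append(RobEntry(instr, dest))
        tag = f"ROB{len(self.rob)}"
        if dest is not None:
            self.rename[dest] = tag
        return tag, operands


r = Renamer()
print(r.issue("LD F0,10(R2)", "F0", ["R2"]))
print(r.issue("ADDD F10,F4,F0", "F10", ["F4", "F0"]))   # F0 is renamed to ROB1
```

The sketch never clears a rename-table entry; a fuller model would do so when the producing entry commits.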

Tomasulo With Reorder buffer:
Reorder buffer (newest to oldest):
  ROB3: DIVD F2,F10,F6            Dest F2    Done? N
  ROB2: ADDD F10,F4,F0            Dest F10   Done? N
  ROB1 (oldest): LD F0,10(R2)     Dest F0    Done? N
Reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 3: DIVD ROB2,R(F6)
Load buffers:
  Dest 1: 10+R2

Tomasulo With Reorder buffer:
Reorder buffer (newest to oldest):
  ROB6: ADDD F0,F4,F6             Dest F0    Done? N
  ROB5: LD F4,0(R3)               Dest F4
  ROB4: BNE F2,<…>                Dest --    Done? N
  ROB3: DIVD F2,F10,F6            Dest F2    Done? N
  ROB2: ADDD F10,F4,F0            Dest F10   Done? N
  ROB1 (oldest): LD F0,10(R2)     Dest F0    Done? N
Reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD ROB5,R(F6)
  Dest 3: DIVD ROB2,R(F6)
Load buffers:
  Dest 1: 10+R2
  Dest 5: 0+R3

Tomasulo With Reorder buffer:
Reorder buffer (newest to oldest):
  ROB7: ST 0(R3),F4               Dest --    Value ROB5
  ROB6: ADDD F0,F4,F6             Dest F0    Done? N
  ROB5: LD F4,0(R3)               Dest F4
  ROB4: BNE F2,<…>                Dest --    Done? N
  ROB3: DIVD F2,F10,F6            Dest F2    Done? N
  ROB2: ADDD F10,F4,F0            Dest F10   Done? N
  ROB1 (oldest): LD F0,10(R2)     Dest F0    Done? N
Reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD ROB5,R(F6)
  Dest 3: DIVD ROB2,R(F6)
Load buffers:
  Dest 1: 10+R2
  Dest 5: 0+R3

Tomasulo With Reorder buffer:
Reorder buffer (newest to oldest):
  ROB7: ST 0(R3),F4               Dest --    Value M[10]   Done? Y
  ROB6: ADDD F0,F4,F6             Dest F0                  Done? N
  ROB5: LD F4,0(R3)               Dest F4
  ROB4: BNE F2,<…>                Dest --                  Done? N
  ROB3: DIVD F2,F10,F6            Dest F2                  Done? N
  ROB2: ADDD F10,F4,F0            Dest F10                 Done? N
  ROB1 (oldest): LD F0,10(R2)     Dest F0                  Done? N
Reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 6: ADDD M[10],R(F6)
  Dest 3: DIVD ROB2,R(F6)
Load buffers:
  Dest 1: 10+R2
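
In this step the load tagged ROB5 has returned M[10], and the ADDD waiting in reservation station 6 now holds M[10] in place of the ROB5 tag. A minimal sketch of that result broadcast, assuming reservation stations are plain dictionaries and that pending operands are strings starting with "ROB" (both are illustrative choices, not the slides' hardware):

```python
def broadcast(stations, tag, value):
    """Replace every operand that is waiting on `tag` with the produced value."""
    for rs in stations:
        rs["operands"] = [value if op == tag else op for op in rs["operands"]]
    return stations


def ready(rs):
    """A station can start executing once no operand is still a ROB tag."""
    return not any(isinstance(op, str) and op.startswith("ROB")
                   for op in rs["operands"])


stations = [
    {"dest": "ROB2", "op": "ADDD", "operands": ["R(F4)", "ROB1"]},
    {"dest": "ROB6", "op": "ADDD", "operands": ["ROB5", "R(F6)"]},
    {"dest": "ROB3", "op": "DIVD", "operands": ["ROB2", "R(F6)"]},
]
broadcast(stations, "ROB5", "M[10]")
print([rs["dest"] for rs in stations if ready(rs)])   # only the ROB6 ADDD is ready
```

The stations for ROB2 and ROB3 stay blocked because they still wait, directly or indirectly, on the first load (ROB1).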

Tomasulo With Reorder buffer:
Reorder buffer (newest to oldest):
  ROB7: ST 0(R3),F4               Dest --    Value M[10]    Done? Y
  ROB6: ADDD F0,F4,F6             Dest F0    Value <val2>   Done? Ex
  ROB5: LD F4,0(R3)               Dest F4
  ROB4: BNE F2,<…>                Dest --                   Done? N
  ROB3: DIVD F2,F10,F6            Dest F2                   Done? N
  ROB2: ADDD F10,F4,F0            Dest F10                  Done? N
  ROB1 (oldest): LD F0,10(R2)     Dest F0                   Done? N
Reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 3: DIVD ROB2,R(F6)
Load buffers:
  Dest 1: 10+R2
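
Although the store in ROB7 is marked done, nothing can retire yet: results leave the reorder buffer in program order, and the oldest entry (the load in ROB1, still waiting in load buffer 1) has not finished. A minimal sketch of in-order commit, assuming ROB entries are dictionaries held in a deque with the oldest entry at the left; the store's memory write at commit and branch handling are left out:

```python
from collections import deque


def commit(rob, regs):
    """Retire finished entries from the head (oldest) only; stop at the first unfinished one."""
    retired = []
    while rob and rob[0]["done"]:
        entry = rob.popleft()
        if entry["dest"] is not None:
            regs[entry["dest"]] = entry["value"]   # architectural state changes only here
        retired.append(entry["instr"])
    return retired


rob = deque([
    {"instr": "LD F0,10(R2)",   "dest": "F0",  "value": None,    "done": False},
    {"instr": "ADDD F10,F4,F0", "dest": "F10", "value": None,    "done": False},
    {"instr": "ST 0(R3),F4",    "dest": None,  "value": "M[10]", "done": True},
])
regs = {}
print(commit(rob, regs))   # nothing retires while the head is unfinished
```

Because registers are written only at commit, squashing everything behind a mispredicted branch or a faulting instruction leaves the architectural state precise.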

Tomasulo With Reorder buffer:
What about memory hazards???
Reorder buffer (newest to oldest):
  ROB7: ST 0(R3),F4               Dest --    Value M[10]    Done? Y
  ROB6: ADDD F0,F4,F6             Dest F0    Value <val2>   Done? Ex
  ROB5: LD F4,0(R3)               Dest F4
  ROB4: BNE F2,<…>                Dest --                   Done? N
  ROB3: DIVD F2,F10,F6            Dest F2                   Done? N
  ROB2: ADDD F10,F4,F0            Dest F10                  Done? N
  ROB1 (oldest): LD F0,10(R2)     Dest F0                   Done? N
Reservation stations:
  Dest 2: ADDD R(F4),ROB1
  Dest 3: DIVD ROB2,R(F6)
Load buffers:
  Dest 1: 10+R2
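
The figure's annotation "Resolve RAW memory conflict? (address in memory buffers)" points at one way to answer the question: compare a load's address with the addresses of older stores that have not yet written memory. A minimal sketch of one conservative policy, assumed here for illustration and not stated on the slide: the load waits if any older uncommitted store has the same address or an address that is not yet known (represented as None).

```python
def load_may_access_memory(load_addr, older_store_addrs):
    """Conservative RAW check for memory.

    older_store_addrs holds the effective addresses of uncommitted stores that
    precede the load in program order; None means the address is not computed yet.
    """
    return all(addr is not None and addr != load_addr
               for addr in older_store_addrs)


print(load_may_access_memory(0x100, [0x200, 0x300]))   # no conflict
print(load_may_access_memory(0x100, [0x200, 0x100]))   # matching older store
print(load_may_access_memory(0x100, [None]))           # unknown store address
```

Memory WAW and WAR hazards need no such check if stores update memory only at commit, since commits happen in program order.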