Published by Landon Green
Modified about 1 year ago
1 Dynamic scheduling
Kosarev Nikolay, MIPT, Apr 2010
2 Agenda
    In-order execution
    Out-of-order execution. Tomasulo’s algorithm
    Implementation in hardware
    Demo
    Hardware speculation
    Demo
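3 In-order execution
Data hazards: RAW, WAW. No WAR.
[Figure: pipeline]
RAW example (ADD reads R1, which DIV writes):
    DIV R1 = R2, R3
    ADD R9 = R1, R4
    SUB R8 = R4, R5
WAW example (DIV and ADD both write R1; the code itself is not meaningful):
    DIV R1 = R2, R3
    ADD R1 = R2, R4
    SUB R6 = R1, R5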
4 Out-of-order execution
Split ID into 2 stages:
    Issue (IS): decode, check for structural hazards
    Read operands (RO): wait until there are no data hazards, then read operands
[Figure: pipeline]
Out-of-order execution implies out-of-order completion (WB)
Hazards: RAW, WAW, WAR
    DIV R0 = R2, R4
    ADD R6 = R0, R8
    SUB R8 = R10, R14
    MUL R6 = R10, R8
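The three hazard kinds on this slide can be checked mechanically. The following is a minimal sketch, not part of the original deck: `Instr` and `find_hazards` are hypothetical names, and the check is deliberately conservative (it compares every ordered pair of instructions and ignores intervening writes).

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, several sources."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


def find_hazards(program):
    """Return (kind, earlier_index, later_index, register) for each pair."""
    hazards = []
    for j, later in enumerate(program):
        for i in range(j):
            earlier = program[i]
            if earlier.dest in later.srcs:      # later reads what earlier writes
                hazards.append(("RAW", i, j, earlier.dest))
            if earlier.dest == later.dest:      # both write the same register
                hazards.append(("WAW", i, j, earlier.dest))
            if later.dest in earlier.srcs:      # later overwrites an earlier source
                hazards.append(("WAR", i, j, later.dest))
    return hazards


# The four-instruction sequence from the slide.
program = [
    Instr("DIV", "R0", ("R2", "R4")),
    Instr("ADD", "R6", ("R0", "R8")),
    Instr("SUB", "R8", ("R10", "R14")),
    Instr("MUL", "R6", ("R10", "R8")),
]

for kind, i, j, reg in find_hazards(program):
    print(f"{kind} on {reg}: {program[i].op} -> {program[j].op}")
```

On the slide's sequence this reports a RAW on R0 (DIV, ADD), a WAR on R8 (ADD, SUB), a WAW on R6 (ADD, MUL), and a RAW on R8 (SUB, MUL).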
5 Tomasulo’s algorithm
How are data hazards avoided?
    RAW: wait until the operands are available
    WAR, WAW: register renaming
Before renaming:
    DIV R0 = R2, R4
    ADD R6 = R0, R8
    ADD R9 = R6, R1
    SUB R8 = R10, R14
    MUL R6 = R10, R8
After renaming (A and B are new register names):
    DIV R0 = R2, R4
    ADD A = R0, R8
    ADD R9 = A, R1
    SUB B = R10, R14
    MUL R6 = R10, B
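Register renaming can be sketched in a few lines. The slide renames only the two registers that cause conflicts (R6 becomes A, R8 becomes B); the sketch below instead assumes the simpler policy of giving every destination a fresh name (`P0`, `P1`, ...), which removes the same WAR and WAW hazards. The function name `rename` and the `P` prefix are illustrative assumptions.

```python
def rename(program):
    """Give every destination a fresh name; sources read the latest mapping."""
    mapping = {}   # architectural register -> most recent renamed register
    renamed = []
    for n, (op, dest, srcs) in enumerate(program):
        # Sources are looked up first, so they see earlier writers only.
        new_srcs = tuple(mapping.get(s, s) for s in srcs)
        new_dest = f"P{n}"          # assumed naming scheme for fresh registers
        mapping[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed


# The five-instruction sequence from the slide, as (op, dest, sources).
program = [
    ("DIV", "R0", ("R2", "R4")),
    ("ADD", "R6", ("R0", "R8")),
    ("ADD", "R9", ("R6", "R1")),
    ("SUB", "R8", ("R10", "R14")),
    ("MUL", "R6", ("R10", "R8")),
]

for op, dest, srcs in rename(program):
    print(f"{op} {dest} = {', '.join(srcs)}")
```

After renaming, no two instructions share a destination and no instruction writes a register that an earlier one reads, so only the true (RAW) dependences remain.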
6 Implementation in HW
7 Demo
Tomasulo's algorithm for dynamic scheduling
    LD F6 = R2, 2
    LD F2 = R3, 4
    MUL F0 = F2, F4
    SUB F8 = F2, F6
    DIV F10 = F0, F6
    ADD F6 = F8, F2
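The demo sequence can be approximated with a small timing model. This is a rough sketch under stated assumptions, not the demo from the deck: the latencies in `LATENCY` are invented for illustration, one instruction issues per cycle, reservation stations and functional units are unlimited, there is no contention for the result bus, and load offsets are omitted. Each instruction captures its producers at issue time, which is what the tag-based renaming in Tomasulo's algorithm achieves.

```python
# Assumed execution latencies in cycles (illustrative values only).
LATENCY = {"LD": 2, "ADD": 2, "SUB": 2, "MUL": 10, "DIV": 40}


def schedule(program):
    """Return (op, dest, issue, start, write) cycles for each instruction."""
    ready = {}      # register -> cycle in which its newest producer broadcasts
    timeline = []
    for n, (op, dest, srcs) in enumerate(program):
        issue = n + 1                                  # in-order, one per cycle
        operands_at = max((ready.get(s, 0) for s in srcs), default=0)
        start = max(issue, operands_at) + 1            # wait for RAW operands
        write = start + LATENCY[op]                    # result broadcast cycle
        ready[dest] = write                            # later readers wait on us
        timeline.append((op, dest, issue, start, write))
    return timeline


# The demo sequence from the slide, as (op, dest, register sources).
program = [
    ("LD", "F6", ("R2",)),
    ("LD", "F2", ("R3",)),
    ("MUL", "F0", ("F2", "F4")),
    ("SUB", "F8", ("F2", "F6")),
    ("DIV", "F10", ("F0", "F6")),
    ("ADD", "F6", ("F8", "F2")),
]

for op, dest, issue, start, write in schedule(program):
    print(f"{op:3} {dest:3} issue={issue} start={start} write={write}")
```

With these assumed latencies the final ADD writes F6 long before the earlier DIV finishes. That is safe because DIV recorded the first load as its source of F6 when it issued, so the WAR hazard on F6 does not stall anything.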
8 Hardware speculation
Based on 3 key ideas:
    Dynamic branch prediction
    Speculative execution
    Dynamic scheduling
Extra stage: instruction commit
New buffer: ROB (reorder buffer)
[Figure: pipeline]
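The role of the commit stage and the reorder buffer can be sketched as a queue that accepts results in any order but updates the register file strictly in program order. The class below is a simplified illustration with hypothetical names (`ReorderBuffer`, `allocate`, `complete`, `commit`); it leaves out branch misprediction recovery, stores, and exceptions.

```python
from collections import deque


class ReorderBuffer:
    """Simplified ROB: out-of-order completion, in-order commit."""

    def __init__(self):
        self.entries = deque()   # oldest instruction sits at the left end
        self.next_tag = 0

    def allocate(self, dest):
        """Reserve an entry at issue time and return its tag."""
        entry = {"tag": self.next_tag, "dest": dest, "value": None, "done": False}
        self.entries.append(entry)
        self.next_tag += 1
        return entry["tag"]

    def complete(self, tag, value):
        """Record a result; execution may finish in any order."""
        for entry in self.entries:
            if entry["tag"] == tag:
                entry["value"], entry["done"] = value, True
                return
        raise KeyError(tag)

    def commit(self, regfile):
        """Retire finished instructions from the head only; return their tags."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            regfile[entry["dest"]] = entry["value"]
            committed.append(entry["tag"])
        return committed


regfile = {}
rob = ReorderBuffer()
t_div = rob.allocate("R0")
t_add = rob.allocate("R6")
t_sub = rob.allocate("R8")

rob.complete(t_sub, 7)            # youngest instruction finishes first
print(rob.commit(regfile))        # nothing retires: the head is not done yet
rob.complete(t_div, 3)
print(rob.commit(regfile))        # only the head retires
rob.complete(t_add, 5)
print(rob.commit(regfile))        # the remaining two retire in program order
```

Because the register file changes only at commit, discarding every uncommitted entry is enough to undo speculative work after a mispredicted branch.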
9 Hardware speculation
10 Demo Reorder buffer