Published by Braden Cousens
Modified about 1 year ago
© A. Moshovos (ECE, Toronto) ECE1773 – Spring 2002

ILP, cont.
Maintaining Sequential Appearance
  – Precise Interrupts
  – RUU approach to OoO Scheduling

Superscalar Processors: The Big Picture
Program form: static program, dynamic inst. stream (trace), execution window, completed instructions
Processing phase: fetch and CT prediction, dispatch/dataflow, inst. issue, inst. execution, inst. reorder & commit

A Generic Superscalar OOO Processor
[Figure: block diagram with pre-decode, I-cache, buffer, rename, dispatch, scheduler, reorder buffer, RF, FUs, memory interface]

Maintaining Sequential Semantics
What if execution gets interrupted at an arbitrary point?
  – All insts. before that point commit
  – None thereafter
We’ll focus on interrupts
The same mechanisms are used today to support SPECULATIVE EXECUTION
“Definition”: An instr. executes speculatively up to complete. We don’t know yet if we should have executed this instr. Verification happens at commit (if ever).

Interrupts
Examples
  – Power failing, arithmetic overflow
  – I/O device request, OS call, page fault
  – Invalid opcode, breakpoint, protection violation
Aka faults, exceptions, or traps
Requirements
  – Surprise jump (to vectored address)
  – Linking return address
  – Saving state
  – Changing state (e.g., kernel mode)

Classifying Interrupts
1a. Synchronous
  – Function of program state
  – Overflow, page fault, etc.
1b. Asynchronous
  – E.g., external device or malfunction
2a. User Request
  – OS call
2b. Coerced
  – From OS or hardware
  – Page fault, protection violation
3a. User Maskable
  – User can disable processing
3b. Non-Maskable
  – Guess!!!

Classifying Interrupts, contd.
4a. Between Instructions
  – Usually asynchronous
4b. Within an Instruction
  – Usually synchronous
  – Harder to deal with, why???
5a. Resume
  – As if nothing happened as far as the program is concerned
5b. Catastrophic
  – Say bye-bye, the program is leaving us

Restartable Pipelines
Interrupts within an instruction are not catastrophic
Most machines support this
  – Needed for virtual memory
Some machines did not support this
  – Cost & slowdown
PRECISE INTERRUPTS are key
  – As if the interrupt happened at a well-defined point in the original sequential order
  – First let’s consider a simple DLX-style pipeline

Precise Interrupts
Sequential semantics
  – Complete instructions before the offending instruction
  – Squash (effects of) instructions after
  – Save PC
  – Force trap instruction into FETCH stage: divert execution to interrupt handler

Precise Interrupts
Jim Smith and Andrew Pleszkun paper
Original work was for a “simple” pipeline
Today the same principles are used in virtually all modern microprocessors
  – Support for SPECULATIVE EXECUTION: executing an instruction without knowing whether we should (more on this later)
  – And, of course, precise interrupts
We’ll stick to precise interrupts for the time being

Do the Simple Thing First
Modify state only when all preceding insts. are KNOWN to be exception free
Mechanism: Result Shift Register
  – Stage = cycle
  – At FETCH: reserve all stages for the duration of the instruction
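
The reservation rule on this slide can be sketched as a toy model. This is only an illustration under stated assumptions: the class name `ResultShiftRegister`, its methods, and the latencies used are hypothetical and not taken from the slides. Stage k stands for "k cycles until writeback"; an instruction claims the stage matching its latency and also blocks every free earlier stage, so a younger, shorter instruction cannot write back before it.

```python
class ResultShiftRegister:
    """Toy result shift register (hypothetical model, not the original design)."""

    def __init__(self, depth):
        self.depth = depth
        # stages[k] describes what happens k+1 cycles from now.
        # Each slot is None (free) or a (name, is_writeback) pair.
        self.stages = [None] * depth

    def try_issue(self, name, latency):
        """Reserve stages for an instruction; return False if it must stall."""
        if not 1 <= latency <= self.depth:
            raise ValueError("latency must be between 1 and depth")
        slot = latency - 1
        if self.stages[slot] is not None:
            return False  # writeback slot already taken: stall this cycle
        self.stages[slot] = (name, True)
        # Block every free earlier stage so nothing younger completes first.
        for k in range(slot):
            if self.stages[k] is None:
                self.stages[k] = (name, False)
        return True

    def tick(self):
        """Advance one cycle; return the name of the instruction writing back, if any."""
        head = self.stages.pop(0)
        self.stages.append(None)
        if head is not None and head[1]:
            return head[0]
        return None


if __name__ == "__main__":
    rsr = ResultShiftRegister(depth=4)
    print(rsr.try_issue("DIV", 4))  # long-latency instruction reserves all stages
    print(rsr.try_issue("ADD", 1))  # younger short instruction has to wait
```

In this sketch the ADD can only issue once the DIV has written back, which is the latency amplification discussed on the next slide.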

Simple Solution Discussion
Essentially in-order completion
  – Simple: easy to implement
  – Performance?
      Execution overlap still possible
      Writebacks in order
      Amplifies latencies
      Dependent instructions wait longer

Allowing Out-of-Order Completes
Add one more state for instruction execution:
  – COMPLETE & COMMIT
COMPLETE:
  – Result calculated
  – Dependent instructions can use it
  – BUT, we don’t know if preceding instructions are all OK
  – I.e., we don’t know if this instruction should have executed now based on the original program order
COMMIT:
  – All preceding instructions executed with no problems
  – Can safely commit state changes

OOO Complete & IO Commit
Want: Out-of-Order Completion
  – Allow OOO completion
  – Maintain in-order COMMIT
  – Allow maximum overlap
  – Guarantee precise state if needed
How does this improve performance?
[Figure: timelines comparing In-Order Complete and OOO Complete for the sequence DIV R3, _, _ ; ADD R1, _, _ ; ADD _, R1, _ , with in-order commits marked]

Reorder Buffer
Result Shift Register:
  – Reserve result bus
  – Out-of-order completion
Reorder Buffer:
  – Defer commits and do them in order
  – Allow OOO completes by buffering state
[Figure: reorder buffer entries. Labels: motion; res = result; v = valid; e = result NYA; when to complete; when to commit]
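
A minimal sketch of the idea, assuming a FIFO of entries with a destination register, a result, a valid bit and an exception flag. The names `ReorderBuffer`, `allocate`, `complete` and `commit` are hypothetical; the slides do not give an implementation. Completion may happen in any order, but the register file is only updated from the head of the queue.

```python
from collections import deque


class ReorderBuffer:
    """Toy reorder buffer: out-of-order complete, in-order commit (illustrative only)."""

    def __init__(self):
        self.entries = deque()   # oldest instruction at the left
        self.regfile = {}        # architectural register file

    def allocate(self, dest):
        """Called in program order; reserves an entry at the tail."""
        entry = {"dest": dest, "result": None, "valid": False, "exception": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, result=None, exception=False):
        """Called in any order once the result (or an exception) is known."""
        entry["result"] = result
        entry["exception"] = exception
        entry["valid"] = True

    def commit(self):
        """Retire completed entries from the head, in order.

        Returns (list of committed destination registers, faulting entry or None).
        On an exception the faulting entry and everything younger is discarded,
        so the register file holds the precise state.
        """
        committed = []
        while self.entries and self.entries[0]["valid"]:
            head = self.entries.popleft()
            if head["exception"]:
                self.entries.clear()
                return committed, head
            self.regfile[head["dest"]] = head["result"]
            committed.append(head["dest"])
        return committed, None


if __name__ == "__main__":
    rob = ReorderBuffer()
    div = rob.allocate("R3")
    add = rob.allocate("R1")
    rob.complete(add, 7)       # the younger ADD completes first
    print(rob.commit())        # nothing commits yet: DIV is still at the head
    rob.complete(div, 2)
    print(rob.commit())        # both commit, in program order
```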

Reorder Buffer Complications
State is kept in the reorder buffer
Have to bypass from every entry
  – Need to determine the latest write w/ respect to the consuming instruction
[Figure: RF and RB]
Essentially:
1. In-order commits
2. Buffer speculative state till commit

Speculative State Updates
Two fundamental approaches
  – HISTORY BUFFER: do changes but keep a record of old state
      Everything OK? Just discard the record of changes
  – FUTURE FILE: keep two states, Architectural and Speculative
      On COMPLETE write state to Speculative
      On ISSUE read from Speculative
      On COMMIT write to Architectural
      On error, throw out Speculative state

History Buffer
Allow out-of-order register file updates
At decode record the current value of the target register in the history buffer (HB)
  – Notice that this is the previous value the register had
On commit?
  – Do nothing, state is fine
On exception?
  – Use history to UNDO changes made
[Figure: RF and HB. Labels: results, source operands, destination registers, exception]
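
The undo step can be sketched as follows. This is an illustration under assumptions, not the mechanism from the paper: the class `HistoryBufferMachine` and its methods are hypothetical, sequence numbers stand in for program order, and the sketch assumes a register has no pending older write when its old value is recorded at decode (as in a pipeline that stalls on that hazard).

```python
class HistoryBufferMachine:
    """Toy history buffer: update the register file early, undo on an exception."""

    def __init__(self, regs):
        self.rf = dict(regs)
        self.hb = []  # (seq, reg, old_value) in program order, oldest first

    def decode(self, seq, reg):
        """In program order: remember the value the target register holds now."""
        self.hb.append((seq, reg, self.rf[reg]))

    def complete(self, reg, value):
        """In any order: write the result straight into the register file."""
        self.rf[reg] = value

    def commit_oldest(self):
        """Common case: the oldest instruction is fine, just drop its record."""
        self.hb.pop(0)

    def rollback(self, faulting_seq):
        """Undo the faulting instruction and everything younger, youngest first."""
        while self.hb and self.hb[-1][0] >= faulting_seq:
            _, reg, old = self.hb.pop()
            self.rf[reg] = old


if __name__ == "__main__":
    m = HistoryBufferMachine({"R1": 10, "R2": 20, "R3": 30})
    m.decode(0, "R1")
    m.decode(1, "R2")
    m.decode(2, "R3")
    m.complete("R3", 33)   # youngest instruction finishes first
    m.complete("R1", 11)
    m.rollback(1)          # instruction 1 faults: undo 2, then 1
    print(m.rf)
```

The scan in `rollback` is the part the discussion slide calls a slow response to interrupts: its cost grows with the number of in-flight instructions.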

History Buffer Discussion
Simple mechanism
Additional register file port
Single source for input operands
Normal instruction processing not changed by much
  – Control mostly unchanged
  – Nothing to do on commit for the common case
Slow response to interrupts
  – Need to scan through HB
  – Complex?

Future File: The Optimist’s View
Two register files:
  – One updated out-of-order (FUTURE): assume no exception will occur
  – One updated in order (ARCHITECTURAL)
Advantage: no delay to restore state on exception
[Figure: RF, RB and FF. Labels: source operands, results]
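
As a rough sketch of the two-file arrangement, assuming plain dictionaries for both files: the names `FutureFileMachine`, `complete`, `commit`, `read_operand` and `recover` are hypothetical. Copying the architectural file over the future file in `recover` is one simple way to model recovery here; the slides do not say how the hardware does it.

```python
class FutureFileMachine:
    """Toy future file: a speculative register file next to the architectural one."""

    def __init__(self, regs):
        self.arch = dict(regs)     # updated in order, at commit
        self.future = dict(regs)   # updated out of order, at complete

    def complete(self, reg, value):
        """Out of order: optimistically update the future file."""
        self.future[reg] = value

    def read_operand(self, reg):
        """Issuing instructions read the most recent (speculative) value."""
        return self.future[reg]

    def commit(self, reg, value):
        """In order: the value becomes architectural state."""
        self.arch[reg] = value

    def recover(self):
        """On an exception: drop speculative state; the architectural file is already precise."""
        self.future = dict(self.arch)


if __name__ == "__main__":
    m = FutureFileMachine({"R1": 1, "R2": 2})
    m.complete("R2", 20)           # younger instruction completes early
    print(m.read_operand("R2"))    # consumers see the speculative value
    m.recover()                    # an older instruction faulted
    print(m.read_operand("R2"))    # back to the architectural value
```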

How Do These Relate to Register Renaming?
Physical registers provide sufficient storage for both speculative and architectural state
It’s the register map table that determines what the current state is
On interrupt we have to restore the map table
  – Values are there in the physical register file
History and Future approaches are still valid
  – History: keep track of changes to the register map table; on interrupt undo them one by one
  – Future: keep two tables
      Speculative: updated at decode
      Architectural: updated at commit
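
The two-table ("future") variant might look like the sketch below. Everything here is an assumption made for illustration: the class `RenameMaps`, the simple free list, and the policy of freeing the previous architectural mapping at commit are not specified in the slides.

```python
class RenameMaps:
    """Toy rename stage with a speculative and an architectural map table."""

    def __init__(self, arch_regs, num_phys):
        self.num_phys = num_phys
        # Initially architectural register i lives in physical register i.
        self.spec = {r: i for i, r in enumerate(arch_regs)}
        self.arch = dict(self.spec)
        self.free = list(range(len(arch_regs), num_phys))

    def rename(self, srcs, dest):
        """At decode: look up sources, then give dest a fresh physical register."""
        phys_srcs = [self.spec[s] for s in srcs]
        new_phys = self.free.pop(0)
        self.spec[dest] = new_phys
        return phys_srcs, new_phys

    def commit(self, dest, phys):
        """At commit: the mapping becomes architectural; the old register is freed."""
        old_phys = self.arch[dest]
        self.arch[dest] = phys
        self.free.append(old_phys)

    def recover(self):
        """On interrupt: restore the map; values are still in the physical registers."""
        self.spec = dict(self.arch)
        in_use = set(self.arch.values())
        self.free = [p for p in range(self.num_phys) if p not in in_use]


if __name__ == "__main__":
    m = RenameMaps(["R1", "R2"], num_phys=4)
    print(m.rename(["R2"], "R1"))   # R1 gets a new physical register
    print(m.rename(["R1"], "R2"))   # reads the renamed R1
    m.commit("R1", 2)
    m.recover()                     # squash the uncommitted write to R2
    print(m.spec)
```

Note that `recover` moves no register values; only the table and the free list change, which is the point the slide makes about the physical register file.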

RUU
Sohi’s paper
Common mechanism for precise interrupts and OOO execution
Register Update Unit
  – A collection of reservation stations
  – Organized as a FIFO queue
  – Instructions enter in order at FETCH
  – They exit in order at COMMIT; register file updates happen at this point
SimpleScalar follows this model
  – Well, mostly
  – Cuts corners on when completes become visible

RUU: OOO Execution
Decode:
  – Check RUU for the most recent write to the register
  – If none found, read value from RF (do it in parallel, really)
  – If found, link to producer with a TAG; the RUU number is the TAG
Issue:
  – Wait till all input operands are ready
Complete:
  – Broadcast value and RUU ID; waiting instructions will pick the value up
Commit:
  – Head and tail pointer for FIFO operation
  – Only when everyone before has committed
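
The four steps above can be sketched in a few lines. This is a simplified illustration, not Sohi's design or SimpleScalar's code: the class `RUU` and its methods are hypothetical, tags are a running counter rather than RUU slot numbers, operations are Python callables, and a consumer decoded after its producer has completed is assumed to read the value directly from the producer's entry.

```python
import operator


class RUU:
    """Toy register update unit: FIFO of reservation-station-like entries."""

    def __init__(self, regfile):
        self.rf = dict(regfile)
        self.entries = []      # oldest first
        self.next_tag = 0

    def decode(self, op, dest, srcs):
        """In order: find each source's most recent producer, else read the RF."""
        operands = []
        for src in srcs:
            producer = next(
                (e for e in reversed(self.entries) if e["dest"] == src), None)
            if producer is None:
                operands.append(("value", self.rf[src]))
            elif producer["done"]:
                operands.append(("value", producer["result"]))
            else:
                operands.append(("tag", producer["tag"]))  # wait for broadcast
        entry = {"tag": self.next_tag, "op": op, "dest": dest,
                 "operands": operands, "result": None, "done": False}
        self.next_tag += 1
        self.entries.append(entry)
        return entry

    def ready(self, entry):
        """Issue condition: every input operand holds a value."""
        return all(kind == "value" for kind, _ in entry["operands"])

    def execute(self, entry):
        """Complete: compute the result and broadcast (tag, value) to waiters."""
        if not self.ready(entry):
            raise RuntimeError("operands not ready")
        entry["result"] = entry["op"](*(v for _, v in entry["operands"]))
        entry["done"] = True
        for other in self.entries:
            other["operands"] = [
                ("value", entry["result"]) if o == ("tag", entry["tag"]) else o
                for o in other["operands"]]

    def commit(self):
        """In order from the head: update the register file."""
        while self.entries and self.entries[0]["done"]:
            head = self.entries.pop(0)
            self.rf[head["dest"]] = head["result"]

    def flush(self):
        """Interrupt: discard all uncommitted entries."""
        self.entries.clear()


if __name__ == "__main__":
    ruu = RUU({"R1": 1, "R2": 2, "R3": 0, "R4": 0})
    a = ruu.decode(operator.add, "R3", ["R1", "R2"])
    b = ruu.decode(operator.mul, "R4", ["R3", "R2"])  # waits on a's tag
    ruu.execute(a)
    ruu.execute(b)
    ruu.commit()
    print(ruu.rf)
```

The reversed scan in `decode` plays the role of the associative lookup that the next slide lists as a cost of this organization.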

Where is the Rename Table?
It’s the RUU
  – At decode, insts. scan for the most recent update to the register
  – If none found, then the register is in the register file
  – Otherwise, get the RUU entry # as tag
Interrupts?
  – Simply flush the RUU
Pros/Cons:
  – Associative lookup for decode
  – RUU ports limit when consumers can read a value