Manchester Mark I, 1949. This was the second machine built at Manchester (the first was a small-scale prototype). A production version of this computer.


1 Manchester Mark I, 1949. This was the second machine built at Manchester (the first was a small-scale prototype). A production version of this computer was sold by Ferranti. The logic was implemented with 4200 vacuum tubes, which resulted in a great deal of downtime. A transistorized prototype was built in 1953.

2 COMP 206: Computer Architecture and Implementation
Montek Singh
Tue, Feb 10, 2009
Topic: Pipelining III (Control Hazards)

3 Overview
1. Data hazards that require stalls
2. Control hazards
   - Branch delay
3. Dealing with exceptions
4. Multiple functional units
   - Floating point unit, for example

4 Data Hazard Example
- Consider:
    LD   R1, 0(R2)    ; Load R1
    DSUB R4, R1, R5   ; Use R1
    AND  R6, R1, R7
    OR   R8, R1, R9
- Let’s look at the pipeline diagram
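The dependences in this sequence can be found mechanically. The following is a minimal Python sketch, not part of the original slides; the `Instr` tuple and the hand-written encoding of the four instructions are assumptions made for illustration.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    op: str
    dest: Optional[str]      # register written, if any
    srcs: Tuple[str, ...]    # registers read

# Hypothetical encoding of the four instructions on the slide.
program = [
    Instr("LD",   "R1", ("R2",)),
    Instr("DSUB", "R4", ("R1", "R5")),
    Instr("AND",  "R6", ("R1", "R7")),
    Instr("OR",   "R8", ("R1", "R9")),
]

def raw_dependencies(prog):
    """Return (producer_index, consumer_index, register) for each RAW dependence."""
    deps = []
    for j, consumer in enumerate(prog):
        for src in consumer.srcs:
            # Walk backwards to the most recent earlier writer of this register.
            for i in range(j - 1, -1, -1):
                if prog[i].dest == src:
                    deps.append((i, j, src))
                    break
    return deps

for producer, consumer, reg in raw_dependencies(program):
    print(f"{program[consumer].op} reads {reg} written by {program[producer].op}")
```

All three later instructions depend on the load through R1; whether each dependence turns into a stall depends on how far the consumer is behind the load.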

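5 Data Needed from the Future!
A problem for even the most advanced hardware. (The load’s value is not available until the end of its MEM stage, but DSUB needs it at the start of its EX stage, which is the same cycle.)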

6 Stall is Necessary
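A rough way to see why one stall is needed after a load, but none after an ALU instruction, is to compare when a value is produced with when the consumer needs it. This is a sketch under stated assumptions (classic 5-stage pipeline, full forwarding); the function name `stalls_needed` is made up for illustration.

```python
# Stage positions in the classic 5-stage pipeline: IF=0, ID=1, EX=2, MEM=3, WB=4.
EX, MEM = 2, 3

def stalls_needed(producer_is_load: bool, distance: int) -> int:
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    Assumes full forwarding: a value produced at the end of a stage in one
    cycle can feed the start of an EX stage in the following cycle.
    """
    produced_in = MEM if producer_is_load else EX
    # With the producer fetched in cycle 0, its result exists at the end of
    # cycle `produced_in`, so the earliest consumer EX is one cycle later.
    earliest_ex = produced_in + 1
    # Without stalls, the consumer would start EX in cycle distance + EX.
    return max(0, earliest_ex - (distance + EX))

print(stalls_needed(True, 1))   # load followed immediately by its user
print(stalls_needed(True, 2))   # one independent instruction in between
print(stalls_needed(False, 1))  # ALU result forwarded to the next instruction
```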

7 Summary: Types of Data Hazards
(Instruction i precedes instruction j in program order.)
- RAW: j tries to read before i writes (most common)
- WAW: j tries to write before i writes
  - Would leave the value of i rather than j
  - Not a problem w/ our simple 5-stage MIPS because there’s only one place to write (the WB stage)
- WAR: j tries to write before i reads
  - Not common because reads occur early
- RAR: not a hazard
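The three hazard types correspond to three kinds of register overlap between an earlier instruction i and a later instruction j. The sketch below only flags the potential hazard; whether it causes a problem depends on the pipeline. The function name and argument layout are assumptions for illustration.

```python
def classify(i_dest, i_srcs, j_dest, j_srcs):
    """Return the set of potential hazards between earlier i and later j."""
    kinds = set()
    if i_dest is not None and i_dest in j_srcs:
        kinds.add("RAW")   # j reads what i writes
    if i_dest is not None and i_dest == j_dest:
        kinds.add("WAW")   # both write the same register
    if j_dest is not None and j_dest in i_srcs:
        kinds.add("WAR")   # j overwrites what i still has to read
    return kinds

# LD R1,0(R2) followed by DSUB R4,R1,R5
print(classify("R1", ("R2",), "R4", ("R1", "R5")))
```

Two instructions that only read the same register (RAR) yield the empty set.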

8 On to Control Hazards
- Pipeline hazards caused by branches
- With multiple instructions in flight, what happens if you branch?

9 Solution 1: Stall
- This is a fairly simple strategy
- Needs control to disable instructions in the pipeline
- Simple implementation – stall even if not taken
- 10% - 30% penalty

    Cycle        1    2    3    4    5    6    7    8
    Branch       IF   ID   EX   MEM  WB
    Branch + 1        IF   IF   ID   EX   MEM  WB
    Branch + 2                  IF   ID   EX   MEM  WB
    Branch + 3                       IF   ID   EX   MEM
    Branch + 4                            IF   ID   EX
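One way to read the 10% - 30% figure is as a CPI increase from a one-cycle stall on every branch, with branches making up 10% to 30% of instructions. That reading is an assumption, as are the function and parameter names in this small sketch.

```python
def cpi_with_branch_stalls(branch_fraction, penalty_cycles=1, base_cpi=1.0):
    """CPI when every branch stalls the pipeline for `penalty_cycles`."""
    return base_cpi + branch_fraction * penalty_cycles

for f in (0.10, 0.20, 0.30):
    cpi = cpi_with_branch_stalls(f)
    print(f"branch fraction {f:.0%}: CPI = {cpi:.2f}")
```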

10 Solution 2: Predict Not Taken
- If wrong (taken), make sure that non-branch instructions change no state
- Predict-taken is no help for our pipeline
  - We know the destination and the outcome at the same time

    Cycle        1    2    3    4    5    6    7    8
    Branch       IF   ID   EX   MEM  WB
    i + 1             IF   idle idle idle idle
    Target                 IF   ID   EX   MEM  WB
    Target + 1                  IF   ID   EX   MEM  WB
    Target + 2                       IF   ID   EX   MEM
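With predict-not-taken, only taken branches pay the penalty. A minimal sketch of the resulting CPI, assuming a one-cycle penalty as in the diagram; the names and the example fractions are illustrative, not measurements.

```python
def cpi_predict_not_taken(branch_fraction, taken_fraction,
                          penalty_cycles=1, base_cpi=1.0):
    """CPI when only taken branches cost `penalty_cycles`."""
    return base_cpi + branch_fraction * taken_fraction * penalty_cycles

# Hypothetical workload: 20% branches, half of them taken.
print(cpi_predict_not_taken(0.20, 0.50))
# If every branch is taken, the cost equals stalling on every branch.
print(cpi_predict_not_taken(0.20, 1.00))
```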

11 Solution 3: Delayed Branch
- As in MIPS
- Sequence is
  - Branch instruction
  - Sequential successor
  - Branch target or next in line
- Works well with our 5-stage pipe (next)
  - No stalls

12 Branch Delay Pipeline

13 Compiler Impl. – Option (a)
- Easiest, just move previous instruction to delay slot

14 Compiler Impl. – Option (b)
- Can’t move the DADD
  - Dependency
  - Note branch condition
- So put target in slot
  - Will that be OK?

15 Dependencies
- Compiler computes dependencies
- If the register written by the target instruction will not be used when the branch is not taken, then OK to write it
  - Here’s where the condition style of MIPS helps – no condition codes to worry about
- If the register will be used, then put a nop in the delay slot
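The rule above amounts to a liveness check on the not-taken path. The sketch below is a simplified illustration, not a compiler pass: `choose_delay_slot`, its arguments, and the return strings are hypothetical, and the set of registers still needed on the fall-through path is assumed to be computed elsewhere.

```python
def choose_delay_slot(target_dest, live_on_fall_through):
    """Decide what to place in the delay slot for option (b).

    target_dest: register written by the first instruction at the branch target.
    live_on_fall_through: registers read on the not-taken path before being
    rewritten there (assumed to come from the compiler's dependence analysis).
    """
    if target_dest in live_on_fall_through:
        # Executing the target instruction would clobber a needed value.
        return "NOP"
    return "TARGET_INSTRUCTION"

print(choose_delay_slot("R4", {"R1", "R2"}))  # R4 not needed: copy the target instruction
print(choose_delay_slot("R1", {"R1", "R2"}))  # R1 still needed: fall back to a nop
```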

16 Branch Delay – Option (c)
- To use this option, must be OK to execute the OR
- In other words, result of R7 from before not needed by code after branch

17 Summary: Control Hazards
- So far have looked at simple pipe
- Delay slot
- Compiler options

18 Exceptions
- Problem when multiple instructions are in flight is dealing w/ exceptions
- Need to (perhaps) stop execution of some instructions
- Avoid changing state
- Possibly restart instructions after dealing with exception

19 Examples of Exceptions
- I/O interrupt
- System call
- Tracing (breakpoint, single step)
- Arithmetic problem, integer & float
- Page fault
- Memory protection error
- Illegal instruction
  - Non-existent or protected
- Power failure
- Hardware fault (machine check)

20 Types of Exceptions
- Synchronous or asynchronous
  - Caused by external action?
- Some exceptions happen between instructions, others within
- Resume vs terminate
  - Will we need to restart instructions, or just stop?
  - More on next slide

21 Restarting Instructions
- Example given in H&P is a page fault due to a load or store
  - Occurs at MEM stage
  - We have instructions in pipe after faulting instruction that must be restarted after page fault
- Possible sequence after a fault
  - Insert trap at IF
  - Disallow writes for all instructions in pipe
  - Save PC
- Harder than it seems
- What if we are in middle of a delayed branch?

22 Precise Exceptions
- A machine that supports precise exceptions will
  - Not allow faulting instruction to write
  - Restart it (perhaps) and subsequent instructions as if exception had not happened
- Sometimes too expensive to guarantee this, so some machines also support imprecise exceptions
  - Enable processing at higher performance

23 Faults on Multiple Instructions
- Example in H&P
    LD   …
    DADD …
- Scenarios
  - Data fault on LD and arithmetic fault on DADD
    - Will happen at same time (MEM, EX)
  - Data fault on LD, instruction fault on DADD
    - So fault on 2nd instruction will happen 1st!
- To handle this, will need to store all faults in a vector and deal with them in order, meanwhile making sure that inappropriate state changes aren’t allowed
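The difference between the order in which faults are detected and the order in which they must be handled can be sketched as follows. This is an illustration under assumptions (no stalls, instruction k fetched in cycle k + 1); the tuple layout and function name are made up for the example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def detection_cycle(instr_index, stage):
    """Cycle in which instruction `instr_index` occupies `stage` (no stalls assumed)."""
    return instr_index + 1 + STAGES.index(stage)

# (program-order index, stage where the fault is raised, cause)
faults = [
    (0, "MEM", "data page fault on LD"),
    (1, "IF",  "instruction page fault on DADD"),
]

# Hardware sees the faults in detection order ...
detected = sorted(faults, key=lambda f: detection_cycle(f[0], f[1]))
# ... but the recorded faults must be handled in program order.
handled = sorted(faults, key=lambda f: f[0])

print([f[2] for f in detected])
print([f[2] for f in handled])
```

In the first scenario on the slide, the two faults coincide: the LD is in MEM in the same cycle that the DADD is in EX.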

24 Complications
- Won’t go into this, but
- Problems with ISAs more complicated than MIPS
  - State change in middle of pipeline rather than at end
  - Instructions that take variable number of cycles
- Will look at pipe for this, but not exceptions

25 Multi Cycle Operations
- Refers to multiple cycles in a stage
- Why?
  - Floating point operations (and also integer divide) can be made more efficient by dividing into multiple cycles
  - Otherwise, the clock rate will suffer
- What are the implications for the pipeline?

26 Multiple Functional Units
- In this simple version, one instruction enters the EX stage at a time
- Simple ones finish in 1 cycle
- Complicated ones take multiple cycles

27 Pipelined vs. Not Pipelined Units
- All types of inst. except DIV can be issued once per clock
- Potential problems with ordering, hazards
(Figure label: Not pipelined)

28 RAW Hazard (2 of them here)
- MUL (double precision) must wait for LD
- ADD stalls to wait for F0
- Store waits because of structural hazard

29 Potential Structural Hazard
- Three instructions end up in MEM stage at same time
- Note that there’s no deep structural hazard
  - Only the load uses the memory
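A collision like this can be reproduced by computing when each instruction reaches MEM. The sketch below assumes execute depths of 1 cycle for loads, 4 for FP add and 7 for FP multiply, and uses hypothetical fetch cycles chosen so that three instructions collide; none of these numbers are taken from the slide itself.

```python
from collections import defaultdict

# Assumed number of execute cycles per operation.
EX_STAGES = {"L.D": 1, "ADD.D": 4, "MUL.D": 7}

def mem_cycle(if_cycle, op):
    """Cycle in which `op`, fetched in `if_cycle`, reaches MEM (no stalls)."""
    # IF and ID take one cycle each, then the execute cycles, then MEM.
    return if_cycle + 2 + EX_STAGES[op]

def mem_collisions(issued):
    """Map each cycle with more than one instruction in MEM to those instructions."""
    by_cycle = defaultdict(list)
    for if_cycle, op in issued:
        by_cycle[mem_cycle(if_cycle, op)].append(op)
    return {cycle: ops for cycle, ops in by_cycle.items() if len(ops) > 1}

# Hypothetical fetch cycles for three of the instructions in flight.
issued = [(1, "MUL.D"), (4, "ADD.D"), (7, "L.D")]
print(mem_collisions(issued))
```

Since only the load actually accesses memory, the real contention is for the single register write port in the following cycle.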

30 Could also have WAW Hazard
- Imagine L.D one cycle earlier
  - Then 2nd write of F2 would be 1st
  - So we’d have wrong value in F2
  - Note: If F2 was used after the ADD.D and before L.D, then this would be caught by RAW hazard circuit
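The WAW condition can be checked by comparing write-back cycles: the hazard exists when a later instruction writes the same register earlier than the instruction before it. This is a sketch assuming 4 execute cycles for ADD.D and 1 for L.D; the fetch cycles in the example are hypothetical.

```python
# Assumed number of execute cycles per operation.
EX_STAGES = {"L.D": 1, "ADD.D": 4}

def wb_cycle(if_cycle, op):
    """Cycle in which `op`, fetched in `if_cycle`, writes its result (no stalls)."""
    # IF, ID, the execute cycles, MEM, then WB.
    return if_cycle + 3 + EX_STAGES[op]

def waw_hazard(first, second):
    """first, second: (if_cycle, op, dest) tuples in program order.

    True when the later instruction would write the shared register before
    the earlier one. Equal write-back cycles are a write-port conflict and
    are not reported here.
    """
    (c1, op1, d1), (c2, op2, d2) = first, second
    return d1 == d2 and wb_cycle(c2, op2) < wb_cycle(c1, op1)

add = (1, "ADD.D", "F2")
print(waw_hazard(add, (3, "L.D", "F2")))  # L.D writes F2 first: wrong final value
print(waw_hazard(add, (5, "L.D", "F2")))  # L.D writes F2 last: correct order
```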

31 Summary
- Things start getting complicated when instructions complete in different numbers of cycles
- We’ll be looking more at this

32 Stalls
- Divide structural hazards are shown separately (and are rare)
- Stalls from RAW hazards roughly proportional to latency
  - 0 for integer
  - 3 for FP add/sub
  - 6 for multiply
  - 24 for divide
  - About 50% of the latency (not always)

33 Next Time
- Look at an implementation
- On to Scoreboarding
  - Out of order execution
- Then move on to more complex ILP
- Read App. A, Sec. 4-7

