Presentation is loading. Please wait.

Presentation is loading. Please wait.

ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan.

Similar presentations


Presentation on theme: "ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan."— Presentation transcript:

1 ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan Zhou University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu Presented at CARG, March 8 th, 2006 By Jason Zebchuk

2 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 2 Data Value Speculation  Predict the value and proceed speculatively  When correct value is known, if mispediction, squash and re-execute  Initial proposals L1 data value Store/Load Data dependences  Aggressive novel proposals  Retire speculative instructions Values of L2 misses Thread independence in Thread-Level Speculation (TLS) Speculative synchronization

3 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 3 Long-latency Speculation  Misprediction recovery is very wasteful  Most discarded instructions are still useful Checkpoint Prediction Resolution 210 Instructions 6.6 Instructions Speculatively retired instructions

4 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 4 Contributions (I)  ReSlice: Architecture to buffer forward slice and re- execute it on misprediction Checkpoint Prediction Resolution Buffered Forward Slice If succeeds If fails

5 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 5 Contributions (II)  A sufficient condition for correct slice re-execution and merge Guarantee to correctly repair the program state  Application to Thread-Level Speculation (TLS) on SpecInt Prediction: task live-ins Removed 61% task squashes Speedup: up to 33% over TLS (avg 12%) E×D 2 reduction: avg 20%

6 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 6 Outline  Motivation and Contributions  Idea of ReSlice  Architecture Design  Experimental Results  Conclusions

7 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 7 Idea of ReSlice  Initial execution of the task Predict value of “risky” load and continue e.g L2 missing loads, exposed loads in TLS … Buffer in HW the forward slice of the load  If a misprediction happens Re-execute the slice with the correct value If succeed: merge the register and memory state and continue If fail: roll back and re-execute (conventional recovery)

8 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 8 Why is this Challenging? … #1: LD R1 mem_A … #2: ST R2 (R1) … #3: LD R5 mem[0x20] … “Risky” Load R1  0x10 ST R2 mem[0x10] R1  0x20 ST R2 mem[0x20] LD R5 mem[0x20] Problem: Instruction #3 is not buffered! Buffered SliceCorrect Execution Initial Execution Initial execution is speculative  Buffered slices may not be correct

9 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 9 Solution: Run-time Address Check … #1: LD R1 mem_A … #2: ST R2 (R1) … #3: LD R5 mem[0x20] … “Inhibiting” Store R1  0x10 ST R2 mem[0x10] R1  0x20 ST R2 mem[0x20] Buffered SliceSlice Re-ExecutionInitial Execution Different store addresses; And mem[0x20] was loaded in initial execution “Risky” Load

10 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 10 Problems Buffered Slice Missing Insts Extra Insts Data Corruption Slice Re-execution Initial Speculative Execution (Example just shown) “Dangling” Load“Inhibiting” Load

11 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 11 A Sufficient Condition  Guarantee correct Slice re-execution State merge with program  Two sub-conditions Branch takes same direction in both initial execution and slice re-execution No presence of any previous three problems  Easy for HW to check at run-time  Details please see our paper (Briefly: compare addresses accessed in old and new run, also look for references to same address in first run)

12 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 12 How does ReSlice Work? Timeline Checkpoint Prediction Resolution Slice Collection Compare the correct and predicted value Correct prediction Slice Re-Execution Misprediction Incorrect State Merging Correct

13 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 13 Architecture Design CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Descriptor Slice Descriptors Slice Info Slice Buffer Cache Register State Memory State

14 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 14 Other Architectural Requirements  Tag physical registers with “SliceTag”  Tag Cache for storing addresses accessed by slice instructions  Undo Log records first store to each memory location in each Slice  High-coverage Predictor to predict which instructions to use as seeds. Uses Dependence and Value Predictor used by TLS  Uses TLS information to identify Speculatively Read/Written locations in cache

15 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 15 Step 1: Slice Collection Fill up the Slice Buffer when a prediction is made Track both register and memory dependences Save live-in operands and slice instructions Slices are buffered when instructions are retired CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Info Slice Buffer Cache Register State Memory State

16 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 16 An Example Slice Desc. Live InsBuffered Insts ST R3, mem[R4] R4 = 0x10 ADD R3, R1, R2 R4 = 0x10 ST R3, mem[R4]

17 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 17 Step 2: Slice Re-Execution REU takes over after a misprediction In-order execution Sufficient condition is checked during slice re-execution If success, merge the register and memory state; otherwise, squash the task and restart CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Info Slice Buffer Cache Register State Memory State

18 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 18 Step 3: State Merging Copy live registers back to CPU register file Merge memory state (details in paper) CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Info Slice Buffer Cache Register State Memory State

19 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 19 Multiple Overlapping Slices  One slice may change live-ins of the other slice  Detect overlapping slices during slice collection  Combine the instructions of the overlapping slices and re- execute them in program order misprediction

20 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 20 Outline  Motivation and Contributions  Idea of ReSlice  Architecture Design  Experimental Results  Conclusions

21 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 21 Methodology  Simulated CMP architecture 5 GHz Private 16k L1 per core; 1 MB Shared L2  SpecInt 2000 applications TLS+ReSlice Four 3-issue cores with TLS and ReSlice Baseline: TLS Four 3-issue cores with TLS Serial One 3-issue core

22 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 22 Slice Characterization (Averages) 10.4 instructions 4.5 Reg live-ins 1.0 Mem live-ins 2.2 Reg live-outs 1.9 Mem live-outs 1.6 slices per task 15% tasks with overlapping slices

23 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 23 Number of Task Squashes 61% squashes are avoided because of ReSlice

24 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 24 Performance  Speedup TLS+ReSlice Over TLS: up to 33% (avg 12%) Over Serial: avg 45%

25 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 25 Energy × Delay 2 E×D 2 reduction: 20% over TLS

26 ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 26 Conclusions  Generic architecture for forward slice re-execution  Sufficient condition for correct re-execution and merge  Improve TLS on SpecInt Speedups: up to 33% over TLS (avg 12%) E×D 2 reduction: avg 20% over TLS  Recovering wasted work is a promising approach Boost performance Energy efficiency

27 ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan Zhou University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu


Download ppt "ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan."

Similar presentations


Ads by Google