1 (C) 2004 Daniel Sorin, Duke Architecture
Using Speculation to Simplify Multiprocessor Design
Daniel J. Sorin (1), Milo M. K. Martin (2), Mark D. Hill (3), David A. Wood (3)
(1) Dept. of Electrical & Computer Engineering, Duke University
(2) Dept. of Computer & Information Science, Univ. of Pennsylvania
(3) Computer Sciences Dept., University of Wisconsin-Madison

2 IPDPS 2004 – Daniel Sorin slide 2
My Talk in One Slide
Shared memory multiprocessors are complicated
– Difficult to design for every possible corner case
Proposal: Use speculation to target the common case
– Speculate that corner cases won’t happen
– Detect if they do occur and recover the system
– Ensure forward progress
Case studies
– Simplify cache coherence protocols
– Simplify the interconnection network

3 Speculation for Simplicity
Why we want to avoid complexity
– Time and money for design and verification
Design for the common case
– But we have to make ALL cases work correctly
Examples of this philosophy in uniprocessors
– Trapping to software for infrequent/obsolescent instructions
– Pentium 4 recovers from edge-case scheduler deadlocks
But this idea hadn’t been used for multiprocessors
– Key: we now have efficient multiprocessor recovery

4 Framework for Speculation
Four keys to design simplification with speculation
1) Ensure that mis-speculations are rare
2) Detect all mis-speculations
3) Recover from mis-speculations
4) Ensure forward progress even in the worst case
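The four keys above can be sketched as a small software loop. This is only an illustrative analogy, not the hardware design from the talk: `run_with_speculation`, `fast_step`, `safe_step`, and `slow_start_len` are hypothetical names, the "checkpoint" is just a copy of a Python list, and `safe_step` is assumed never to mis-speculate.

```python
class MisSpeculation(Exception):
    """Raised when a corner case the simple design ignores actually occurs."""


def run_with_speculation(items, fast_step, safe_step, slow_start_len=2):
    """Process items with a simple common-case step, recovering on mis-speculation.

    fast_step: simple design; raises MisSpeculation on a corner case (key 2)
    safe_step: conservative mode assumed never to mis-speculate (key 4)
    """
    results = []        # committed state
    recoveries = 0
    slow_left = 0       # remaining items to run in conservative mode
    i = 0
    while i < len(items):
        checkpoint = list(results)           # take a checkpoint (key 3)
        try:
            step = safe_step if slow_left > 0 else fast_step
            results.append(step(items[i]))
            if slow_left > 0:
                slow_left -= 1
            i += 1
        except MisSpeculation:
            results = checkpoint             # roll back to the checkpoint
            recoveries += 1
            slow_left = slow_start_len       # retry conservatively for a while
    return results, recoveries


def fast_double(x):
    # Hypothetical corner case: the simple path cannot handle multiples of 5.
    if x % 5 == 0:
        raise MisSpeculation(x)
    return 2 * x


def safe_double(x):
    return 2 * x


results, recoveries = run_with_speculation([1, 2, 5, 5, 3], fast_double, safe_double)
```

Because the failing item is retried in the conservative mode, the loop always makes progress as long as `slow_start_len` is at least 1; key 1 (rarity) is what keeps the cost of the rollbacks low.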

5 SafetyNet Checkpoint/Recovery
We use SafetyNet [ISCA 2002] for system recovery
All-hardware checkpoint/recovery for shared memory multiprocessors
Periodically takes logical checkpoints of the system
– Including caches, coherence state, memory, directory state
– Implements checkpointing with incremental logging
– Consistent checkpoints using logical time coordination
Can recover 100,000+ cycles
Negligible performance impact
– Incremental logging performed off the critical path
Small log buffers (512 KB) at caches & memories
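As a rough software analogy for checkpointing with incremental logging, the sketch below saves the old value of a location only on its first write in each checkpoint interval, then undoes the logs newest-first to recover. It is not SafetyNet's hardware mechanism (there is no logical time coordination or bounded log buffer here), and `LoggedMemory` and its methods are hypothetical names.

```python
_ABSENT = object()  # marks "address had no value before this interval"


class LoggedMemory:
    """Toy memory with one undo log per checkpoint interval."""

    def __init__(self):
        self.data = {}
        self.logs = [{}]  # logs[k] holds old values overwritten during interval k

    def take_checkpoint(self):
        """Start a new interval and return the id of the checkpoint just taken."""
        self.logs.append({})
        return len(self.logs) - 1

    def write(self, addr, value):
        log = self.logs[-1]
        if addr not in log:                      # log only the first write per interval
            log[addr] = self.data.get(addr, _ABSENT)
        self.data[addr] = value

    def recover(self, checkpoint_id):
        """Restore the state as it was when checkpoint_id was taken."""
        while len(self.logs) > checkpoint_id:
            log = self.logs.pop()                # undo newest interval first
            for addr, old in log.items():
                if old is _ABSENT:
                    self.data.pop(addr, None)
                else:
                    self.data[addr] = old
        self.logs.append({})                     # fresh interval after recovery


mem = LoggedMemory()
mem.write("A", 1)
c1 = mem.take_checkpoint()
mem.write("A", 2)
mem.write("B", 7)
c2 = mem.take_checkpoint()
mem.write("A", 3)
mem.recover(c1)   # rolls back both intervals after c1
```

The work per write is one dictionary lookup, which loosely mirrors why logging can stay off the critical path; the log size grows with the number of distinct locations written per interval, not with the total number of writes.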

6 The Need for Multiprocessor Recovery
Assumption: multiprocessors will have system-wide recovery mechanisms for purposes of availability
– As fault rates keep increasing, recovery is crucial
Will be all-hardware (like SafetyNet) for performance
– But many alternative designs are possible
We leverage this recovery mechanism for recovering from mis-speculations

7 Outline
A Framework for Speculation
Simplifying Cache Coherence Protocols
Simplifying the Interconnection Network
Evaluation
Conclusions

8 Directory Protocol Complexity
We want adaptive routing in the interconnection network
– Better performance and availability
– But adaptive routing precludes point-to-point ordering
So what?
– Point-to-point ordering simplifies protocol design
– Eliminates several potential corner-case races

9 Race Case in Directory Protocol
Example race if no point-to-point ordering in network
[Diagram: P1, Dir, P2. Messages: RequestReadWrite, Writeback, Forwarded RequestReadWrite]
RequestReadWrite arrives first at Dir, gets forwarded to P1

10 Race Case in Directory Protocol
[Diagram: P1, Dir, P2. Messages: RequestReadWrite, Forwarded RequestReadWrite, Writeback, Writeback Ack]
Forwarded RequestReadWrite arrives after Writeback Ack

11 Race Case in Directory Protocol
Problem: P1 sees Forwarded RequestReadWrite in state Invalid
[Diagram: P1, Dir, P2. Messages: RequestReadWrite, Forwarded RequestReadWrite, Writeback, Writeback Ack]
Not possible with point-to-point ordering in the interconnection network

12 Simplifying a Directory Protocol
Speculate that adaptive network provides ordering
1) Why is mis-speculation rare?
– Not many re-orderings
– Most re-orderings don’t matter!
2) How do we detect all mis-speculations?
– If we get a Forwarded RequestReadWrite in state Invalid
3) How do we recover?
– SafetyNet
4) How do we ensure forward progress?
– Slow-start operation for a while after recovery
– Guarantees that this race can’t keep recurring
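The detection rule in step 2 can be illustrated with a toy model of the block at P1. This is a simplified sketch, not the protocol from the talk: the state names, the `CacheBlock` class, and the handling of every message other than the detected race are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    WB_PENDING = "writeback pending"   # hypothetical transient state
    INVALID = "I"


class MisSpeculation(Exception):
    """The network reordered messages in a way the simple protocol ignores."""


class CacheBlock:
    """Toy view of one block at P1, the owner that is writing it back."""

    def __init__(self):
        self.state = State.MODIFIED
        self.sent = []

    def evict(self):
        self.state = State.WB_PENDING
        self.sent.append("Writeback")

    def receive(self, msg):
        if msg == "ForwardedRequestReadWrite":
            if self.state is State.INVALID:
                # Detection rule from the slide: forwarded request in Invalid.
                raise MisSpeculation("Forwarded RequestReadWrite in state Invalid")
            self.sent.append("Data")           # still holds the data, so supply it
            self.state = State.INVALID
        elif msg == "WritebackAck":
            if self.state is State.WB_PENDING:
                self.state = State.INVALID
            # Assumed: an ack arriving in Invalid is simply dropped.
        else:
            raise ValueError(f"unexpected message {msg!r}")


def misspeculates(arrival_order):
    """Return True if delivering messages to P1 in this order triggers detection."""
    block = CacheBlock()
    block.evict()
    try:
        for msg in arrival_order:
            block.receive(msg)
    except MisSpeculation:
        return True
    return False


in_order = misspeculates(["ForwardedRequestReadWrite", "WritebackAck"])
reordered = misspeculates(["WritebackAck", "ForwardedRequestReadWrite"])
```

Here `in_order` is `False` and `reordered` is `True`: only the ordering in which the Writeback Ack overtakes the forwarded request reaches the Invalid-state check, which is the point at which a real system would invoke recovery rather than handle the race in the protocol.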

13 Simplifying a Snooping Coherence Protocol
During design, we missed a corner case
[State diagram labels: State M, State trans1, State trans2; events Writeback and RequestReadWrite; ??? marks the unhandled case]
Solution: it’s rare, treat it as mis-speculation
Detect by seeing RequestReadWrite in state trans2
Recovery with SafetyNet
Forward progress with slow-start after recovery

14 Outline
A Framework for Speculation
Simplifying Cache Coherence Protocols
Simplifying the Interconnection Network
– Deadlock
– Avoiding deadlock
Evaluation
Conclusions

15 Two Causes of Deadlock
Endpoint Deadlock
[Figure: P1, P2; labels: Response, full of requests, Response]
Switch Deadlock
[Figure: switch1, switch2; labels: Message M1, full of messages, Message M2]

16 Avoiding Deadlock
Simple but wasteful solution: full buffering
– But it’s rare that we ever need full buffering
More efficient solution: virtual channels (networks)
For endpoint deadlock
– Need a virtual network per type of message
For switch deadlock
– Need some number of virtual channels per virtual network
– Depends on network topology and routing scheme
A major source of design complexity

17 Simplifying Deadlock Avoidance
Speculate that deadlock won’t occur, despite using less than full buffering and no virtual channels
1) Why is mis-speculation rare?
– Can usually avoid deadlock with reasonable buffering
2) How do we detect all mis-speculations?
– Timeout mechanism for cache coherence transactions
3) How do we recover?
– SafetyNet
4) How do we ensure forward progress?
– Slow-start operation for a while after recovery
– Guarantees that deadlock can’t keep recurring
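A timeout-based detector of the kind named in step 2 might look like the sketch below. The talk does not give the mechanism's details, so the `TransactionWatchdog` class, the cycle-count interface, and the threshold value are all assumptions for illustration.

```python
class TransactionWatchdog:
    """Flags coherence transactions that stay outstanding longer than a timeout."""

    def __init__(self, timeout_cycles):
        self.timeout_cycles = timeout_cycles
        self.outstanding = {}            # transaction id -> cycle it was issued

    def issue(self, txn_id, now):
        self.outstanding[txn_id] = now

    def complete(self, txn_id):
        self.outstanding.pop(txn_id, None)

    def timed_out(self, now):
        """Return ids of transactions outstanding for more than timeout_cycles."""
        return sorted(
            txn_id
            for txn_id, issued in self.outstanding.items()
            if now - issued > self.timeout_cycles
        )


watchdog = TransactionWatchdog(timeout_cycles=100)
watchdog.issue("a", now=0)
watchdog.issue("b", now=50)
watchdog.complete("a")
early = watchdog.timed_out(now=120)   # "b" has waited 70 cycles
late = watchdog.timed_out(now=151)    # "b" has waited 101 cycles
```

A non-empty result would be treated as a suspected deadlock and trigger recovery. A timeout can also fire for a transaction that is merely slow, so the threshold trades detection latency against spurious recoveries; it presumably has to stay within the span that the recovery mechanism can roll back.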

18 Outline
A Framework for Speculation
Simplifying Cache Coherence Protocols
Simplifying the Interconnection Network
Evaluation
– Goals
– Methodology
– Results
Conclusions

19 Goals
Discover the point at which mis-speculation recoveries impact performance
– Determines whether our simplified snooping protocol and our simplified interconnection network are viable
Determine whether our simplified directory protocol can usefully speculate on point-to-point ordering

20 Methodology
Full-system simulation
– Simics provides full-system functionality
– We added a detailed timing model for the memory system
Workloads
– Online transaction processing (OLTP) with DB2
– SPECjbb2000 Java middleware
– Apache static web serving
– Slashcode dynamic web serving
– Barnes-Hut scientific simulation

21 How Rare Must Mis-speculation Be?
We can tolerate high mis-speculation rates; these rates are much higher than what our simplified designs incur

22 Adaptive Routing with Speculative Ordering
Adaptive routing can provide better performance by routing around congestion, even with mis-speculations

23 Conclusions
Simplify multiprocessor design with speculation
– Treat corner cases as mis-speculations & recover from them
Must be able to ensure that
– Mis-speculations are sufficiently rare
– Can detect all mis-speculations
– Can recover from mis-speculations
– Can provide forward progress in all cases
Showed how to simplify
– Cache coherence protocols
– Interconnection network deadlock avoidance
Applicable to other complicated designs

