Using Prediction to Accelerate Coherence Protocols Shubu Mukherjee, Ph.D. Principal Hardware Engineer VSSAD Labs, Alpha Development Group Compaq Computer.


1 Using Prediction to Accelerate Coherence Protocols. Shubu Mukherjee, Ph.D., Principal Hardware Engineer, VSSAD Labs, Alpha Development Group, Compaq Computer Corporation, Shrewsbury, Massachusetts. Joint work with Mark D. Hill at the University of Wisconsin-Madison. Published in the Proceedings of the 25th Annual International Symposium on Computer Architecture (ISCA), 1998.

2 Distributed Shared-Memory Machine. [Figure: two nodes, each with CPU, Cache, Directory Hardware, and Main Memory, connected by a Network.] Memory is physically distributed for scalability. Per-CPU caches cache remote memory. Cache coherence via directory protocols.

3 Reduce Directory Protocol Latency Using Prediction. [Figure: two message diagrams among Producer Cache, Directory, and Consumer Cache. Coherence Protocol Action: get_rw_request, inval_ro_request, inval_ro_response, get_rw_response. Speculative Action: get_rw_request, get_rw_response, inval_ro_response.] Dynamic Self-Invalidation (Lebeck & Wood, ISCA ’95).

4 Directed Predictors. Many examples: read-modify-write in SGI Origin (Laudon & Lenoski, ISCA ’97); Scalable Coherent Interface (SCI)’s pairwise sharing; protocols optimized for migratory sharing (Cox/Fowler, Stenstrom, et al., ISCA ’93); Dynamic Self-Invalidation (Lebeck & Wood, ISCA ’95); Competitive Update (Karlin, et al., Algorithmica ’88); half-migratory optimization; compiler-directed prediction. Can we have a general predictor? => Cosmos: + easier to compose multiple predictors; + discover & adapt to application-specific patterns; - more hardware.

5 Cosmos: A General Predictor. Cosmos predictors for both cache (CP) and directory (DP). Predictor issues: what message to predict? (this talk); how to integrate with a real system? (NOT in this talk). [Figure: two nodes connected by a Network, each with CPU, Cache, Directory Hardware, and Main Memory, plus the predictors CP and DP.]

6 Cosmos Overview. Given: cache block address and history of incoming coherence messages for the cache block (i.e., source processor and message type tuples). Cosmos predicts: next incoming coherence message for the cache block. Cosmos’ structure: two-level adaptive predictor; resembles Yeh & Patt’s PAp branch predictor (ISCA ’92). Cosmos’ prediction accuracy: 62 - 93% for five parallel scientific applications.

7 Outline Motivation & Overview Cosmos’ Structure Cosmos Results

8 Producer-Consumer Sharing Pattern. Cache blocks have predictable message signatures. [Figure: Producer Cache, Consumer Cache, and DIRECTORY. Directory signature: get_rw_request from producer, inval_ro_response from consumer, inval_rw_response from producer, get_ro_request from consumer. Labels at Producer Cache: get_rw_response, inval_rw_request. Labels at Consumer Cache: get_ro_response, inval_ro_request.]

9 Cosmos’ Basic Structure Parameterized by “depth” of MHT and “filters” for PHT (Reminiscent of Yeh and Patt’s PAp branch predictor) Message History Table (MHT) Pattern History Tables (PHT) Global Address of Cache Block
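The two-level structure on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the hardware design from the paper: the class name `CosmosSketch`, its methods, the choice of one pattern table per block (by analogy with PAp), and the "remember the last message seen after this history" update rule are all hypothetical. Filters are left out.

```python
from collections import defaultdict, deque


class CosmosSketch:
    """Illustrative two-level predictor (hypothetical names, no filters)."""

    def __init__(self, depth=1):
        self.depth = depth
        # First level (MHT): block address -> last `depth` incoming
        # (sender, message_type) tuples for that block.
        self.mht = defaultdict(lambda: deque(maxlen=depth))
        # Second level (PHT): block address -> {history -> predicted next tuple}.
        # One table per block is an assumption modeled on PAp.
        self.pht = defaultdict(dict)

    def predict(self, block):
        """Return the predicted next incoming message, or None if unknown."""
        history = tuple(self.mht[block])
        if len(history) < self.depth:
            return None  # not enough history recorded yet
        return self.pht[block].get(history)

    def update(self, block, message):
        """Record the message that actually arrived for this block."""
        history = tuple(self.mht[block])
        if len(history) == self.depth:
            # Assumed policy: always overwrite with the latest outcome.
            self.pht[block][history] = message
        self.mht[block].append(message)


predictor = CosmosSketch(depth=1)
predictor.update(0x100, ("P", "get_rw_request"))
predictor.update(0x100, ("C", "inval_ro_response"))
predictor.update(0x100, ("P", "get_rw_request"))
print(predictor.predict(0x100))
```

A larger `depth` lets the predictor tell apart histories that share their most recent message, at the cost of more pattern-table entries per block.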

10 Cosmos’ Entries for Producer-Consumer Signature. [Figure: Producer Cache (P), Consumer Cache (C), and DIRECTORY. Directory signature: get_rw_request from producer, inval_ro_response from consumer, inval_rw_response from producer, get_ro_request from consumer. Labels at Producer Cache (P): get_rw_response, inval_rw_request. Labels at Consumer Cache (C): get_ro_response, inval_ro_request. Table labels: MHT, Index, Prediction, PHT, Global Address of Cache Block.] Cosmos at the directory.
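To make the directory-side entries concrete, the sketch below replays a producer-consumer message stream for a single block and shows which depth-one (history, prediction) pairs would be learned. Several things here are assumptions rather than content from the slides: the helper `replay`, the temporal ordering of the four incoming messages (my reading of the protocol: the producer's write request, the consumer's invalidation response, the consumer's read request, the producer's invalidation response), and the choice to count a prediction only once an entry exists for the current history.

```python
def replay(trace, depth=1):
    """Replay incoming messages for one block; return (hits, attempts, pht).

    Assumes depth >= 1. `pht` maps a history tuple to the predicted next message.
    """
    history, pht = (), {}
    hits = attempts = 0
    for message in trace:
        if len(history) == depth:
            if history in pht:
                attempts += 1
                hits += pht[history] == message
            pht[history] = message  # learn what followed this history
        history = (history + (message,))[-depth:]
    return hits, attempts, pht


P, C = "P", "C"  # producer and consumer caches
# Assumed order of messages arriving at the directory in one sharing cycle.
cycle = [
    (P, "get_rw_request"),
    (C, "inval_ro_response"),
    (C, "get_ro_request"),
    (P, "inval_rw_response"),
]

hits, attempts, pht = replay(cycle * 3)
for history, prediction in pht.items():
    print(history, "->", prediction)
print(hits, "of", attempts, "predictions correct")
```

With a perfectly repeating signature like this one, every prediction made after the first full cycle is correct. Real traces interleave other sharers and other patterns, which is where deeper histories and filters matter.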

11 Outline Motivation & Overview Cosmos’ Structure Cosmos Results

12 Evaluation Methodology. Traces of coherence messages. Simulator: Wisconsin Wind Tunnel II (Mukherjee, et al., PAID ’97). Simulated coherence protocol = Wisconsin Stache: full-map; Simple COMA (main memory used as software cache); Reinhardt, et al., ISCA ’94. Simulated benchmarks: appbt (NAS); barnes (SPLASH II); dsmc, moldyn, unstructured (Universities of Maryland & Wisconsin).

13 Cosmos’ Base Prediction Rate. Overall accuracy = 62 - 84% (base). Low accuracy for barnes: reassignment of logical data structures to different memory addresses.

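14 Example Signatures: Appbt. [Figure: message signature graphs for CACHE and DIRECTORY. Cache messages: inval_rw_request, upgrade_response, inval_ro_request, get_ro_response; numbers listed with them: 94, 97, 93, 95, 93. Directory messages: get_ro_request, inval_rw_response, inval_ro_response, upgrade_request; numbers listed with them: 70, 92, 89, 87.] Numbers for MHR of depth one, summarized for all cache blocks.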

15 Increasing Cosmos’ Accuracy. Overall prediction accuracy = 62 - 93%. Other techniques: filters (e.g., J. Smith’s saturating counters); subdividing coherence message stream (suggested by Sohi). Available in Mukherjee, Ph.D. thesis, May 1998: ftp://ftp.cs.wisc.edu/wwt/Theses/mukherjee-1side.ps
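The slide names saturating counters as one kind of filter but does not say how Cosmos applies them; the thesis has the details. The sketch below is a generic saturating-counter filter on a single prediction entry, written under my own assumptions: the class `FilteredEntry`, its fields, and the policy of replacing the stored prediction only when the counter is already at zero are hypothetical.

```python
class FilteredEntry:
    """One prediction entry guarded by a saturating counter (illustrative)."""

    def __init__(self, prediction, max_count=1):
        self.prediction = prediction
        self.count = 0              # confidence in the stored prediction
        self.max_count = max_count  # saturation limit

    def train(self, actual):
        """Update the entry with the message that actually arrived."""
        if actual == self.prediction:
            # Correct: gain confidence, saturating at max_count.
            self.count = min(self.count + 1, self.max_count)
        elif self.count > 0:
            # Wrong, but tolerated: lose confidence, keep the prediction.
            self.count -= 1
        else:
            # Wrong with no confidence left: replace the prediction.
            self.prediction = actual


entry = FilteredEntry("get_ro_request")
entry.train("get_ro_request")   # confirms the prediction
entry.train("upgrade_request")  # one-off noise is filtered out
print(entry.prediction)
```

The intent of such a filter is that a single out-of-pattern message does not evict a prediction that has been reliable; only repeated mispredictions do.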

16 Cosmos’ Memory Overhead
Depth     appbt          barnes         dsmc           moldyn         unstruct.
of MHR    ratio  ovhd    ratio  ovhd    ratio  ovhd    ratio  ovhd    ratio  ovhd
1         1.2    5.4%    3.8    13.5%   0.8    3.9%    0.8    4.0%    1.7    6.8%
2         1.4    9.6%    6.9    35.4%   0.4    5.1%    1.1    8.3%    2.1    12.8%
3         1.9    16.4%   9.3    63.0%   0.3    6.7%    1.6    14.9%   2.8    21.9%
4         2.6    26.5%   10.9   91.8%   0.3    8.9%    2.0    21.6%   3.4    33.0%
Ratio = total number of PHT entries / total number of MHT entries
Ovhd = average memory overhead per 128-byte block
For MHR depth = 2, overhead < 13% for all, except barnes (35%)
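The closing claim on this slide can be checked directly against the table. The snippet below only transcribes the slide's numbers; the dictionary layout and the helper `apps_over` are hypothetical names introduced for the check.

```python
# Values transcribed from the memory-overhead table on this slide.
# Each entry is (ratio, overhead in percent) for MHR depths 1 to 4.
TABLE = {
    "appbt":     [(1.2, 5.4), (1.4, 9.6), (1.9, 16.4), (2.6, 26.5)],
    "barnes":    [(3.8, 13.5), (6.9, 35.4), (9.3, 63.0), (10.9, 91.8)],
    "dsmc":      [(0.8, 3.9), (0.4, 5.1), (0.3, 6.7), (0.3, 8.9)],
    "moldyn":    [(0.8, 4.0), (1.1, 8.3), (1.6, 14.9), (2.0, 21.6)],
    "unstruct.": [(1.7, 6.8), (2.1, 12.8), (2.8, 21.9), (3.4, 33.0)],
}


def apps_over(limit_percent, depth):
    """Return the applications whose overhead exceeds the limit at this depth."""
    return sorted(
        app for app, rows in TABLE.items()
        if rows[depth - 1][1] > limit_percent
    )


print(apps_over(13.0, depth=2))
```

At depth 2 only barnes exceeds the 13% line, which matches the slide's summary. The same helper shows how quickly the set grows at depths 3 and 4.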

17 Summary and Future Work. Cosmos predictor: predicts next coherence message for a cache block; uses history information; + simpler than composition of multiple directed predictors; + adapts dynamically to application-specific coherence streams; - requires more hardware than directed predictors. Cosmos’ prediction accuracy: 74 - 93% for four applications; 62 - 69% for barnes (reassignment of logical data structures). Future work: improve Cosmos’ accuracy (e.g., Kaxiras/Goodman 1999, Lai/Falsafi 1999); integrate Cosmos with a coherence protocol (e.g., Lai/Falsafi 1999).

