

1 Quantifying and Comparing the Impact of Wrong-Path Memory References in Multiple-CMP Systems Ayse Yilmazer, University of Rhode Island Resit Sendag, University of Rhode Island Joshua J. Yi, Freescale Semiconductor, Inc.

2 Motivation
Previous work on wrong-path (WP) effects in uniprocessors
  - Positive effects: prefetching
      Up to 20% better performance for 181.mcf (SPECint 2000)
  - Negative effects: pollution
      L1 and L2 cache pollution
      Extra traffic
  - Important to simulate WP, especially for some applications
How about WP effects in multiple-CMP systems?

3 Outline
Wrong-Path Effects in SMPs and multi-CMPs
Simulation Methodology
Evaluation Results
Conclusion

4 Wrong-path effects in SMPs – 0 / 4
Broadcast (snoop)- and directory-based SMP systems
  - MSI, MOSI, MESI, MOESI cache coherence protocols
The same issues as in uniprocessors apply
  - Pollution effect
  - Prefetching effect
  - Extra cache/memory traffic
In contrast to uniprocessors, WP references also cause
  - Extra coherence traffic: data, invalidations, write-backs, acknowledgements
  - Additional cache block state transitions

5 Wrong-path effects in SMPs – 1 / 4 -- Replacements
[Figure: initial states]
A speculatively replaces B
A is a wrong-path block!

6 Wrong-path effects in SMPs – 2 / 4 -- Write-backs
Write-back dirty copy of B
Write-back dirty copy of A
Only for MESI (or MSI): M -> S
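The note about MESI (or MSI) can be illustrated with a minimal sketch. It assumes the usual textbook behavior: when another processor reads a block held in state M, protocols without an Owned state downgrade the block to S and write the dirty data back, while protocols with an Owned state move it to O and defer the write-back. The function name and the protocol sets are hypothetical and are not taken from the presentation or from GEMS.

```python
# Illustrative only: effect of a remote (possibly wrong-path) read on a block
# that the owning cache holds in state M. Names here are hypothetical.
NO_OWNED_STATE = {"MSI", "MESI"}      # protocols without an Owned state
HAS_OWNED_STATE = {"MOSI", "MOESI"}   # protocols with an Owned state


def remote_read_of_modified(protocol):
    """Return (new_owner_state, writeback_needed) for a remote read of an M block."""
    if protocol in NO_OWNED_STATE:
        # M -> S: memory must hold the latest data, so the dirty copy is written back.
        return "S", True
    if protocol in HAS_OWNED_STATE:
        # M -> O: the owner keeps the dirty data and supplies it; no write-back yet.
        return "O", False
    raise ValueError(f"unknown protocol: {protocol}")


for name in ("MSI", "MESI", "MOSI", "MOESI"):
    print(name, remote_read_of_modified(name))
```

Under this reading, a wrong-path read of A forces a write-back only in the protocols without an Owned state, which matches the slide's "M -> S" remark.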

7 Wrong-path effects in SMPs – 3 / 4 -- Invalidations
P1 loses its write privileges for block A
P1 requests write permission again and sends an invalidation
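The invalidation scenario can be sketched with a toy model of one block under an MSI-style snooping protocol. This is an assumption-laden illustration, not the simulated system: the `ToyBus` class, its counters, and the mapping of index 0 to P1 and index 1 to P2 are all hypothetical.

```python
class ToyBus:
    """Toy MSI-style model of a single block A shared by n processors."""

    def __init__(self, n):
        self.state = ["I"] * n   # per-processor state of block A
        self.invalidations = 0   # invalidation messages sent
        self.writebacks = 0      # dirty copies written back

    def read(self, p):
        if self.state[p] != "I":
            return               # hit: no bus action needed
        for q, s in enumerate(self.state):
            if s == "M":         # current owner downgrades M -> S and writes back
                self.state[q] = "S"
                self.writebacks += 1
        self.state[p] = "S"

    def write(self, p):
        if self.state[p] == "M":
            return               # already holds write permission
        for q, s in enumerate(self.state):
            if q != p and s != "I":
                if s == "M":
                    self.writebacks += 1
                self.state[q] = "I"
                self.invalidations += 1
        self.state[p] = "M"


bus = ToyBus(2)
bus.write(0)   # P1 (index 0) obtains A in state M
bus.read(1)    # wrong-path read by P2 (index 1): P1 drops to S
bus.write(0)   # P1 must regain write permission and invalidate P2's copy
print(bus.state, bus.invalidations, bus.writebacks)
```

Without the wrong-path read in the middle, the second write by P1 would hit in state M and generate neither an invalidation nor a write-back.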

8 Wrong-path effects in SMPs – 4 / 4
Data/bus and coherence traffic increases
  - L1 references
  - L2 references
  - Coherence traffic: snoop and directory requests for data and invalidations
Power consumption increases
  - Due to extra cache references, coherence traffic, and cache block state transitions
Resource contention
  - WP references compete with correct-path references for resources
  - In contrast to uniprocessors, service buffers become full more frequently
  - Critical when there are many cache-to-cache transfers

9 WP effects in Multiple-CMPs – 0 / 2
CMP node and a 4-CMP system
  - We studied inclusive L1 and L2 caches
  - The L2 cache also tracks the coherence of cache blocks in L1

10 WP effects in Multiple-CMPs – 1 / 2
State transitions on replacement of an SO line in the L2 cache
[Figure: state-transition diagram; labels as extracted: SOOIV OINI S I]

11 WP effects in Multiple-CMPs – 2 / 2
State transitions when an MT line in the L2 cache receives a WP request
[Figure: state-transition diagram; labels as extracted: MTMO SO M S]

12 Outline
Wrong-Path Effects in SMPs and multi-CMPs
Simulation Methodology
Evaluation Results
Conclusion

13 Experimental Methodology
GEMS simulator (Wisconsin Multifacet Group)
  - Based on Virtutech SIMICS
  - Aggressive out-of-order superscalar processor
  - Detailed shared-memory model
We evaluate a 16-processor (4- and 8-CMP) SPARC V9 system running unmodified Solaris 9
We evaluate a 2-level MOSI directory coherence protocol
  - MOSI: Modified, Owned, Shared, Invalid
We track speculatively generated memory references
  - and mark them as wrong-path once the branch misprediction is known
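The tracking step in the last item can be sketched as follows. This is a simplified illustration under stated assumptions, not the GEMS implementation: every memory reference carries a program-order sequence number, and when a branch is found to be mispredicted, all in-flight references younger than that branch are marked as wrong-path. The `MemRef` and `WrongPathTracker` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemRef:
    seq: int                  # program-order sequence number (assumed available)
    addr: int                 # referenced address
    wrong_path: bool = False  # set once a misprediction is known


class WrongPathTracker:
    """Hypothetical bookkeeping for speculatively issued memory references."""

    def __init__(self):
        self.in_flight = []   # references whose path is not yet known

    def issue(self, seq, addr):
        ref = MemRef(seq, addr)
        self.in_flight.append(ref)
        return ref

    def branch_mispredicted(self, branch_seq):
        """Mark every reference younger than the mispredicted branch as wrong-path."""
        squashed = [r for r in self.in_flight if r.seq > branch_seq]
        for r in squashed:
            r.wrong_path = True
        self.in_flight = [r for r in self.in_flight if r.seq <= branch_seq]
        return squashed


tracker = WrongPathTracker()
older = tracker.issue(seq=1, addr=0x100)
younger = tracker.issue(seq=5, addr=0x200)
tracker.branch_mispredicted(branch_seq=3)
print(older.wrong_path, younger.wrong_path)
```

The marked references have already accessed the memory hierarchy by the time the misprediction is resolved, which is why their cache and coherence effects can be counted separately from correct-path traffic.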

14 Experimental Methodology

15 Outline
Wrong-Path Effects in SMPs and multi-CMPs
Simulation Methodology
Evaluation Results
Conclusion

16 Evaluation Results 1 / 5 -- L1 and L2 Cache Traffic
[Figure panels: 4 CMPs, 8 CMPs]
Total memory references increase by 16% and 14% for 4 and 8 CMPs, respectively.
L2 cache references increase by 35% and 36%, respectively.
For em3d, the number of L1 misses increases by as much as 70%.

17 Evaluation Results 2 / 5 -- Coherence Traffic
[Figure panels: 4 CMPs, 8 CMPs]
Internal -- 36%
External -- 30%

18 Evaluation Results 3 / 5 -- L1 and L2 cache replacements
L1 -- 30%, L2 -- 17%
Potential Cache Performance Impact
Type          | Meaning                                                                                                 | L1  | L2
Used          | Used by a correct-path reference                                                                        | 50% | 7%
Unused        | Evicted before being used, or never used by a correct-path reference                                    | 42% | 70%
Direct Miss   | Replaces a cache block that is needed by a later correct-path load, and is evicted before being used    | 4%  | 20%
Indirect Miss | Changes the LRU state of a set, which may eventually cause correct-path misses                          | 4%  | 3%
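As a quick check of the table, the short sketch below sums each column and groups the categories. The percentages are copied from the slide; grouping Direct Miss and Indirect Miss as "potentially harmful" is an interpretation made here for illustration, and the function and key names are hypothetical.

```python
# Percentages copied from the "Potential Cache Performance Impact" table.
wp_replacements = {
    "L1": {"used": 50, "unused": 42, "direct_miss": 4, "indirect_miss": 4},
    "L2": {"used": 7, "unused": 70, "direct_miss": 20, "indirect_miss": 3},
}


def summarize(shares):
    """Sum a column and group the categories that can cause correct-path misses."""
    return {
        "total": sum(shares.values()),
        "useful": shares["used"],
        "potentially_harmful": shares["direct_miss"] + shares["indirect_miss"],
    }


for level, shares in wp_replacements.items():
    print(level, summarize(shares))
```

Both columns sum to 100%. Under this grouping, wrong-path replacements in L1 are mostly used by later correct-path references (50% useful versus 8% potentially harmful), while in L2 the balance reverses (7% versus 23%).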

19 Evaluation Results 4 / 5 -- Write Misses
4 CMPs: on average 4%
8 CMPs: on average 7%

20 Evaluation Results 5 / 5 -- Cache Line State Transitions
4 CMPs: internal 2% to 13%, external 1% to 9%
8 CMPs: internal 2% to 17%, external 1% to 10%

21 Outline
Wrong-Path Effects in SMPs and multi-CMPs
Simulation Methodology
Evaluation Results
Conclusion

22 Conclusion
It is important to model WP memory references in cache-coherent multi-CMP systems.
For multi-CMPs, WP references not only affect the performance of individual processors through prefetching and pollution; they also affect the performance of the entire system by increasing
  - cache coherence transactions
  - cache block state transitions
  - write-backs
  - invalidations
  - resource contention
For a workload with many cache-to-cache transfers, WP references can significantly affect coherence actions.

23 The End Thank You !

