PIPP: Promotion/Insertion Pseudo-Partitioning of Multi-Core Shared Caches Yuejian Xie, Gabriel H. Loh Georgia Institute of Technology Presented by: Yingying.


1 PIPP: Promotion/Insertion Pseudo-Partitioning of Multi-Core Shared Caches
Yuejian Xie, Gabriel H. Loh, Georgia Institute of Technology
Presented by: Yingying Tian
36th ACM/IEEE International Symposium on Computer Architecture (ISCA '09)

2 Manage Shared Caches
Last Level Caches (LLCs) are shared by all cores in Chip Multi-Processors (CMPs). Multiple cores compete for the limited LLC capacity.
[Figure: Core0 and Core1, each with private L1I and L1D caches, share one Last Level Cache (LLC) that holds both Core0's data and Core1's data.]

3 As a sharing-oblivious cache management policy, LRU leads to poor performance and fairness.
Previous works tried to allocate LLC resources fairly via:
• Capacity management: way-partitioning (UCP)
• Dead-time management: LRU insertion (TADIP)
PIPP: do both capacity and dead-time management better, at the same time!

4 Outline
• Background and Motivation
• Previous Work
• PIPP
• Evaluation
• Conclusion

5 UCP (Utility-based Cache Partitioning)
[Figure: the ways of a cache set divided between Core0 and Core1; Core 0 gets 5 ways, Core 1 gets 3 ways.]
*Some materials are taken from the original presentation slides.

6 DIP (Dynamic Insertion Policy)
[Figure: recency stack from MRU to LRU with an incoming block.]

7 DIP (Dynamic Insertion Policy)
[Figure: recency stack from MRU to LRU.]
Occupies one cache block for a long time with no benefit!

8 [Figure: recency stack from MRU to LRU with an incoming block.]

9 DIP (Dynamic Insertion Policy)
[Figure: recency stack from MRU to LRU.]
• Useless block: evicted at the next eviction
• Useful block: moved to the MRU position

10 DIP (Dynamic Insertion Policy)
[Figure: recency stack from MRU to LRU.]
• Useless block: evicted at the next eviction
• Useful block: moved to the MRU position

11 Cache Replacement Policy
Eviction: Which block should be replaced when a cache miss occurs?
• LRU block
Insertion: Where in the corresponding set should an incoming block be inserted?
• MRU insertion (the default under the LRU replacement policy)
• LRU insertion (for dead-on-arrival blocks)
Promotion: When a block is re-referenced, how should its position be adjusted?
• Move to the MRU position
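
The three decisions on this slide can be sketched as a small model of one cache set. This is only an illustration, not code from the talk: the class name `RecencySet`, the `insert_at` parameter, and the cyclic trace are all assumptions. It shows why LRU insertion can help when the working set is slightly larger than the set: the newest block becomes the next victim, so part of the older working set keeps hitting.

```python
class RecencySet:
    """One cache set kept as a recency stack (hypothetical model).

    stack[0] is the LRU end (next victim); stack[-1] is the MRU end.
    """

    def __init__(self, ways, insert_at="mru"):
        if insert_at not in ("mru", "lru"):
            raise ValueError("insert_at must be 'mru' or 'lru'")
        self.ways = ways
        self.insert_at = insert_at
        self.stack = []

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        if tag in self.stack:
            # Promotion: a re-referenced block moves to the MRU position.
            self.stack.remove(tag)
            self.stack.append(tag)
            return True
        # Eviction: when the set is full, the LRU block is replaced.
        if len(self.stack) == self.ways:
            self.stack.pop(0)
        # Insertion: MRU insertion (plain LRU) or LRU insertion.
        if self.insert_at == "mru":
            self.stack.append(tag)
        else:
            self.stack.insert(0, tag)
        return False


def count_hits(cache, trace):
    """Replay a list of tags and count the hits."""
    return sum(cache.access(tag) for tag in trace)


# Five blocks cycling through a 4-way set: a thrashing pattern.
trace = list("abcde") * 3
mru_hits = count_hits(RecencySet(4, "mru"), trace)  # 0 hits
lru_hits = count_hits(RecencySet(4, "lru"), trace)  # 6 hits
```

With MRU insertion every access in this trace misses, because each block is evicted just before it is reused. With LRU insertion, blocks a, b, and c stay resident and hit on every later pass.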

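12 PIPP: Promotion/Insertion Pseudo-Partitioning
Insertion:
• Target partitioning: $\Pi = \{\pi_1, \pi_2, \ldots, \pi_n\}$ with $\sum_{i=1}^{n} \pi_i = w$, where $w$ is the associativity of the cache.
• On insertion, core $i$ inserts its incoming block at position $\pi_i$. (The targets are computed dynamically via UCP monitors or other means.)
Promotion:
• On a hit, the block moves one step toward the MRU position with probability $P$ and stays unchanged with probability $1-P$.
[Figure: recency stack from MRU to LRU; a new block is inserted at position 3 (the target allocation), a hit promotes a block, and the block at the LRU end is the one to evict.]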
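
A minimal sketch of the insertion and promotion rules on this slide, for a single cache set. Several details are assumptions made for illustration rather than statements from the talk: positions are counted from the LRU end (position 1 is the next victim), the default promotion probability of 0.75 is an arbitrary choice, and the names `PIPPSet`, `targets`, and `p_promote` are hypothetical. The target allocations are passed in as fixed numbers; computing them with UCP monitors is not modeled.

```python
import random


class PIPPSet:
    """One cache set with PIPP-style insertion and promotion (sketch).

    stack[0] is the LRU end (next victim); stack[-1] is the MRU end.
    Entries are (core, tag) pairs.
    """

    def __init__(self, ways, targets, p_promote=0.75, rng=None):
        # The target allocations must sum to the associativity w.
        if sum(targets.values()) != ways:
            raise ValueError("targets must sum to the associativity")
        self.ways = ways
        self.targets = targets      # core id -> target allocation
        self.p_promote = p_promote  # assumed value, not from the slides
        self.rng = rng or random.Random(0)
        self.stack = []

    def access(self, core, tag):
        """Return True on a hit, False on a miss."""
        entry = (core, tag)
        if entry in self.stack:
            i = self.stack.index(entry)
            # Promotion: one step toward MRU with probability p_promote.
            at_mru = i == len(self.stack) - 1
            if not at_mru and self.rng.random() < self.p_promote:
                self.stack[i], self.stack[i + 1] = self.stack[i + 1], self.stack[i]
            return True
        # Eviction: the block at the LRU end is the victim.
        if len(self.stack) == self.ways:
            self.stack.pop(0)
        # Insertion: position targets[core], counted from the LRU end
        # (assumption); clamped while the set is still filling up.
        pos = min(self.targets[core] - 1, len(self.stack))
        self.stack.insert(pos, entry)
        return False

    def tags(self):
        """Tags ordered from the LRU end to the MRU end."""
        return [tag for _, tag in self.stack]


# Two cores sharing an 8-way set with target allocations 5 and 3.
demo = PIPPSet(ways=8, targets={0: 5, 1: 3}, p_promote=1.0)
for t in [1, 2, 3, 4, 5]:
    demo.access(0, t)
for t in ["A", "B", "C"]:
    demo.access(1, t)
demo.access(1, "D")  # miss: evicts block 1, inserts D third from the LRU end
# demo.tags() is now [2, 'C', 'D', 'B', 'A', 3, 4, 5]
```

Because a core with a small target inserts close to the LRU end, a stream of its misses mostly displaces its own recent insertions, while a core with a large target inserts closer to the MRU end and keeps its blocks longer. No way is hard-reserved for either core.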

13 PIPP Example
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: recency stack from MRU to LRU holding Core0's blocks 1, 2, 3, 4, 5 and Core1's blocks A, B, C. Request: D (Core1's block); Core1's quota = 3.]

14 PIPP Example
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: recency stack from MRU to LRU holding Core0's blocks 1, 2, 3, 4, 5 and Core1's blocks A, B, D. Request: 6 (Core0's block); Core0's quota = 5.]

15 PIPP Example
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: recency stack from MRU to LRU holding Core0's blocks 1, 2, 3, 4, 6 and Core1's blocks A, B, D. Request: 7 (Core0's block); Core0's quota = 5.]

16 PIPP Example
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: recency stack from MRU to LRU holding Core0's blocks 1, 2, 3, 4, 6, 7 and Core1's blocks A, D. Request: D (Core1's block, already in the set).]

17 PIPP Example
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: recency stack from MRU to LRU holding Core0's blocks 1, 2, 3, 4, 6, 7 and Core1's blocks A, D. Request: E (Core1's block); Core1's quota = 3.]

18 PIPP Example
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: recency stack from MRU to LRU holding Core0's blocks 1, 2, 3, 6, 7 and Core1's blocks A, D, E. Request: 2 (Core0's block, already in the set).]

19 Pseudo-Partition Benefit
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: strict partition. Each core has its own recency stack (MRU 0 to LRU 0 for Core0, MRU 1 to LRU 1 for Core1); a new block is requested.]

20 Pseudo-Partition Benefit
Core0 quota: 5 blocks. Core1 quota: 3 blocks.
[Figure: pseudo-partition. Both cores' blocks share a single recency stack from MRU to LRU; a new block is requested.]

21 Methodology
• SimpleScalar simulator for x86
• Intel Core 2 processor
• 32KB, 8-way, 3-cycle L1I and L1D for each core
• A shared 4MB, 16-way, 11-cycle LLC
• Multi-programmed workloads from SPEC CPU benchmarks (2-core and 4-core workloads)
• 500M instructions of warmup, 250M instructions of simulation

22 Evaluation: 2-Core Weighted Speedup
[Figure: weighted speedup per workload, grouped into TADIP-friendly and UCP-friendly workloads.]
PIPP outperforms LRU by 19.0%, UCP by 10.6%, and TADIP by 10.1%.

23 4-Core Weighted Speedup
[Figure: weighted speedup per workload, grouped into TADIP-friendly and UCP-friendly workloads.]
PIPP outperforms LRU by 21.9%, UCP by 12.1%, and TADIP by 17.5%.
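
The slides do not define the weighted speedup metric. A common definition, assumed here, sums each program's IPC when sharing the cache divided by its IPC when running alone. The sketch below uses that definition, and it also assumes that "outperforms by X%" means a relative improvement in this metric. The function names are hypothetical.

```python
def weighted_speedup(ipc_shared, ipc_alone):
    """Sum of per-program IPC ratios (shared run over stand-alone run)."""
    if len(ipc_shared) != len(ipc_alone):
        raise ValueError("need one stand-alone IPC per program")
    return sum(shared / alone for shared, alone in zip(ipc_shared, ipc_alone))


def improvement_percent(ws_new, ws_base):
    """Relative improvement of one weighted speedup over another, in percent."""
    return 100.0 * (ws_new / ws_base - 1.0)
```

For example, two programs that each reach half of their stand-alone IPC when sharing the LLC give a weighted speedup of 1.0, and a policy that lifts a baseline of 1.0 to 1.19 is a 19% improvement under this reading.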

24 Occupancy Control For most workloads, the partitioning deviation is within 1.0 of the target allocation, similar to UCP.

25 Conclusion
• Novel proposal on insertion and promotion
• A single unified mechanism provides both capacity and dead-time management
• Outperforms prior UCP and TADIP

26 Thank you! Questions?

