Reactive NUMA: A Design for Unifying S-COMA and CC-NUMA. Babak Falsafi and David A. Wood, University of Wisconsin, Madison, 1997. Presented by: Jie Xiao.

2 Reactive NUMA: A Design for Unifying S-COMA and CC-NUMA. Babak Falsafi and David A. Wood, University of Wisconsin, Madison, 1997. Presented by: Jie Xiao, Feb 6, 2008.

3 Outline: Introduction; CC-NUMA, S-COMA, R-NUMA; Theoretical Results; Simulation Results; Pros & Cons.

4 Introduction: DSM clusters. Remote miss latency > local miss latency, so we are looking for the best remote caching strategy!

5 Introduction: Looking for the best remote caching strategy! Solutions: CC-NUMA (Cache-Coherent Non-Uniform Memory Access) and S-COMA (Simple Cache-Only Memory Architecture). Our approach: R-NUMA (Reactive NUMA).

6 CC-NUMA: caches remote data in a block cache that is small and fast.

7 S-COMA: caches remote data in a page cache that is sufficiently large (it is part of the local node's main memory); caching is done at page granularity, and the OS handles allocation and migration.
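
A minimal sketch of the page-granularity caching described above, assuming a direct-mapped software table keyed by (home node, remote page number) and 4 KB pages. The names, sizes, and table organization are illustrative only; the real S-COMA uses OS page tables plus per-block access-control state in hardware.

```c
/* Illustrative sketch of an S-COMA-style page cache (not the paper's
 * implementation): remote pages are backed by frames in local main
 * memory, allocated and reclaimed by the OS at page granularity. */
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE  4096u         /* assumed page size */
#define PAGE_SLOTS 80u           /* e.g. a 320 KB page cache of 4 KB frames */

struct pcache_entry {
    int      valid;
    uint16_t home_node;          /* node that owns the page */
    uint64_t remote_page;        /* page number at the home node */
    void    *local_frame;        /* local main-memory frame backing it */
};

static struct pcache_entry page_cache[PAGE_SLOTS];

/* On a remote access that misses in the page cache, the OS allocates a
 * local frame and maps the remote page onto it; blocks within the page
 * are then fetched on demand and kept coherent by hardware at block
 * granularity. */
void *scoma_map_page(uint16_t home_node, uint64_t remote_page)
{
    struct pcache_entry *e = &page_cache[remote_page % PAGE_SLOTS];

    if (e->valid && e->home_node == home_node && e->remote_page == remote_page)
        return e->local_frame;   /* page is already cached locally */

    if (e->valid)
        free(e->local_frame);    /* evict: unmap the old page (write-back
                                    of dirty blocks omitted in this sketch) */

    e->valid       = 1;
    e->home_node   = home_node;
    e->remote_page = remote_page;
    e->local_frame = aligned_alloc(PAGE_SIZE, PAGE_SIZE);
    return e->local_frame;
}
```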

8 CC-NUMA vs. S-COMA: which one is better? It depends on the application, specifically on its mix of (1) communication pages, whose remote misses are mostly coherence misses and so gain little from a large page cache, and (2) reuse pages, whose remote misses are mostly capacity and conflict misses and so benefit from S-COMA's large page cache.

9 R-NUMA: dynamically switches a page from CC-NUMA to S-COMA mode. Refetches are counted per node, per page (a hardware counter), so each node independently chooses the best protocol for each page. The result is greater performance stability without much extra hardware.
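
A minimal sketch of that per-node, per-page decision, assuming a software record per remote page. The names are hypothetical; in the actual design the refetch counting is done in hardware and the OS performs the page relocation. The threshold of 64 is the value quoted later in these slides.

```c
/* Illustrative sketch of R-NUMA's per-node, per-page refetch counter and
 * the threshold-triggered switch from CC-NUMA mode to S-COMA mode. */
#include <stdint.h>

#define RELOCATION_THRESHOLD 64   /* refetches before a page is relocated */

enum page_mode { MODE_CC_NUMA, MODE_S_COMA };

struct rnuma_page {
    enum page_mode mode;          /* current caching mode on this node    */
    uint32_t       refetches;     /* remote capacity/conflict re-fetches  */
};

/* Called when this node must re-fetch a remote block it previously held
 * (a capacity or conflict miss in the block cache). */
void rnuma_on_refetch(struct rnuma_page *p)
{
    if (p->mode != MODE_CC_NUMA)
        return;                   /* already relocated to the page cache  */

    if (++p->refetches >= RELOCATION_THRESHOLD) {
        /* Too many refetches: treat this as a reuse page and relocate it
         * into the local S-COMA page cache (the OS maps a local frame,
         * e.g. via something like scoma_map_page above). */
        p->mode = MODE_S_COMA;
    }
}
```

Pages that never reach the threshold (communication pages) simply stay in CC-NUMA mode, which is how each node ends up choosing the better protocol for each page on its own.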

10 R-NUMA (figure comparing CC-NUMA, R-NUMA, and S-COMA)

11 R-NUMA (figure: CC-NUMA and S-COMA)

12 Theoretical Results: Worst-case analysis shows that R-NUMA performs no more than 3 times worse than either CC-NUMA or S-COMA.
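
A back-of-envelope illustration of how a bounded worst case can arise, under a toy cost model of my own (not the paper's analysis): each remote block refetch costs 1 unit, relocating a page into the page cache costs an assumed reloc units, and the worst case is a page that is relocated and then never reused.

```c
/* Toy cost comparison for a single remote page; all costs are invented
 * for illustration and are not taken from the paper. */
#include <stdio.h>

int main(void)
{
    const int threshold = 64;   /* R-NUMA relocation threshold             */
    const int reloc     = 128;  /* assumed cost of one page relocation     */
    const int n         = 64;   /* refetch-type misses the page incurs     */

    int cc_numa = n;                          /* pays for every refetch    */
    int s_coma  = reloc;                      /* maps the page once        */
    int r_numa  = (n < threshold)
                ? n                           /* stays in CC-NUMA mode     */
                : threshold + reloc;          /* counts up, then relocates */

    printf("CC-NUMA=%d  S-COMA=%d  R-NUMA=%d\n", cc_numa, s_coma, r_numa);
    return 0;
}
```

With these particular (assumed) numbers R-NUMA pays 192 units against CC-NUMA's 64, i.e. a factor of 3, and 1.5 times S-COMA's 128; the actual bound in the paper comes from its own cost model, which this sketch only caricatures.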

13 Simulation Results. Baselines: CC-NUMA with an infinite block cache; CC-NUMA with a 32 KB block cache; S-COMA with a 320 KB page cache. R-NUMA: 128B block cache, 320 KB page cache, relocation threshold 64.

14 Simulation Results

15 R-NUMA is only sensitive to block cache size for applications whose reuse working set does not fit in the page cache (e.g., ocean). A large fraction of reuse pages in an application favors a smaller threshold value (e.g., cholesky, fmm, lu, and ocean). R-NUMA is not very sensitive to page-fault and TLB invalidation overheads.

16 Pros & Cons. Pros: + flexible (per-page, per-node); + exploits the best remote caching strategy without much extra work. Cons: - the relocation threshold is fixed at 64; should it change according to the application?

