Conditional Memory Ordering Christoph von Praun, Harold W.Cain, Jong-Deok Choi, Kyung Dong Ryu Presented by: Renwei Yu Published in Proceedings of the.


1 Conditional Memory Ordering Christoph von Praun, Harold W. Cain, Jong-Deok Choi, Kyung Dong Ryu Presented by: Renwei Yu Published in Proceedings of the 33rd International Symposium on Computer Architecture, 2006.

2 Motivation • Modern multiprocessor systems require memory barrier instructions in the program to specify memory ordering. • Conventionally, memory ordering is guaranteed by using locks or barriers, which leads to superfluous memory barriers in programs. • We need a mechanism that reduces unnecessary memory ordering.

3 Redundancies of memory ordering in conventional locking algorithms • Lock operation on lock variable l • Unlock operation on lock variable l

4 Sources of memory ordering redundancy • Thread-confinement of lock variables. • Memory ordering that occurs for lock variables that are solely accessed by a single thread is redundant. • Thread locality of locking. • Locality of locking is a situation where consecutive acquires of a lock variable are made by the same thread. • Eager releases and repetitive acquires.
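
The first two sources above can be illustrated on a trace of lock acquires. The following is a minimal sketch, not taken from the paper: the function name `classify_acquires` and the `(thread, lock)` trace format are assumptions made for illustration. It counts how many acquires touch a lock that only one thread ever uses, and how many repeat the previous acquirer of the same lock.

```python
from collections import defaultdict


def classify_acquires(trace):
    """Classify acquires in a trace of (thread_id, lock_id) pairs.

    Hypothetical helper: the trace format is an assumption for illustration.
    """
    # First pass: record which threads ever acquire each lock.
    threads_per_lock = defaultdict(set)
    for thread, lock in trace:
        threads_per_lock[lock].add(thread)

    last_acquirer = {}
    confined = 0  # acquires of locks used by exactly one thread
    local = 0     # acquires that repeat the previous acquirer of the lock
    for thread, lock in trace:
        if len(threads_per_lock[lock]) == 1:
            confined += 1
        if last_acquirer.get(lock) == thread:
            local += 1
        last_acquirer[lock] = thread

    return {"total": len(trace), "thread_confined": confined, "thread_local": local}


trace = [("T1", "a"), ("T1", "a"), ("T2", "b"),
         ("T1", "b"), ("T1", "b"), ("T1", "a")]
stats = classify_acquires(trace)
```

In this made-up trace, lock `a` is thread-confined and lock `b` is shared, yet one acquire of `b` still repeats the previous acquirer; those are the cases where ordering could in principle be skipped.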

5 CMO-conditional memory ordering • CMO is demonstrated on a lock algorithm that identifies those dynamic lock/unlock operations for which memory ordering is unnecessary, and speculatively omits the associated memory ordering instructions. • When ordering is required, this algorithm relies on a hardware mechanism for initiating a memory ordering operation on another processor.

6 CMO-conditional memory ordering • Acquire of lock l with conditional memory ordering

7 CMO-conditional memory ordering • Release of lock l with conditional memory ordering

8 CMO-conditional memory ordering • The memory synchronization model is different: release synchronization is omitted at the unlock operation and “recovered” at the lock operation, but only if necessary. • Necessity is determined according to a release number that is communicated between the thread that unlocks l and the thread that subsequently locks l.

9 Release numbers • relnum ⇐ (id & release ctr. of current proc) • A release number is a value that combines a processor id with a counter of the release synchronization operations (the release counter) that the respective processor has performed up to a given point in the execution of a program.
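
One way to picture a release number is as a processor id and a release counter packed into a single word. The sketch below assumes a 24-bit counter field; that width, and the helper names `make_relnum` and `split_relnum`, are illustrative choices and are not stated in the slides.

```python
COUNTER_BITS = 24                       # assumed width of the release counter field
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_relnum(proc_id, release_ctr):
    """Pack a processor id and its release counter into one integer."""
    return (proc_id << COUNTER_BITS) | (release_ctr & COUNTER_MASK)


def split_relnum(relnum):
    """Recover (proc_id, release_ctr) from a packed release number."""
    return relnum >> COUNTER_BITS, relnum & COUNTER_MASK


relnum = make_relnum(3, 17)
proc_id, release_ctr = split_relnum(relnum)
```

With a fixed-width field the counter eventually wraps around; this sketch simply truncates and does not address how a real design would handle that.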

10 Conditional memory ordering • Based on the release number, the system arranges that release synchronization is recovered at the processor that previously released the lock, but only if necessary. • (sync conditional) implies (sync acquire) at the processor that issues the instruction.
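
The decision made by (sync conditional) can be sketched as a toy model. This is one reading of the slides, not the authors' implementation: it assumes that ordering can be skipped when the releasing processor is the acquiring one, or when the releaser's counter has advanced since the unlock, and that a remote sync is triggered otherwise. The class names `Machine` and `CmoLock` are hypothetical, and Python cannot express real barriers, so the implied (sync acquire) is not modeled.

```python
class Machine:
    """Toy model holding one release counter per processor (the release vector)."""

    def __init__(self, num_procs):
        self.release_vector = [0] * num_procs
        self.remote_syncs = 0
        self.skipped = 0

    def release_sync(self, proc):
        # Stands in for an actual release synchronization on `proc`.
        self.release_vector[proc] += 1

    def sync_conditional(self, proc, relnum):
        """Decide whether release ordering must be recovered at lock acquire."""
        if relnum is None:                      # lock was never released before
            return "none"
        rel_proc, rel_ctr = relnum
        if rel_proc == proc or self.release_vector[rel_proc] > rel_ctr:
            # Same processor, or the releaser has synchronized since the unlock.
            self.skipped += 1
            return "skipped"
        # Assumed behavior: force the release synchronization on the releaser.
        self.release_sync(rel_proc)
        self.remote_syncs += 1
        return "remote"


class CmoLock:
    """Lock that records a release number instead of issuing a barrier at unlock."""

    def __init__(self):
        self.relnum = None

    def acquire(self, machine, proc):
        return machine.sync_conditional(proc, self.relnum)

    def release(self, machine, proc):
        # No barrier here: only remember who released and at which counter value.
        self.relnum = (proc, machine.release_vector[proc])


m = Machine(2)
lock = CmoLock()
outcomes = [lock.acquire(m, 0)]
lock.release(m, 0)
outcomes.append(lock.acquire(m, 0))   # same processor re-acquires
lock.release(m, 0)
outcomes.append(lock.acquire(m, 1))   # other processor, releaser has not synced
lock.release(m, 1)
m.release_sync(1)                     # processor 1 synchronizes for another reason
outcomes.append(lock.acquire(m, 0))   # releaser's counter already advanced
```

In this run only one of the three acquires that follow a release needs the remote operation, which is the effect the release number is meant to achieve.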

11 Hardware support for CMO • Logical operation • Release vector entry • Register operand • Comparison of release counters • Release vector support • To support low-latency reads, a copy of the release vector is mirrored in local storage at each processor. • Broadcast operation • Release hints • Instruct a processor to increment its release counter as soon as the conditions are met.

12 Evaluation • S-CMO: a software CMO prototype • The results show that CMO avoids memory ordering operations for the vast majority of dynamic acquire and release operations across a set of multithreaded Java workloads, leading to significant speedups for many. • However, performance improvements in the software prototype are hindered by the high cost of remote memory ordering.

13 Experimental Methodology • Uses a set of single- and multi-threaded Java benchmarks from the Java Grande and SPEC benchmark suites. • The applications run on IBM’s J9 production virtual machine. • Experiments are performed on both Power4 and Power5 multiprocessor systems running AIX, with 4 and 6 processors respectively.

14 Software CMO prototype with hardware support • Hardware-based (sync conditional) and (sync remote) implementation

15 Software CMO prototype with hardware support • CMO performance while varying remote sync latency in high-cost (Power4) memory ordering implementation.

16 Software CMO prototype with hardware support • CMO performance while varying remote sync latency in high-cost (Power5) memory ordering implementation.
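
Why remote sync latency matters can be seen with a back-of-the-envelope cost model. The sketch below is an assumption-laden illustration, not the analytical model from the paper: it charges one local barrier per lock operation in the conventional scheme and one remote sync for the fraction of operations that CMO cannot skip, and it ignores every other cost. The function names and all numbers are hypothetical.

```python
def estimated_sync_cycles(num_lock_ops, remote_fraction, local_sync_cost, remote_sync_cost):
    """Return (baseline_cycles, cmo_cycles) under the simplified model above."""
    baseline = num_lock_ops * local_sync_cost
    cmo = num_lock_ops * remote_fraction * remote_sync_cost
    return baseline, cmo


def break_even_remote_cost(remote_fraction, local_sync_cost):
    """Remote sync cost at which CMO stops paying off in this simplified model."""
    if remote_fraction == 0:
        return float("inf")   # no remote syncs are ever needed
    return local_sync_cost / remote_fraction


# Illustrative numbers only, not measurements.
baseline, cmo = estimated_sync_cycles(1000, 0.25, 100, 300)
limit = break_even_remote_cost(0.25, 100)
```

Under these made-up inputs CMO spends fewer cycles on ordering as long as a remote sync costs less than four local barriers; the lower the fraction of acquires that need a remote sync, the more remote latency can be tolerated.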

17 Future Proposal • Hardware Proposal • Software Proposal

18 Summary • The paper develops an algorithm called conditional memory ordering (CMO) that eliminates redundant memory ordering operations and improves system performance. • It characterizes synchronization and memory ordering operations in lock-intensive Java workloads and demonstrates that many memory ordering operations occur superfluously. • It evaluates the performance improvement of CMO. • It proposes hardware support for CMO and evaluates the hardware implementation using a software prototype and an analytical model.

19 Conclusions • CMO can significantly improve the performance of multiprocessor systems. • With hardware support, CMO offers significant performance benefits across our set of Java benchmarks when assuming a reasonable remote synchronization latency.

20 Thank you Questions?

