Reduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory Alexander Matveev Nir Shavit MIT.


1 Reduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory Alexander Matveev Nir Shavit MIT

2 Good: Hardware Transactional Memory (HTM). Bad: The HTM is “best-effort”. HTM may always fail due to: 1. L1 cache capacity 2. Interrupt 3. Unsupported instruction. To ensure progress, we need a software fallback.

3 A Possible Solution is: Lock Elision. Thread 1 and Thread 2 each run: 1. HTM Start 2. Read lock and check it is free 3. ... code ... 4. HTM Commit. No conflict: the HTMs commit concurrently. Lock steps shown on the slide: 1. Lock 2. Unlock.

4 A Possible Solution is: Lock Elision. Thread 1 and Thread 2 each run: 1. HTM Start 2. Read lock and check it is free 3. ... code ... 4. ... CONFLICT ... HTM Restart, then Wait for Lock. Thread 3 runs: 1. HTM Start 2. Read lock and check it is free 3. ... FAIL ... HTM Restart, followed by: 1. Acquire Lock 2. ... code ... 3. Release Lock. No concurrency between hardware and software.

5 A Possible Solution is: Lock Elision. Good: Simple: no need to instrument reads and writes. Bad: Serial fallback: a software fallback grabs the global lock and aborts all hardware transactions.
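The retry-then-fallback pattern behind lock elision can be sketched in C++ as below. This is a minimal sketch under stated assumptions, not code from the talk: `htm_begin`, `htm_commit`, and `htm_abort` are hypothetical stand-ins for real HTM primitives, and `htm_begin` is stubbed to always fail so the block runs on any machine (which also models the best-effort worst case). The retry budget `kMaxRetries` and the counter workload are likewise illustrative.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for HTM primitives. htm_begin() always reports
// failure here, so every call ends up on the software fallback.
static bool htm_begin() { return false; }
static void htm_commit() {}
static void htm_abort() {}

static std::atomic<bool> global_lock{false};
static long shared_counter = 0;
static int fallback_count = 0;  // how often the serial fallback ran

void elided_increment() {
    const int kMaxRetries = 3;  // assumed retry budget
    for (int attempt = 0; attempt < kMaxRetries; ++attempt) {
        if (htm_begin()) {
            // Reading the lock inside the transaction adds it to the read
            // set, so a later lock acquisition aborts this transaction.
            if (global_lock.load()) {
                htm_abort();
                continue;
            }
            ++shared_counter;
            htm_commit();
            return;
        }
    }
    // Serial software fallback: grab the global lock.
    while (global_lock.exchange(true)) {
        // spin until the lock is free
    }
    ++shared_counter;
    ++fallback_count;
    global_lock.store(false);
}
```

With a real HTM the fast path would usually commit without touching the lock; the fallback is what serializes everything and aborts concurrent hardware transactions, as the slide notes.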

6 Another Approach is: Hybrid Transactional Memory. Thread 1 and Thread 2 each run: 1. HTM Start 2. Read lock and check it is free 3. ... code ... 4. ... more code ... Thread 3 runs: 1. HTM Start 2. Read lock and check it is free 3. ... FAIL ... HTM Restart, followed by: 1. STM Start 2. ... code ... 3. ... more code ... STM and HTM execute concurrently.

7 Another Approach is: Hybrid Transactional Memory. Good: Hardware-software concurrency. Bad: Complex: 1. Hard to coordinate hardware and software (our focus) 2. Hard to apply to code due to instrumentation (GCC C/C++ TM helps here a lot).

8 Hybrid TM History. 2006: First Hybrid TM [Damron, Fedorova, Lev, Luchangco, Moir, Nussbaum]. Key Idea: Use per-location metadata version-locks to coordinate hardware and software. Bad: Hardware is slow: each read/write must read the version-lock and execute a branch condition check.
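The per-access cost mentioned above can be illustrated with a small C++ sketch of an instrumented read. Everything here is an assumption for illustration rather than the 2006 design itself: the table size, the address hash in `lock_for`, and the convention that a set low bit means "locked by a software writer".

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTableSize = 1024;  // assumed metadata table size
static std::atomic<std::uint64_t> version_locks[kTableSize];  // all start at 0

// Map an address to its version-lock slot (hypothetical hash).
static std::atomic<std::uint64_t>& lock_for(const void* addr) {
    auto bits = reinterpret_cast<std::uintptr_t>(addr);
    return version_locks[(bits >> 3) % kTableSize];
}

// Assumed encoding: low bit set means a software writer holds the lock.
static bool is_locked(std::uint64_t v) { return (v & 1u) != 0; }

// Instrumented read: returns false if the caller should abort or retry.
bool instrumented_read(const long* addr, long* out) {
    std::uint64_t v = lock_for(addr).load();  // extra metadata load per read
    if (is_locked(v)) return false;           // extra branch per read
    *out = *addr;
    return true;
}
```

The extra load and branch sit on every access, including those made by hardware transactions, which is the slowdown the slide attributes to this approach.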

9 Hybrid TM History. 2007: Phased TM [Lev, Moir, Nussbaum]. Key Idea: Use HTM mode or STM mode, but not HTM and STM at the same time. Bad: Expensive to switch modes: a single fallback must stop all hardware transactions.

10 Hybrid TM History. 2011: Hybrid NOrec (state of the art) [Dalessandro, Carouge, White, Lev, Moir, Scott, Spear]. Key Idea: No metadata + global clock for coordination.

11 Hybrid NOrec. Good: No metadata: efficient for low concurrency. Bad: Limited scalability: too many aborts due to global clock updates. A software write must abort all hardware transactions. A hardware write must abort all software transactions.
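The global-clock coordination on the software side can be sketched in C++ as below. This is a simplified sketch, not the Hybrid NOrec implementation: it tracks a single shared variable `X`, uses an assumed even/odd clock encoding (even means free, odd means locked), and restarts a read outright where NOrec would revalidate its read set by value. The hardware fast path is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

static std::atomic<std::uint64_t> global_clock{0};  // even = free, odd = locked
static std::atomic<long> X{0};

// Software write: lock the clock, write, then unlock by advancing the clock.
void software_write(long value) {
    std::uint64_t c = global_clock.load();
    for (;;) {
        // Lock the clock by moving it from an even value to the next odd one.
        if ((c & 1u) == 0 && global_clock.compare_exchange_weak(c, c + 1)) break;
        c = global_clock.load();
    }
    X.store(value);
    global_clock.store(c + 2);  // unlock: next even value signals a change
}

// Software read: read the clock, read X, then verify the clock is unchanged.
long software_read(int* restarts) {
    for (;;) {
        std::uint64_t before = global_clock.load();
        if (before & 1u) { ++*restarts; continue; }  // a writer holds the clock
        long value = X.load();
        if (global_clock.load() == before) return value;  // value is consistent
        ++*restarts;  // clock changed: restart
    }
}
```

Every committed write advances the single clock, so every concurrent software reader has to recheck. That shared hot spot is the scalability limit the slide describes.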

12 Slow-Path (Software) write: Lock clock, X = 4, Unlock clock. Fast-Path (Hardware) read: Read X (pure), then ABORT. Fast-Path (Hardware) write: Update clock. Slow-Path (Software) read: Read clock, Read X (verify clock), Read X: check clock => changed => restart/revalidate, then RESTART.

13 A Possible Solution. 2011: Hybrid NOrec 2 [Riegel, Marlier, Nowack, Felber, Fetzer]. Key Idea: Use non-speculative reads inside HTM to verify the global clock and avoid unnecessary aborts. Bad: The HTMs of Intel and IBM have no support for non-speculative reads.

14 A Recent Approach. 2014: Invyswell Hybrid [Calciu, Gottschlich, Shpeisman, Pokam, Herlihy]. Key Idea: Allow unsafe concurrency between hardware and software, and use the HTM sandboxing to detect and handle errors.

15 Invyswell. Slow-Path (Software): Lock clock, X = 4 (NEW), Y = 8 (NEW), Unlock clock. Fast-Path (Hardware): Read X (NEW), Read Y (OLD), Func(X, Y): Unsafe, hopes HTM aborts, Update clock (FUTURE). NO ABORT.

16 Invyswell. Good: Far fewer aborts than Hybrid NOrec. Bad: Unfortunately, HTM sandboxing may miss errors, so a corrupted transaction may commit and crash the system. This problem was shown in a recent work: “Pitfalls of Lazy Subscription” by [Dice, Harris, Kogan, Lev, Moir].

17 Our New Approach. 2015: RH NOrec [Matveev, Shavit]. Key Idea: Use a “mixed” fallback path that uses both software and short hardware transactions.

18 RH NOrec. Mixed Slow-Path: Lock clock, then inside an HTM: X = 4 (HIDDEN), Y = 8 (HIDDEN), then Unlock clock. Writes are speculative (invisible). Fast-Path (Hardware): Read X (OLD), Read Y (OLD), Func(X, Y): Safe!, Update clock. X and Y are both OLD or both NEW, not a mix. Also shown on the slide, from the software slow-path case: Read X (NEW), X = 4 (NEW), Read Y (OLD), Func(X, Y): Unsafe, hopes HTM aborts, Y = 8 (NEW).

19 RH NOrec. Key Point 1: Execute software writes in a short hardware transaction. No need to abort hardware transactions. Full safety. In practice this works well due to the 80:20 rule: a typical operation has 80% reads and 20% writes.

20 RH NOrec. Key Point 2: Execute a maximal amount of initial software reads in a read-only hardware transaction. This allows deferring the global clock read and significantly reduces software restarts/revalidations.

21 Fast-Path (Hardware): HTM start, …reads/writes…, Update clock, HTM commit. Mixed Path: Read clock, … reads in software … (verifies clock), Read some X: check clock => changed => restart/revalidate, then RESTART.

22 Fast-Path (Hardware): HTM start, …reads/writes…, Update clock, HTM commit. Mixed Path, HTM Prefix: HTM start, …reads in HTM… (pure/direct), Read clock, HTM commit. NO ABORT.

23 Fast-Path (Hardware): HTM start, …reads/writes…, Update clock, HTM commit. Mixed Path, HTM Prefix: HTM start, …reads in HTM… (pure/direct), Read clock, HTM commit. Then …reads in software…. HTM Postfix: HTM start, HTM commit; Lock clock, …writes in HTM…, Unlock clock. A second Fast-Path (Hardware) transaction: HTM start, …reads/writes…, Update clock, HTM commit. NO ABORT.

24 Throughput on 8-core Intel (GCC C/C++)

26 Conclusion. RH NOrec: a new Hybrid TM that is safe and scalable. Key Idea: Use a “mixed” fallback path that uses two short hardware transactions: 1. HTM Prefix: Executes a maximal amount of initial reads and defers the global clock read. 2. HTM Postfix: Executes the software writes, preserving safety and allowing hardware-software concurrency.

27 Thank You

