5 April 2005, IPDPS 2005

Mutual Exclusion
  Access to shared data is atomic because of the lock.
  Parallelism is reduced by definition.
  Blocking; danger of priority inversion and deadlocks.
    Solutions exist, but with high overhead, especially for multi-processor systems.
  (Figure: processes P1, P2, P3.)
Non-blocking Synchronization
  Perform operations/changes using atomic primitives.
  Lock-Free Synchronization
    Optimistic approach.
    Retries until succeeding.
  Wait-Free Synchronization
    Always finishes in a finite number of its own steps.
    Coordination with all participants.
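
The optimistic "retries until succeeding" pattern can be sketched with a compare-and-swap (CAS) loop. This is a minimal illustration using C++ `std::atomic`; the helper name `lock_free_add` is hypothetical and not taken from the talk. Note that the loop has no bound on the number of retries, which is what separates lock-free from wait-free.

```cpp
#include <atomic>
#include <cassert>

// Lock-free update (hypothetical helper): read the value, compute the
// new one, try to install it with CAS, and retry if another thread
// changed the variable in between.
inline int lock_free_add(std::atomic<int>& shared, int delta) {
    int observed = shared.load();
    // On failure, compare_exchange_weak reloads `observed`, so every
    // retry starts from the latest value.
    while (!shared.compare_exchange_weak(observed, observed + delta)) {
        // Retry; the number of retries is unbounded under contention.
    }
    return observed + delta;
}

// Single-threaded usage check.
inline void lock_free_add_demo() {
    std::atomic<int> counter{0};
    assert(lock_free_add(counter, 5) == 5);
    assert(lock_free_add(counter, -2) == 3);
}
```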
Memory Management
  Dynamic data structures need dynamic memory management.
  Concurrent data structures need concurrent memory management!
Concurrent Memory Management
  Concurrent Memory Allocation
    i.e. malloc/free functionality.
  Concurrent Garbage Collection
  Questions (among many):
    When to re-use memory?
    How to de-reference pointers safely?
  (Figure: processes P1, P2, P3.)
Lock-Free Memory Management
  Memory Allocation
    Valois 1995: fixed block-size, fixed purpose.
    Michael 2004, Gidenstam et al. 2004: any size, any purpose.
  Garbage Collection
    Valois 1995, Detlefs et al. 2001: reference counting.
    Michael 2002, Herlihy et al. 2002: hazard pointers.
Wait-Free Memory Management
  Hesselink and Groote, "Wait-free concurrent memory management by create and read until deletion (CaRuD)", Dist. Comp. 2001
    Limited to the problem of shared static terms.
  New Wait-Free Algorithm:
    Memory Allocation: fixed block-size, fixed purpose.
    Garbage Collection: reference counting.
Wait-Free Reference Counting
  De-referencing links
    1. Read the link contents, i.e. a pointer.
    2. Increment (FAA) the reference count on the corresponding object.
  What if the link is changed between steps 1 and 2?
  Wait-Free solution:
    The de-referencing operation announces the link before reading it.
    Operations that change that link help the de-referencing operation.
Wait-Free Reference Counting
  Announcing
    Writes the link address to a shared variable (one per thread and per new de-reference).
    Atomically removes the announcement and retrieves a possible answer (from helping) by Swap with null.
  Helping
    If the announcement matches the changed link, atomically answer with a proper pointer using CAS.
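
The announce/answer handshake described on the two slides above might look roughly like the following. This is a simplified sketch under stated assumptions, not the complete algorithm from the talk: the types `Node`, `Link` and `Slot` and the functions `deref` and `help_deref` are hypothetical names, which pointer counts as the "proper" answer is left to the caller, and reclaiming a node whose count drops to zero is omitted. It relies on fixed-purpose memory, so the count field stays valid even if the node has been reclaimed in the meantime.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type with a reference count.
struct Node {
    std::atomic<int> refcount{0};
};
using Link = std::atomic<Node*>;
// Per-thread slot: holds null, an announced link address, or an answer.
using Slot = std::atomic<void*>;

// De-referencing side: announce, read, FAA, then swap the slot with null.
inline Node* deref(Link& link, Slot& slot) {
    assert(slot.load() == nullptr);           // no announcement pending
    slot.store(&link);                        // announce before reading
    Node* node = link.load();                 // step 1: read the pointer
    if (node) node->refcount.fetch_add(1);    // step 2: FAA on the count
    void* seen = slot.exchange(nullptr);      // remove the announcement
    if (seen == static_cast<void*>(&link)) return node;  // nobody helped
    // A helper answered: use its pointer and undo our own increment.
    if (node) node->refcount.fetch_sub(1);
    return static_cast<Node*>(seen);
}

// Side that changes `link`: answer a matching announcement with
// `current`, taking a reference on the reader's behalf first.
inline bool help_deref(Link& link, Node* current, Slot& slot) {
    void* expected = &link;
    if (slot.load() != expected) return false;   // no matching announcement
    if (current) current->refcount.fetch_add(1);
    if (slot.compare_exchange_strong(expected, current)) return true;
    if (current) current->refcount.fetch_sub(1); // announcement already gone
    return false;
}
```

In the sketch, the reader performs a fixed number of steps regardless of what other threads do, which is the property the helping scheme is meant to provide.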
Wait-Free Memory Allocation
  Solution (lock-free), IBM freelists:
    Create a linked list of the free nodes; allocate/reclaim using CAS.
  How to guarantee that the CAS of an alloc/free operation eventually succeeds?
  (Figure: Head pointing to a linked list of free blocks Mem 1, Mem 2, ..., Mem i; Allocate and Reclaim operate at the head; used blocks shown separately.)
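
A CAS-based freelist of fixed-size blocks can be sketched as below. This is an illustrative variant rather than the code from the talk: blocks are identified by index, the pool size `kBlocks` is hypothetical, and the head carries a version tag (an addition assumed here to sidestep the ABA problem, which the slide does not discuss). Both loops may retry an unbounded number of times, which is exactly the issue the question on the slide raises.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNil = 0xFFFFFFFFu;  // end-of-list marker
constexpr std::uint32_t kBlocks = 4;         // hypothetical pool size

struct FreeList {
    std::atomic<std::uint64_t> head;          // (tag << 32) | block index
    std::atomic<std::uint32_t> next[kBlocks]; // next[i]: block after i

    FreeList() {
        // Chain all blocks: 0 -> 1 -> ... -> kBlocks-1 -> nil.
        for (std::uint32_t i = 0; i < kBlocks; ++i)
            next[i].store(i + 1 < kBlocks ? i + 1 : kNil);
        head.store(0);                        // tag 0, first block 0
    }

    // Pop the first free block; returns kNil when the list is empty.
    std::uint32_t alloc() {
        std::uint64_t old = head.load();
        for (;;) {
            std::uint32_t idx = static_cast<std::uint32_t>(old);
            if (idx == kNil) return kNil;
            std::uint64_t tag = (old >> 32) + 1;
            std::uint64_t desired = (tag << 32) | next[idx].load();
            // On failure `old` is reloaded and the loop retries.
            if (head.compare_exchange_weak(old, desired)) return idx;
        }
    }

    // Push a block back onto the list.
    void reclaim(std::uint32_t idx) {
        assert(idx < kBlocks);
        std::uint64_t old = head.load();
        for (;;) {
            next[idx].store(static_cast<std::uint32_t>(old));
            std::uint64_t tag = (old >> 32) + 1;
            std::uint64_t desired = (tag << 32) | idx;
            if (head.compare_exchange_weak(old, desired)) return;
        }
    }
};
```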
Wait-Free Memory Allocation
  Wait-Free Solution:
    Create 2*N freelists.
    Alloc operations concurrently try to allocate from the current (globally agreed on) freelist.
    When the current freelist is empty, the current freelist is changed in round-robin manner.
    The Free operation of thread i only works on freelist i or N+i.
    Alloc operations announce their interest.
    All free and alloc operations try to help announced alloc operations in round-robin.
Wait-Free Memory Allocation
  Announcing
    A value of null in the per-thread shared variable indicates interest.
    Alloc atomically announces and receives a possible answer by using Swap.
  Helping
    Which thread to help is globally agreed on (id), and incremented in round-robin when agreed.
    Free atomically answers the selected thread of interest with a free node using CAS.
    The first time that Alloc succeeds in getting a node from the current freelist, it tries to atomically answer the selected thread of interest with that node using CAS.
  (Figure: array of announcement variables, indexed by thread id, holding null or a node X; Swap on the announcing side, CAS on the helping side.)
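
The announcement variables and the round-robin helping index could be modelled as in the following sketch. Several details are assumptions made for illustration and are not stated on the slides: the names `AllocHelping`, `announce_and_collect` and `help`, the thread count `kThreads`, the choice to start every slot at null (every thread initially interested), and advancing the helping index after every helping attempt. The freelists themselves are left out.

```cpp
#include <atomic>
#include <cassert>

constexpr int kThreads = 3;        // hypothetical number of threads (N)

struct Block { int payload = 0; }; // hypothetical fixed-size block

struct AllocHelping {
    std::atomic<Block*> announce[kThreads]; // null means "interested"
    std::atomic<int> help_id{0};            // globally agreed thread to help

    AllocHelping() {
        for (auto& slot : announce) slot.store(nullptr);
    }

    // Alloc side: a single Swap both announces interest (writes null)
    // and collects an answer left by a helper, if there is one.
    Block* announce_and_collect(int tid) {
        assert(tid >= 0 && tid < kThreads);
        return announce[tid].exchange(nullptr);
    }

    // Helper side (Free, or an Alloc that just obtained a node): try to
    // hand `node` to the selected thread with CAS, then move the index
    // on in round-robin. Returns false if the caller keeps the node.
    bool help(Block* node) {
        int id = help_id.load();
        Block* expected = nullptr;
        bool given = announce[id].compare_exchange_strong(expected, node);
        // CAS so that concurrent helpers agree on a single increment.
        help_id.compare_exchange_strong(id, (id + 1) % kThreads);
        return given;
    }
};
```

When `help` returns false, the caller still owns the node and would place it on one of its own freelists (freelist i or N+i for thread i).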
Performance
  Worst-case
    Requires analysis of the maximum execution path, followed by known WCET techniques.
    E.g. at most 2*N^2 CAS retries for alloc.
  Average and Overhead
    Experiments in the scope of dynamic data structures (e.g. lock-free skip list):
      H. Sundell and P. Tsigas, "Fast and Lock-Free Concurrent Priority Queues for Multi-thread Systems", IPDPS 2003
    Performed on a NUMA (SGI Origin 2000) architecture, full concurrency.
Average Performance
Conclusions
  New algorithms for concurrent and dynamic memory management:
    Wait-free and linearizable.
    Reference counting.
    Fixed-size memory allocation.
  To the best of our knowledge, the first wait-free memory management scheme that supports implementing arbitrary dynamic concurrent data structures.
  Will be available as part of the NOBLE software library, http://www.noble-library.org
  Future work
    Implement new wait-free dynamic data structures.
    Provide upper bounds on memory usage.
Questions?
  Contact Information:
    Address: Håkan Sundell, Computing Science, Chalmers University of Technology
    Email: email@example.com
    Web: http://www.cs.chalmers.se/~phs