Lock vs. Lock-Free Memory
Fahad Alduraibi, Aws Ahmad, and Eman Elrifaei

Outline
CPU vs. memory performance development
Single-core vs. multi-core CPU
Synchronization
–Lock-based
–Transactional Memory (TM)
Related research
Lock-based vs. STM performance
Methodology
Expected results
References

CPU vs. Memory performance
[Figure: growth on a logarithmic scale from 1x to 10^6x, with two curves: CPU speed improvement and memory latency improvement]

Intro. Multi-core CPU
[Diagram: Dual Core CPU Chip; labels: CPU Core and L1 Cache (two blocks), Bus Interface and L2 Cache, Main Memory]

Multi-core CPU
Advantages
–Higher clock rates for cache coherency
–Increased data processing
–Decreased latency
Challenges
–OS support
–Software adjustment
–Synchronization (shared data)

Synchronization
Synchronizing concurrent access to shared memory by multiple threads
Lock-based synchronization
–Coarse-grained locking
–Fine-grained locking
Lock-free synchronization
–Transactional Memory

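
The difference between coarse-grained and fine-grained locking can be sketched in C++ as follows. This is an illustration only, not code from the project: the table of counters, the names `CoarseTable`, `FineTable`, `run_workers`, and the bucket count are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared data: a small table of counters.
constexpr std::size_t kBuckets = 8;

// Coarse-grained locking: one mutex protects the whole table, so every
// access is serialized even when threads touch different buckets.
struct CoarseTable {
    std::mutex lock;
    std::array<long, kBuckets> counts{};

    void increment(std::size_t bucket) {
        std::lock_guard<std::mutex> guard(lock);
        ++counts[bucket % kBuckets];
    }
};

// Fine-grained locking: one mutex per bucket, so threads working on
// different buckets do not block each other.
struct FineTable {
    std::array<std::mutex, kBuckets> locks;
    std::array<long, kBuckets> counts{};

    void increment(std::size_t bucket) {
        std::size_t i = bucket % kBuckets;
        std::lock_guard<std::mutex> guard(locks[i]);
        ++counts[i];
    }
};

// Starts `threads` workers; each increments every bucket `per_bucket` times.
// Returns the sum over all buckets once every worker has been joined.
template <typename Table>
long run_workers(Table& table, int threads, int per_bucket) {
    assert(threads > 0 && per_bucket >= 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&table, per_bucket] {
            for (int n = 0; n < per_bucket; ++n)
                for (std::size_t b = 0; b < kBuckets; ++b)
                    table.increment(b);
        });
    }
    for (auto& th : pool) th.join();
    long total = 0;
    for (long c : table.counts) total += c;
    return total;
}
```

Both tables give the same total for the same workload; they differ only in how much the threads wait on each other, and the fine-grained version pays for that with more locks to manage.
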
Lock-Based Synchronization
if (lock == 0) lock = myPID; /* lock is free: take it (the test and the set must execute as one atomic step) */
[Diagram: Thread 1, Thread 2, Lock]
Drawbacks
–Deadlock
–Priority inversion
–Finer-grained locks add complexity and overhead

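
Written as plain code, `if (lock == 0) lock = myPID;` lets two threads both see `lock == 0` and both take the lock. A minimal C++ sketch of one way to make the check and the assignment a single atomic step, using `std::atomic` compare-and-exchange, is shown here. The `SpinLock` class and `count_with_spinlock` helper are hypothetical names for this example, and busy-waiting is only one possible waiting strategy.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// 0 means "unlocked"; any other value is the ID of the owning thread
// (the slide's myPID). The class itself is a hypothetical illustration.
class SpinLock {
public:
    // Atomically performs "if (lock == 0) lock = my_id" and reports
    // whether this caller obtained the lock.
    bool try_lock(int my_id) {
        assert(my_id != 0);  // 0 is reserved for "unlocked"
        int expected = 0;
        return lock_.compare_exchange_strong(expected, my_id,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    // Retries until the lock is obtained, yielding between attempts.
    void lock(int my_id) {
        while (!try_lock(my_id)) {
            std::this_thread::yield();
        }
    }

    void unlock() { lock_.store(0, std::memory_order_release); }

private:
    std::atomic<int> lock_{0};
};

// Increments a plain (non-atomic) counter from several threads, using the
// spin lock to protect it, and returns the final value.
long count_with_spinlock(int threads, int increments) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&, id = t + 1] {
            for (int n = 0; n < increments; ++n) {
                lock.lock(id);
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

The counter reaches `threads * increments` only because each increment happens while the lock is held; the drawbacks listed on the slide (deadlock, priority inversion) still apply to a lock built this way.
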
Transactional Memory Synchronization
A lock-free mechanism for controlling access to shared memory in concurrent computing
Transactions are atomic, e.g. Swap(a, b): temp = a; a = b; b = temp;
–Executes completely (commits) or has no effect (aborts)
A transaction runs in isolation (serialization)

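
The commit-or-abort behaviour of Swap(a, b) can be imitated in a few lines of C++. This sketch is not a transactional memory system: it assumes that `a` and `b` are 32-bit values packed into one 64-bit word, so that a single compare-and-exchange can commit both at once. The `AtomicPair` class and its method names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: a and b live in one 64-bit word (a in the high half,
// b in the low half) so both can be committed by one compare-and-exchange.
class AtomicPair {
public:
    AtomicPair(std::uint32_t a, std::uint32_t b) : word_(pack(a, b)) {
        // The sketch is only lock-free if 64-bit atomics are lock-free here.
        assert(word_.is_lock_free());
    }

    // One attempt at the swap "transaction".
    // Returns true if it committed, false if it aborted with no effect.
    bool try_swap() {
        std::uint64_t snapshot = word_.load();  // read a and b together
        std::uint32_t a = static_cast<std::uint32_t>(snapshot >> 32);
        std::uint32_t b = static_cast<std::uint32_t>(snapshot);
        std::uint64_t desired = pack(b, a);     // temp = a; a = b; b = temp
        // Commit only if no other thread changed the pair since the snapshot.
        return word_.compare_exchange_strong(snapshot, desired);
    }

    // Retries the transaction until it commits.
    void swap() {
        while (!try_swap()) {
        }
    }

    std::pair<std::uint32_t, std::uint32_t> read() const {
        std::uint64_t w = word_.load();
        return {static_cast<std::uint32_t>(w >> 32),
                static_cast<std::uint32_t>(w)};
    }

private:
    static std::uint64_t pack(std::uint32_t a, std::uint32_t b) {
        return (static_cast<std::uint64_t>(a) << 32) | b;
    }

    std::atomic<std::uint64_t> word_;
};
```

No thread ever holds a lock here, and a reader never observes a half-finished swap. A real TM implementation has to provide the same guarantee for arbitrary sets of memory locations, which is where the hardware and software schemes differ.
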
TM Advantages
Easier parallel programming
Good parallel performance
Eliminates deadlocks
Avoids priority inversion and convoying
Fault tolerance (in case a thread dies)

TM Implementations
Hardware TM (HTM): fast but limited
Software TM (STM): slow but flexible
Hybrid TM: uses both HTM & STM

Related research
McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime
Compared the performance of different STM schemes
Also compared the performance of STM and locks on a set of programs
Built in C++
Measurements taken on a 16-processor IBM x445 SMP system with 2.2 GHz Xeon MP processors, running Red Hat EL3

Related research
Hybrid Transactional Memory, by Sanjeev Kumar†, Michael Chu‡, Christopher J. Hughes†, Partha Kundu†, and Anthony Nguyen† (†Intel Labs, Santa Clara, CA; ‡University of Michigan, Ann Arbor)
Hybrid TM
–An implementation that uses both software and hardware transactional memory schemes
[Diagram: HW, SW, System resources, Trans in, Trans out]

Lock-based vs. STM performance
Recent research shows that STM can perform as well as a fine-grained lock-based system
Source: “McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime”

Lock-based vs. STM performance
Performance also depends on the application
Source: “McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime”

Our Methodology
Compare the performance of a lock-based synchronization system with the SXM software transactional memory system
Build a benchmark to run programs written with locks
The benchmark will be written in C#
Inputs (parameters) to the benchmark: program name, number of threads
Output of the benchmark: execution time

Our Methodology (Cont.)
To measure execution time
–Record the start time
–Loop for the number of iterations
–Record the end time
–Subtract the start time from the end time, then divide by the number of iterations
Many iterations are used so that the measurement is accurate
Different scenarios can be run to calculate an average

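
The timing steps above can be sketched as follows. The planned benchmark is in C#; this C++ version only illustrates the same arithmetic, and the function name `average_seconds` and the choice of `std::chrono::steady_clock` are assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Returns the mean wall-clock time of one call to `work`, in seconds:
// (end time - start time) / iterations.
double average_seconds(const std::function<void()>& work, int iterations) {
    assert(iterations > 0);
    auto start = std::chrono::steady_clock::now();  // record start time
    for (int i = 0; i < iterations; ++i) {          // loop for the iterations
        work();
    }
    auto end = std::chrono::steady_clock::now();    // record end time
    std::chrono::duration<double> elapsed = end - start;
    return elapsed.count() / iterations;
}
```

A call such as `average_seconds([] { /* program under test */ }, 1000)` times 1000 runs and reports the mean. Timing the whole loop instead of each call keeps the clock overhead out of the per-iteration figure.
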
Expected Results
[Figure: execution time]

References
M. Herlihy and J. E. B. Moss. Transactional memory: Architectural support for lock-free data structures. In Proc. 20th Annual International Symposium on Computer Architecture, pages 289–300, May
N. Shavit and D. Touitou. Software transactional memory. Distributed Computing, Special Issue(10):99–116,
S. Kumar, M. Chu, C. J. Hughes, P. Kundu, and A. Nguyen. Hybrid transactional memory. In Proc. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Mar
McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime
“How to Write High-Performance C# Code”, by Jeff Varszegi, .NET Developer's Journal
Wikipedia.org