Lock vs. Lock-Free memory Fahad Alduraibi, Aws Ahmad, and Eman Elrifaei.

Presentation transcript:

Lock vs. Lock-Free memory Fahad Alduraibi, Aws Ahmad, and Eman Elrifaei

Outline
– CPU vs. Memory performance development
– Single-core vs. Multi-core CPU
– Synchronization
  – Lock-based
  – Transactional Memory (TM)
– Related Research
– Lock-based vs. STM performance
– Methodology
– Expected results
– References

CPU vs. Memory performance
[Figure: growth (1x to 10^6x scale) of CPU speed improvement vs. memory latency improvement]

Intro. Multi-core CPU
[Figure: dual-core CPU chip with two "CPU Core and L1 Cache" blocks, a "Bus Interface and L2 Cache" block, and Main Memory]

Multi-core CPU
Advantages
– Cache-coherency circuitry can run at a higher clock rate
– Increased data processing
– Decreased latency
Challenges
– OS support
– Software adjustment
– Synchronization (shared data)

Synchronization
Synchronizing concurrent access to shared memory by multiple threads
Lock-based synchronization
– Coarse-grained locking
– Fine-grained locking
Lock-free synchronization
– Transactional memory

Lock-Based Synchronization
    if (lock == 0) lock = myPID; /* lock is available: take it */
Drawbacks
– Deadlock
– Priority inversion
– Finer-grained locks add complexity and overhead
[Figure: Thread 1, Thread 2, Lock]
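To make the deadlock and fine-grained-locking drawbacks on this slide concrete, here is a minimal Python sketch (the slides themselves only show C-style pseudocode). The `Account` class, `transfer`, and `demo` are hypothetical names invented for illustration. Each account has its own lock, and the two locks are always taken in a fixed order, which is one common way to avoid deadlock; it also shows the extra care that finer-grained locks demand.

```python
import threading


class Account:
    """Hypothetical account guarded by its own (fine-grained) lock."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move `amount` from src to dst while holding both account locks."""
    if src is dst:
        return  # nothing to do, and locking the same Lock twice would hang
    # Always lock the account with the smaller id first. If one thread took
    # (a, b) and another took (b, a), each could wait on the other forever.
    first, second = sorted((src, dst), key=lambda acc: acc.account_id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def demo(num_transfers=1000):
    """Two threads transfer in opposite directions; returns final balances."""
    a = Account(1, 1000)
    b = Account(2, 1000)

    def a_to_b():
        for _ in range(num_transfers):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(num_transfers):
            transfer(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

A single global lock around `transfer` (coarse-grained locking) would be simpler and could not deadlock, but it would serialize transfers between unrelated accounts.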

Transactional Memory Synch.
Lock-free mechanism for controlling access to shared memory in concurrent computing.
Transactions are atomic, e.g. Swap(a, b):
    temp = a; a = b; b = temp;
– Executes completely (commits) or has no effect (aborts)
A transaction runs in isolation (serialization)
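The commit-or-abort behaviour described on this slide can be sketched in Python as follows. This is an illustrative toy, not how SXM, McRT-STM, or any hardware TM works: `TVar`, `atomically`, and `swap` are hypothetical names, and the sketch uses one internal lock to make the commit step atomic, so it is not itself lock-free. A transaction buffers its writes, records the version of everything it reads, and at commit time either publishes all writes (if nothing it read has changed) or discards them and retries.

```python
import threading


class TVar:
    """Hypothetical transactional variable holding a (value, version) pair."""

    def __init__(self, value):
        self.state = (value, 0)


# Assumption of this sketch: one lock serializes the commit step only.
_commit_lock = threading.Lock()


def atomically(transaction):
    """Run transaction(read, write) repeatedly until it commits."""
    while True:
        reads = {}   # TVar -> version observed on first read
        writes = {}  # TVar -> buffered new value, invisible until commit

        def read(tvar):
            if tvar in writes:
                return writes[tvar]
            value, version = tvar.state
            reads.setdefault(tvar, version)
            return value

        def write(tvar, value):
            writes[tvar] = value

        result = transaction(read, write)

        with _commit_lock:
            unchanged = all(tvar.state[1] == version
                            for tvar, version in reads.items())
            if unchanged:
                # Commit: publish every buffered write and bump versions.
                for tvar, value in writes.items():
                    tvar.state = (value, tvar.state[1] + 1)
                return result
        # Abort: another transaction committed first; drop writes and retry.


def swap(a, b):
    """The slide's Swap(a, b) expressed as one transaction."""
    def body(read, write):
        temp = read(a)
        write(a, read(b))
        write(b, temp)
    atomically(body)
```

Because the writes stay in a private buffer until validation succeeds, no other thread can observe the intermediate state in which `a` has been updated but `b` has not.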

TM Advantages
– Easier parallel programming.
– Good parallel performance.
– Eliminates deadlocks.
– Avoids priority inversion and convoying.
– Fault tolerance (in case a thread dies).

TM Implementations
– Hardware TM: fast but limited
– Software TM: slow but flexible
– Hybrid TM: uses both HTM & STM

Related research
McRT-STM: A High Performance Software Transactional Memory System for a Multi-core Runtime
– Compared the performance of different STM schemes
– Also compared the performance of STM and locks on a set of programs
– Built in C++
– Measurements done on a 16-processor IBM x445 SMP system with 2.2 GHz Xeon MP processors, running Red Hat EL3

Related research
Hybrid-TM
– Implementation of both software and hardware transactional memory schemes.
Hybrid Transactional Memory. By: Sanjeev Kumar†, Michael Chu‡, Christopher J. Hughes†, Partha Kundu†, Anthony Nguyen† (†Intel Labs, Santa Clara, CA; ‡University of Michigan, Ann Arbor)
[Figure: HW, SW, System resources, Trans in, Trans out]

Lock-based vs. STM performance
Recent research studies showed that STM can perform as well as a fine-grained lock-based system.
Source: “McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime”

Lock-based vs. STM performance
Performance is also application-dependent.
Source: “McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime”

Our Methodology
– Compare the performance of a lock-based synchronization system with the SXM software transactional memory system
– Build a benchmark to run programs written with locks
– The benchmark will be programmed in C#
– Inputs (parameters) to the benchmark: program name, number of threads
– Output of the benchmark: execution time

Our Methodology (Cont.)
To measure execution time:
– Record the start time
– Loop for the number of iterations
– Record the end time
– Compute (end time − start time) / number of iterations
Many iterations are used so that performance is measured accurately.
Different scenarios can be run to calculate the average.
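The timing procedure above can be sketched as follows. The slides state that the actual benchmark is written in C#; this Python version only illustrates the measurement steps, and `time_per_iteration` and `make_locked_workload` are hypothetical names. Note also that CPython's global interpreter lock limits true parallelism, so absolute numbers from this sketch would not be comparable to the C# benchmark.

```python
import threading
import time


def time_per_iteration(workload, num_threads, iterations):
    """Average wall-clock seconds for one run of `workload` on num_threads threads."""
    start = time.perf_counter()                 # record the start time
    for _ in range(iterations):                 # loop for the number of iterations
        threads = [threading.Thread(target=workload) for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    end = time.perf_counter()                   # record the end time
    return (end - start) / iterations           # (end - start) / iterations


def make_locked_workload(increments=1000):
    """Hypothetical lock-based program: threads increment one shared counter."""
    lock = threading.Lock()
    counter = {"value": 0}

    def workload():
        for _ in range(increments):
            with lock:
                counter["value"] += 1

    return workload, counter
```

For example, `time_per_iteration(workload, num_threads=4, iterations=20)` mirrors the benchmark inputs (program, number of threads) and returns the execution time per iteration as its output.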

Expected Results
[Figure: expected execution time]

References
– M. Herlihy and J. E. B. Moss. Transactional memory: Architectural support for lock-free data structures. In Proc. 20th Annual International Symposium on Computer Architecture, pages 289–300, May.
– N. Shavit and D. Touitou. Software transactional memory. Distributed Computing, Special Issue(10):99–116.
– S. Kumar, M. Chu, C. J. Hughes, P. Kundu, and A. Nguyen. Hybrid transactional memory. In Proc. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Mar.
– McRT-STM: A High Performance Software Transactional Memory System for a Multi-Core Runtime.
– "How to Write High-Performance C# Code". By: Jeff Varszegi, .NET Developer's Journal.
– Wikipedia.org