Language Support for Lightweight transactions Tim Harris & Keir Fraser Presented by Narayanan Sundaram 04/28/2008.

Slides:



Advertisements
Similar presentations
Operating Systems: Monitors 1 Monitors (C.A.R. Hoare) higher level construct than semaphores a package of grouped procedures, variables and data i.e. object.
Advertisements

Concurrent programming for dummies (and smart people too) Tim Harris & Keir Fraser.
Software Transactional Memory and Conditional Critical Regions Word-Based Systems.
Goldilocks: Efficiently Computing the Happens-Before Relation Using Locksets Tayfun Elmas 1, Shaz Qadeer 2, Serdar Tasiran 1 1 Koç University, İstanbul,
Code Generation and Optimization for Transactional Memory Construct in an Unmanaged Language Programming Systems Lab Microprocessor Technology Labs Intel.
Chapter 6: Process Synchronization
5.1 Silberschatz, Galvin and Gagne ©2009 Operating System Concepts with Java – 8 th Edition Chapter 5: CPU Scheduling.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Rich Transactions on Reasonable Hardware J. Eliot B. Moss Univ. of Massachusetts,
Toward High Performance Nonblocking Software Transactional Memory Virendra J. Marathe University of Rochester Mark Moir Sun Microsystems Labs.
PARALLEL PROGRAMMING with TRANSACTIONAL MEMORY Pratibha Kona.
Threading Part 2 CS221 – 4/22/09. Where We Left Off Simple Threads Program: – Start a worker thread from the Main thread – Worker thread prints messages.
TOWARDS A SOFTWARE TRANSACTIONAL MEMORY FOR GRAPHICS PROCESSORS Daniel Cederman, Philippas Tsigas and Muhammad Tayyab Chaudhry.
Transactional Memory Yujia Jin. Lock and Problems Lock is commonly used with shared data Priority Inversion –Lower priority process hold a lock needed.
5.6 Semaphores Semaphores –Software construct that can be used to enforce mutual exclusion –Contains a protected variable Can be accessed only via wait.
1 Lecture 21: Synchronization Topics: lock implementations (Sections )
CS510 Advanced OS Seminar Class 10 A Methodology for Implementing Highly Concurrent Data Objects by Maurice Herlihy.
Concurrency: Mutual Exclusion, Synchronization, Deadlock, and Starvation in Representative Operating Systems.
Exceptions and side-effects in atomic blocks Tim Harris.
Tuple Spaces and JavaSpaces CS 614 Bill McCloskey.
CS510 Concurrent Systems Class 13 Software Transactional Memory Should Not be Obstruction-Free.
Multiscalar processors
1 New Architectures Need New Languages A triumph of optimism over experience! Ian Watson 3 rd July 2009.
Department of Computer Science Presenters Dennis Gove Matthew Marzilli The ATOMO ∑ Transactional Programming Language.
Distributed Deadlocks and Transaction Recovery.
Real-Time Concepts for Embedded Systems Author: Qing Li with Caroline Yao ISBN: CMPBooks.
Modern Concurrency Abstractions for C# by Nick Benton, Luca Cardelli & C´EDRIC FOURNET Microsoft Research.
Software Transactional Memory for Dynamic-Sized Data Structures Maurice Herlihy, Victor Luchangco, Mark Moir, William Scherer Presented by: Gokul Soundararajan.
A Qualitative Survey of Modern Software Transactional Memory Systems Virendra J. Marathe Michael L. Scott.
CS5204 – Operating Systems Transactional Memory Part 2: Software-Based Approaches.
Optimistic Design 1. Guarded Methods Do something based on the fact that one or more objects have particular states  Make a set of purchases assuming.
Colorama: Architectural Support for Data-Centric Synchronization Luis Ceze, Pablo Montesinos, Christoph von Praun, and Josep Torrellas, HPCA 2007 Shimin.
Maged M.Michael Michael L.Scott Department of Computer Science Univeristy of Rochester Presented by: Jun Miao.
ICS 313: Programming Language Theory Chapter 13: Concurrency.
Chapter 6 – Process Synchronisation (Pgs 225 – 267)
Concurrency Control 1 Fall 2014 CS7020: Game Design and Development.
Wait-Free Multi-Word Compare- And-Swap using Greedy Helping and Grabbing Håkan Sundell PDPTA 2009.
CS399 New Beginnings Jonathan Walpole. 2 Concurrent Programming & Synchronization Primitives.
CS510 Concurrent Systems Jonathan Walpole. A Methodology for Implementing Highly Concurrent Data Objects.
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals Presented by Abdulai Sei.
1 Condition Variables CS 241 Prof. Brighten Godfrey March 16, 2012 University of Illinois.
13-1 Chapter 13 Concurrency Topics Introduction Introduction to Subprogram-Level Concurrency Semaphores Monitors Message Passing Java Threads C# Threads.
CS533 – Spring Jeanie M. Schwenk Experiences and Processes and Monitors with Mesa What is Mesa? “Mesa is a strongly typed, block structured programming.
A Methodology for Implementing Highly Concurrent Data Objects by Maurice Herlihy Slides by Vincent Rayappa.
AtomCaml: First-class Atomicity via Rollback Michael F. Ringenburg and Dan Grossman University of Washington International Conference on Functional Programming.
1 Previous Lecture Overview  semaphores provide the first high-level synchronization abstraction that is possible to implement efficiently in OS. This.
Optimistic Design CDP 1. Guarded Methods Do something based on the fact that one or more objects have particular states Make a set of purchases assuming.
Java.util.concurrent package. concurrency utilities packages provide a powerful, extensible framework of high-performance threading utilities such as.
Multiprocessors – Locks
Chapter 4 – Thread Concepts
Minh, Trautmann, Chung, McDonald, Bronson, Casper, Kozyrakis, Olukotun
Part 2: Software-Based Approaches
Lecture 19: Coherence and Synchronization
Threads Cannot Be Implemented As a Library
Chapter 4 – Thread Concepts
Faster Data Structures in Transactional Memory using Three Paths
Challenges in Concurrent Computing
A Qualitative Survey of Modern Software Transactional Memory Systems
Lecture 21: Synchronization and Consistency
Shared Memory Programming
Lecture: Coherence and Synchronization
Critical section problem
Introduction of Week 13 Return assignment 11-1 and 3-1-5
Software Transactional Memory Should Not be Obstruction-Free
CSE 153 Design of Operating Systems Winter 19
Update : about 8~16% are writes
Foundations and Definitions
Lecture: Coherence and Synchronization
Lecture 18: Coherence and Synchronization
CSE 542: Operating Systems
CSE 542: Operating Systems
Presentation transcript:

Language Support for Lightweight transactions Tim Harris & Keir Fraser Presented by Narayanan Sundaram 04/28/2008

Talk Outline Motivation – Problems with Concurrent programming – Types of concurrency control (Imperative vs Declarative) Support for Concurrency – Conditional Critical Regions (CCR) – Software Transactional Memory (STM) Implementation of STM in Java – Details – Optimizations – Modifications to JVM Results Conclusion 6/25/2015CS258 - Parallel Computer Architecture2

Concurrent Programming Concurrent programming is considered difficult because – Abstractions can be hard/unintuitive – Hard to write reliable and scalable software – Hard to monitor side-effects of concurrency control using locks (like priority-inversion and deadlocks) – Difficult performance optimizations 3CS258 - Parallel Computer Architecture6/25/2015

Styles of concurrency control Imperative (Mutex based) Declarative (Conditional Critical Regions) 4CS258 - Parallel Computer Architecture6/25/2015

Support for concurrency Programming Language features – Java’s util.concurrent library provides atomic variables, locks, queues, thread pools etc – CCR-like features can be supported using semaphores, atomics – Static analysis can remove some pessimistic sharing scenarios, but cannot eliminate all Non-blocking algorithms can guarantee that as long as a thread is not contending with other threads for resources, it will make forward progress 5CS258 - Parallel Computer Architecture6/25/2015

Conditional Critical Regions CCRs are popular in teaching concurrency, but do not have efficient implementations There is no indication of which memory locations can be accessed inside CCRs Implementations that used mutex locks to allow only one CCR to execute at any time were inefficient This paper maps CCRs onto Software Transactional Memory (STM) and studies the implementation in the Java language 6CS258 - Parallel Computer Architecture6/25/2015

CCR implementation in Java atomic (condition) { statements; } is a CCR block CCR can access any field of any object Exceptions can be thrown from inside CCRs Native methods cannot be called inside CCRs CCRs can be nested Mutex locks can be used inside CCRs (with non-blocking implementations) 7CS258 - Parallel Computer Architecture6/25/2015

CCR implementation in Java (contd) Wait() and notify() methods should not be used inside CCRs Problems with class loading and initialization in Java “Correct” consistency is guaranteed if for any shared location – all accesses to it must be controlled by a given mutex, or – all accesses to it must be made within CCRs, or – it must be marked as volatile. 8CS258 - Parallel Computer Architecture6/25/2015

STM Implementation Five operations for transaction management 1.STMStart() 2.STMAbort() 3.STMCommit() Commits if the transaction is valid 4.STMValidate() Checks if the current transaction is valid Used to check error conditions anytime inside a transaction 5.STMWait() Allows threads to block on entry to CCR Simple implementation : STMWait() = STMAbort() Efficient Implementation : Block the caller till there is an update to a location the transaction has accessed Two operations for memory access 1.STMRead(addr a) 2.STMWrite(addr a, stm_word w) 9CS258 - Parallel Computer Architecture6/25/2015

Heap Structure Modification Ownership function Logical State 1 – a1 (7,15) Logical State 2 – a2 (100,7) Logical State 3 – a3 (200,7) 10CS258 - Parallel Computer Architecture6/25/2015

Implementation details STMStart – Allocate new descriptor and set status to ACTIVE STMAbort – Set status to ABORT STMRead – If no entry exists in orec, copy the value from heap – If an entry already exists in current transaction, use that value and version – Else, determine logical state of location and copy its value 11CS258 - Parallel Computer Architecture6/25/2015

Implementation details STMWrite – Perform a read on location – Set new_version <= old_version+1 – Use this new_version for other accesses to that particular orec STMCommit – Acquire a lock to all locations accessed inside transaction – If that fails, one can either abort the owner, or wake it if in STMWait(), or abort current transaction – If lock acquisition succeeds, check if the versions match – If that succeeds, set status to COMMITTED, write back and release locks – Else, set status to ABORTED and release locks 12CS258 - Parallel Computer Architecture6/25/2015

Implementation STMCommit – While releasing the locks, set version of released orecs to new_version if committed STMValidate – Similar to STMCommit without the actual commit STMWait – Similar to STMCommit, acquiring locks to orecs – If successful, set status to ASLEEP – Other transactions that try to access the orec will wake up the waiting transaction 13CS258 - Parallel Computer Architecture6/25/2015

Optimizations More than 1 transaction can be asleep at an orec by including a thread list with each transaction descriptor Inefficiencies due to treating reads and writes as the same can be overcome having a READ_PHASE status where no locks are acquired, but other threads trying to write will have to abort Searches inside descriptors can be avoided by table lookups To allow non-blocking commits, stealing ownerships is allowed. However, the stealer has to take care of the victim’s status (committed/aborted) to make appropriate changes in the end. Per-orec counters with number of pending transactions makes in-order commits easier to implement 14CS258 - Parallel Computer Architecture6/25/2015

Modification to JVM Both source-to-bytecode compiler and bytecode-to-native compiler need to be modified Each class has a second transactional variant of each method Bytecode-to-native compiler inserts STMValidate calls to detect looping Volatile fields are translated as small transactions 15CS258 - Parallel Computer Architecture6/25/2015

Experimental results 16CS258 - Parallel Computer Architecture6/25/2015

Results (contd.) 17CS258 - Parallel Computer Architecture6/25/2015

Results – Wait test 18CS258 - Parallel Computer Architecture6/25/2015

Conclusion A complete STM implementation in Java is provided STM implementation is competitive with or better than other locking approaches STM scales better than locks Comparison done with other locking approaches - comparison with a single threaded non-locking algorithm is missing 19CS258 - Parallel Computer Architecture6/25/2015

Are Transactions the solution? Transactions offer several advantages – Easier to write than locks – Can offer good performance – Is declarative rather than imperative – Code composability is better – Can be implemented in hardware or software – Can solve priority-inversion and deadlocks that occur from lock usage 20CS258 - Parallel Computer Architecture6/25/2015