Transactional Memory R. Guerraoui, EPFL. Locking is ’’history’’ Lock-freedom is difficult.

Slides:



Advertisements
Similar presentations
Maurice Herlihy (DEC), J. Eliot & B. Moss (UMass)
Advertisements

1 Lecture 20: Synchronization & Consistency Topics: synchronization, consistency models (Sections )
Synchronization. How to synchronize processes? – Need to protect access to shared data to avoid problems like race conditions – Typical example: Updating.
1 Chapter 4 Synchronization Algorithms and Concurrent Programming Gadi Taubenfeld © 2014 Synchronization Algorithms and Concurrent Programming Synchronization.
Concurrency Control II
CS492B Analysis of Concurrent Programs Lock Basics Jaehyuk Huh Computer Science, KAIST.
Process Synchronization Continued 7.2 The Critical-Section Problem.
5.1 Silberschatz, Galvin and Gagne ©2009 Operating System Concepts with Java – 8 th Edition Chapter 5: CPU Scheduling.
Parallel Processing (CS526) Spring 2012(Week 6).  A parallel algorithm is a group of partitioned tasks that work with each other to solve a large problem.
Hybrid Transactional Memory Nir Shavit MIT and Tel-Aviv University Joint work with Alex Matveev (and describing the work of many in this summer school)
Transactional Memory Overview Olatunji Ruwase Fall 2007 Oct
Transactions are back But are they the same? R. Guerraoui, EPFL.
Idit Keidar and Dmitri Perelman Technion 1 SPAA 2009.
Transactional Memory Yujia Jin. Lock and Problems Lock is commonly used with shared data Priority Inversion –Lower priority process hold a lock needed.
1 Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory.
1 Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, “lazy” implementation.
1 Lecture 23: Transactional Memory Topics: consistency model recap, introduction to transactional memory.
1 Lecture 21: Synchronization Topics: lock implementations (Sections )
Language Support for Lightweight transactions Tim Harris & Keir Fraser Presented by Narayanan Sundaram 04/28/2008.
Christopher J. Rossbach, Owen S. Hofmann, Donald E. Porter, Hany E. Ramadan, Aditya Bhandari, and Emmett Witchel - Presentation By Sathish P.
CS510 Concurrent Systems Class 5 Threads Cannot Be Implemented As a Library.
System Catalogue v Stores data that describes each database v meta-data: – conceptual, logical, physical schema – mapping between schemata – info for query.
Memory Consistency Models Some material borrowed from Sarita Adve’s (UIUC) tutorial on memory consistency models.
1 © R. Guerraoui Seth Gilbert Professor: Rachid Guerraoui Assistants: M. Kapalka and A. Dragojevic Distributed Programming Laboratory.
An Introduction to Software Transactional Memory
Transactional Memory CDA6159. Outline Introduction Paper 1: Architectural Support for Lock-Free Data Structures (Maurice Herlihy, ISCA ‘93) Paper 2: Transactional.
How Good can a Transactional Memory be? R. Guerraoui, EPFL.
1 © R. Guerraoui Concurrent Algorithms (Overview) Prof R. Guerraoui Distributed Programming Laboratory.
CS5204 – Operating Systems Transactional Memory Part 2: Software-Based Approaches.
1 Transactions Chapter Transactions A transaction is: a logical unit of work a sequence of steps to accomplish a single task Can have multiple.
Optimistic Design 1. Guarded Methods Do something based on the fact that one or more objects have particular states  Make a set of purchases assuming.
1 Lecture: Large Caches, Virtual Memory Topics: cache innovations (Sections 2.4, B.4, B.5)
11/18/20151 Operating Systems Design (CS 423) Elsa L Gunter 2112 SC, UIUC Based on slides by Roy Campbell, Sam.
On the Performance of Window-Based Contention Managers for Transactional Memory Gokarna Sharma and Costas Busch Louisiana State University.
Transactional Memory Lecturer: Danny Hendler. 2 2 From the New York Times…
Shared Memory Consistency Models. SMP systems support shared memory abstraction: all processors see the whole memory and can perform memory operations.
Memory Consistency Models. Outline Review of multi-threaded program execution on uniprocessor Need for memory consistency models Sequential consistency.
Chapter 6 – Process Synchronisation (Pgs 225 – 267)
DOUBLE INSTANCE LOCKING A concurrency pattern with Lock-Free read operations Pedro Ramalhete Andreia Correia November 2013.
Memory Consistency Zhonghai Lu Outline Introduction What is a memory consistency model? Who should care? Memory consistency models Strict.
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals Presented by Abdulai Sei.
Transaction Management Overview. Transactions Concurrent execution of user programs is essential for good DBMS performance. – Because disk accesses are.
ICFEM 2002, Shanghai Reasoning about Hardware and Software Memory Models Abhik Roychoudhury School of Computing National University of Singapore.
A Methodology for Implementing Highly Concurrent Data Objects by Maurice Herlihy Slides by Vincent Rayappa.
SYNAR Systems Networking and Architecture Group CMPT 886: The Art of Scalable Synchronization Dr. Alexandra Fedorova School of Computing Science SFU.
CGS 3763 Operating Systems Concepts Spring 2013 Dan C. Marinescu Office: HEC 304 Office hours: M-Wd 11: :30 AM.
1 Lecture 4: Transaction Serialization and Concurrency Control Advanced Databases CG096 Nick Rossiter [Emma-Jane Phillips-Tait]
1 Lecture 20: Speculation Papers: Is SC+ILP=RC?, Purdue, ISCA’99 Coherence Decoupling: Making Use of Incoherence, Wisconsin, ASPLOS’04.
Parallel Data Structures. Story so far Wirth’s motto –Algorithm + Data structure = Program So far, we have studied –parallelism in regular and irregular.
Lecture 20: Consistency Models, TM
Maurice Herlihy and J. Eliot B. Moss,  ISCA '93
Part 2: Software-Based Approaches
PHyTM: Persistent Hybrid Transactional Memory
Memory Consistency Models
Lecture 19: Coherence and Synchronization
Faster Data Structures in Transactional Memory using Three Paths
Memory Consistency Models
Challenges in Concurrent Computing
Designing Parallel Algorithms (Synchronization)
Transactions are back But are they the same? R. Guerraoui , EPFL
Lecture 6: Transactions
Lecture 21: Synchronization and Consistency
Part 1: Concepts and Hardware- Based Approaches
Lecture 22: Consistency Models, TM
Lecture: Coherence and Synchronization
CSE 153 Design of Operating Systems Winter 19
Lecture: Coherence and Synchronization
Lecture: Consistency Models, TM
Lecture 18: Coherence and Synchronization
CSE 542: Operating Systems
Presentation transcript:

Transactional Memory R. Guerraoui, EPFL

Locking is ’’history’’ Lock-freedom is difficult

Wanted A concurrency control abstraction that is simple, robust and efficient

Transactions

Historical perspective Eswaran et al (CACM’76) Databases Papadimitriou (JACM’79) Theory Liskov/Sheifler (TOPLAS’82) Language Knight (ICFP’86) Architecture Herlihy/Moss (ISCA’93) Hardware Shavit/Touitou (PODC’95) Software Herlihy et al (PODC’03) Software – Dynamic

accessing object 1; accessing object 2; Back to the sequential level

accessing object 1; accessing object 2; Back to the sequential level atomic { }

Semantics Every transaction appears to execute at an indivisible point in time

Double-ended queue Enqueue Dequeue

class Queue { QNode head; QNode tail; public enq(Object x) { atomic { QNode q = new QNode(x); q.next = head; head = q; }... }

Queue composition Dequeue Enqueue

class Queue { … public transfer(Queue q) { atomic { Qnode n = this.dequeue(); q.enqueue(n) } }... }

Simple example (consistency invariant) 0 < x < y

T: x := x+1 ; y:= y+1 Simple example (transaction)

accessing object 1; accessing object 2; The illusion of a critical section atomic { }

“It is better for Intel to get involved in this [Transactional Memory] now so when we get to the point of having …tons… of cores we will have the answers” Justin Rattner, Intel Chief Technology Officer

“…we need to explore new techniques like transactional memory that will allow us to get the full benefit of all those transistors and map that into higher and higher performance.” Bill Gates, Businessman

“…manual synchronization is intractable…transactions are the only plausible solution….” Tim Sweeney, Epic Games

Sun/Oracle, Intel, AMD, IBM, MSR Fortress (Sun); X10 (IBM); Chapel (Cray) The TM Topic is VERY HOT

begin() returns ok read() returns a value or abort write() returns an ok or abort commit() returns ok or abort abort() returns ok The TM API (a simple view)

Two-phase locking To write or read O, T requires a lock on O; T waits if some T’ acquired a lock on O At the end, T releases all its locks

Two-phase locking (more details) Every object O, with state s(O) (a register), is protected by a lock l(O) (a c&s) Every transaction has local variables wSet and wLog Initially: l(O) = unlocked, wSet = wLog = 

Two-phase locking Upon op = read() or write(v) on object O if O wSet then wait until unlocked= l(O).c&s(unlocked,locked) wSet = wSet U O wLog = wLog U S(O).read() if op = read() then return S(O).read() S(O).write(v) return ok

Two-phase locking (cont’d) Upon commit() cleanup() return ok Upon abort() rollback() cleanup() return ok

Two-phase locking (cont’d) Upon rollback() for all O  wSet do S(O).write(wLog(O)) wLog =  Upon cleanup() for all O  wSet do l(O).write(unlocked) wSet = 

Why two phases? (what if?) To write or read O, T requires a lock on O; T waits if some T’ acquired a lock on O T releases the lock on O when it is done with O

Why two phases? T1 T2 read(0)write(1) O1 O2 read(0)write(1) O2O1

Two-phase locking (read-write lock) To write O, T requires a write-lock on O; T waits if some T’ acquired a lock on O To read O, T requires a read-lock on O; T waits if some T’ acquired a write-lock on O Before committing, T releases all its locks

Two-phase locking - better dead than wait - To write O, T requires a write-lock on O; T aborts if some T’ acquired a lock on O To read O, T requires a read-lock on O; T aborts if some T’ acquired a write-lock on O Before committing, T releases all its locks A transaction that aborts restarts again

Two-phase locking - better kill than wait - To write O, T requires a write-lock on O; T aborts T’ if some T’ acquired a lock on O To read O, T requires a read-lock on O; T aborts T’ if some T’ acquired a write-lock on O Before committing, T releases all its locks A transaction that is aborted restarts again

Two-phase locking - better kill than wait - To write O, T requires a write-lock on O; T aborts T’ if some T’ acquired a lock on O To read O, T requires a read-lock on O; T waits if some T’ acquired a write-lock on O Before committing, T releases all its locks A transaction that is aborted restarts again

Visible Read (SXM, RSTM, TLRW) Write is mega killer: to write an object, a transaction aborts any live one which has read or written the object Visible but not so careful read: when a transaction reads an object, it says so

Visible Read A visible read invalidates cache lines For read-dominated workloads, this means a lot of traffic on the bus between processors - This reduces the throughput - Not a big deal with single-CPU, but with many core machines (e.g. SPART T2 Niagara)

Two-phase locking with invisible reads To write O, T requires a write-lock on O; T waits if some T’ acquired a write-lock on O To read O, T checks if all objects read remain valid - else T aborts Before committing, T checks if all objects read remain valid and releases all its locks

Invisible reads (more details) Every object O, with state s(O) (register), is protected by a lock l(O) (c&s) Every transaction maintains, besides wSet and wLog: - a local variable rset(O) for every object

Invisible reads Upon write(v) on object O if O wSet then wait until unlocked= l(O).c&s(unlocked,locked) wSet = wSet U O wLog = wLog U S(O).read() (*,ts) = S(O).read() S(O).write(v,ts) return ok

Invisible reads Upon read() on object O (v,ts) = S(O).read() if O  wSet then return v if l(O) = locked or not validate() then abort() if rset(O) = 0 then rset(O) = ts return v

Invisible reads Upon validate() for all O s.t rset(O) > 0 do (v,ts) = S(O).read() if ts ≠ rset(O) or (O wset and l(O) = locked) then return false return true

Invisible reads Upon commit() if not validate() then abort() for all O  wset do (v,ts) = S(O).read() S(O).write(v,ts+1) cleanup()

Invisible reads Upon rollback() for all O  wSet do S(O).write(wLog(O)) wLog =  Upon cleanup() for all O  wset do l(O).write(unlocked) wset =  rset(O) = 0 for all O

DSTM (SUN) To write O, T requires a write-lock on O; T aborts T’ if some T’ acquired a write-lock on O To read O, T checks if all objects read remain valid - else abort Before committing, T releases all its locks

DSTM Killer write (ownership) Careful read (validation)

More efficient algorithm? Apologizing versus asking permission Killer write Optimistic read validity check only at commit time

Example Invariant: 0 < x < y Initially: x := 1; y := 2

Division by zero T1: x := x+1 ; y:= y+1 T2: z := 1 / (y - x)

T1: x := 3; y:= 6 Infinite loop T2: a := y; b:= x; repeat b:= b + 1 until a = b

Opacity Serializability Consistent memory view

Trade-off The read is either visible or careful

Intuition T1 T2 read() write() commit I1,I2,..,Im O1,O2,..,On read() Ik

Read invisibility The fact that the read is invisible means T1 cannot inform T2, which would in turn abort T1 if it accessed similar objects (SXM, RSTM) NB. Another way out is the use of multiversions: T2 would not have written “on” T1

Conditional progress - obstruction-freedom - A correct transaction that eventually does not encounter contention eventually commits Obstruction-freedom seems reasonable and is indeed possible

DSTM To write O, T requires a write-lock on O (use C&S); T aborts T’ if some T’ acquired a write-lock on O (use C&S) To read O, T checks if all objects read remain valid - else abort (use C&S) Before committing, T releases all its locks (use C&S)

If a transaction T wants to write an object O owned by another transaction T’, T calls a contention manager The contention manager can decide to wait, retry or abort T’ Progress

Contention managers Aggressive: always aborts the victim Backoff: wait for some time (exponential backoff) and then abort the victim Karma: priority = cumulative number of shared objects accessed – work estimate. Abort the victim when number of retries exceeds difference in priorities. Polka: Karma + backoff waiting

Greedy contention manager State Priority (based on start time) Waiting flag (set while waiting) Wait if other has Higher priority AND not waiting Abort other if Lower priority OR waiting

T1 T2 read() write() commit O1 write() O2 Aborting is a fatality read() O2 abort

TM does not always replace locks: it hides them Memory transactions look like db transactions but are different Concluding remarks

The garbage-collection analogy In the early times, the programmers had to take care of allocating and de-allocating memory Garbage collectors do it for you: they are now incorporated in Java and other languages Hardware support was initially expected, but now software solutions are very effective