Transactions are back. But are they the same? R. Guerraoui, EPFL
- Le retour de Martin Guerre (The Return of Martin Guerre, remade as Sommersby) -
From the New York Times, San Francisco, May 7, 2004. Intel announces a change in its business strategy: « Multicore is THE way to boost performance »
The free ride is over. Everyone will need to fork threads.
Forking threads is easy. Handling their conflicts is hard.
Coarse-grained locks => slow. Fine-grained locks => errors. (Most Java bugs are due to misuse of the word « synchronized ».)
Lock-free computing? Every lock-free data structure ≈ one PODC/DISC/SPAA paper.
How can we simply state that a certain code needs to appear atomic without using a big lock?
Transactions: the Consistency Contract (ACID). C versus A-I-D.
Historical perspective: Eswaran et al. (CACM'76), databases; Papadimitriou (JACM'79), theory; Liskov/Scheifler (TOPLAS'82), language; Knight (ICFP'86), architecture; Herlihy/Moss (ISCA'93), hardware; Shavit/Touitou (PODC'95), software.
Simple example (consistency invariant): 0 < x < y
Simple example (transaction): T: x := x+1; y := y+1
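A minimal sketch of the coarse-grained approach mentioned above, applied to this running example: the transaction T appears atomic only because every access to x and y funnels through one big lock. The names (`increment_both`, the thread count) are mine, for illustration only.

```python
import threading

# Shared state from the running example; invariant: 0 < x < y.
lock = threading.Lock()   # the one "big lock"
x, y = 1, 2

def increment_both():
    """T: x := x+1; y := y+1, made atomic by the coarse-grained lock."""
    global x, y
    with lock:
        x += 1
        y += 1

threads = [threading.Thread(target=increment_both) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert 0 < x < y    # the invariant survives all interleavings
print(x, y)         # prints 11 12
```

The price of this simplicity is that all transactions serialize on the single lock, which is exactly the "coarse-grained locks => slow" problem.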
Consistency contract. You provide: « atomicity » (A-I-D). Grandma provides: « consistency » (C).
The underlying theory (P’79) A history H is atomic if the restriction of H to its committed transactions is serializable
A history H of committed transactions is serializable if there is a history S(H) that is (1) equivalent to H, (2) sequential, and (3) legal.
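The definition above can be turned into a brute-force check on toy histories. This is my own illustrative encoding (event tuples, a single variable x initialized to 0), not a standard tool: a history is serializable if some ordering of its transactions, each replayed whole and in its own order, is legal (every read returns the last written value).

```python
from itertools import permutations

# Event = (transaction, operation, variable, value); x starts at 0.
H = [
    ("T1", "w", "x", 1),   # T1 writes x := 1
    ("T2", "r", "x", 1),   # T2 reads x and sees 1
    ("T2", "w", "x", 2),   # T2 writes x := 2
]

def legal(seq):
    """(3) legal: every read returns the most recently written value."""
    mem = {"x": 0}
    for _, op, var, val in seq:
        if op == "r" and mem[var] != val:
            return False
        if op == "w":
            mem[var] = val
    return True

def serializable(history):
    txns = sorted({e[0] for e in history})
    for order in permutations(txns):
        # (1) equivalent: each transaction keeps its own events in order;
        # (2) sequential: transactions run one after the other.
        s = [e for t in order for e in history if e[0] == t]
        if legal(s):
            return True
    return False

print(serializable(H))   # prints True: T1 then T2 is legal
```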
This is all fine. But this is not new. Why should we care? Because we want jobs.
Transactions are indeed back. But are they really the same? How can we figure that out?
Ask system people. System people know. « Those who know don't need to think » (Iggy Pop)
Simple algorithm (DSTM). To write an object O, a transaction acquires O and aborts the transaction that owned O before. To read an object, a transaction T takes a snapshot to check whether the system has changed since T's last reads; if it has, T is aborted.
Simple algorithm (DSTM): killer write (ownership); careful read (validation).
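The two DSTM rules can be sketched as follows. This is a single-threaded toy simulation in the spirit of the slide, not DSTM's actual API: the class and method names are mine, and all concurrency is simulated by interleaving calls by hand.

```python
class Aborted(Exception):
    pass

class TObject:
    def __init__(self, value):
        self.value = value
        self.owner = None     # transaction currently writing the object
        self.version = 0      # bumped on every write

class Transaction:
    def __init__(self):
        self.live = True
        self.read_set = {}    # object -> version seen at read time

    def write(self, obj, value):
        # killer write: acquire ownership, abort the previous owner
        if obj.owner is not None and obj.owner is not self:
            obj.owner.live = False
        obj.owner = self
        obj.value = value
        obj.version += 1

    def read(self, obj):
        # careful read: validate all past reads before returning
        for o, v in self.read_set.items():
            if o.version != v:
                self.live = False
                raise Aborted()
        self.read_set[obj] = obj.version
        return obj.value

x = TObject(1)
t1, t2 = Transaction(), Transaction()
t1.read(x)        # T1 reads x at version 0
t2.write(x, 5)    # T2's killer write bumps x's version
try:
    t1.read(x)    # T1's careful read now fails validation
except Aborted:
    print("T1 aborted")   # prints T1 aborted
```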
More efficient algorithm: apologizing versus asking permission. Killer write; optimistic read; validity check at commit time.
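The "apologize later" variant can be sketched by moving validation out of the read path and into commit. Again a hypothetical single-threaded illustration (the names `Obj`, `OptimisticTxn` are mine), not any particular system's API:

```python
class Obj:
    def __init__(self, value):
        self.value = value
        self.version = 0

class OptimisticTxn:
    def __init__(self):
        self.read_set = {}            # obj -> version seen at read time

    def read(self, obj):
        self.read_set[obj] = obj.version   # optimistic: no check here
        return obj.value

    def write(self, obj, value):
        obj.value = value             # killer write, as before
        obj.version += 1

    def commit(self):
        # apologize later: a single validity check at commit time
        return all(obj.version == v for obj, v in self.read_set.items())

x = Obj(1)
reader, writer = OptimisticTxn(), OptimisticTxn()
reader.read(x)          # reads version 0, no validation
writer.write(x, 5)      # concurrent update bumps the version to 1
print(reader.commit())  # prints False: commit-time validation fails
```

Reads become cheap, but as the next slides show, a doomed transaction can now run for a while on an inconsistent view before commit catches it.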
Am I smarter than a system guy? No way
Back to the simple example. Invariant: 0 < x < y. Initially: x := 1; y := 2.
Division by zero. T1: x := x+1; y := y+1. T2: z := 1 / (y - x).
Infinite loop. T1: x := 3; y := 6. T3: a := y; b := x; repeat b := b + 1 until a = b.
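The division-by-zero race can be replayed concretely. In this hand-interleaved sketch (my own illustration of the T1/T2 slide), T2 reads y, T1 then commits both updates, and T2 reads the new x: T2 now holds a snapshot (y = 2, x = 2) that no serial execution produces, even though every value it read was, at some instant, the committed value.

```python
x, y = 1, 2           # invariant 0 < x < y holds initially

t2_y = y              # T2 reads y = 2
x, y = x + 1, y + 1   # T1 commits: x = 2, y = 3
t2_x = x              # T2 reads x = 2, inconsistent with its earlier y

try:
    z = 1 / (t2_y - t2_x)          # T2: z := 1 / (y - x)
except ZeroDivisionError:
    print("T2 divides by zero on an inconsistent snapshot")
```

T2 will eventually abort at commit time, but the damage (a crash, or the infinite loop of T3) happens while it is still live.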
System people care about live transactions. The theoreticians didn't.
The old theory: a history is atomic if its restriction to committed transactions is serializable. We need a theory that talks about ALL transactions.
A new theory: opacity (GK'06). A history H is opaque if, for every transaction T in H, there is a serializable history in committed(T, H).
So what? Ask system people
Simple algorithm (DSTM): careful read (validation); killer write (ownership).
Visible vs. invisible read (SXM; RSTM). Visible but not-so-careful read: when a transaction reads an object, it says so. Mega-killer write: to write an object, a transaction aborts any live one that has read or written the object.
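The visible-read scheme can be sketched by giving each object a reader list. This is a toy single-threaded simulation in the spirit of the slide (the names `VObject`, `VTxn` are mine), not SXM's or RSTM's actual design:

```python
class VObject:
    def __init__(self, value):
        self.value = value
        self.readers = set()   # visible reads: who has read me
        self.writer = None

class VTxn:
    def __init__(self):
        self.live = True

    def read(self, obj):
        obj.readers.add(self)  # announce the read; no validation needed
        return obj.value

    def write(self, obj, value):
        # mega-killer write: abort every other reader and the writer
        for t in obj.readers:
            if t is not self:
                t.live = False
        if obj.writer is not None and obj.writer is not self:
            obj.writer.live = False
        obj.writer = self
        obj.value = value

x = VObject(1)
r, w = VTxn(), VTxn()
r.read(x)               # r is now visible on x
w.write(x, 2)           # the write kills the reader
print(r.live, w.live)   # prints False True
```

Readers never see an inconsistent state, because any conflicting write aborts them on the spot; the cost is that every read updates shared metadata.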
Conjecture: either the read has to be visible or it has to be careful. Wrong.
Giving up progress (TL2). To write an object, a transaction acquires it and writes its timestamp. To read an object O, the transaction aborts itself if O was written by a transaction with a higher timestamp.
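The timestamp rule can be sketched as follows. This is an illustrative simulation in the spirit of the slide, not TL2's actual implementation (TL2 uses a global version clock and per-location versioned locks; the class and counter here are my simplification):

```python
class Aborted(Exception):
    pass

clock = 0   # global timestamp counter (simplified for illustration)

class TSTxn:
    def __init__(self):
        global clock
        clock += 1
        self.ts = clock          # birth timestamp

    def write(self, obj, value):
        obj["ts"] = self.ts      # acquire the object and stamp it
        obj["value"] = value

    def read(self, obj):
        if obj["ts"] > self.ts:  # written by a later transaction:
            raise Aborted()      # the reader aborts ITSELF
        return obj["value"]

o = {"ts": 0, "value": 1}
early = TSTxn()      # ts = 1
late = TSTxn()       # ts = 2
late.write(o, 9)     # o now carries timestamp 2
try:
    early.read(o)    # 2 > 1: the earlier transaction gives up
except Aborted:
    print("early transaction aborted")   # prints early transaction aborted
```

Reads stay invisible and cheap, but progress is sacrificed: the reader kills itself instead of the conflicting writer.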
Conjecture: visible read vs. validation vs. progress.
Theorem (GK06): progress with invisible reads requires Ω(k) steps.
Theorem: visible read vs. validation vs. progress.
The theorem does not hold for classical atomicity, i.e., it does not hold for database transactions.
More… Theorem (GK07): progress cannot be ensured with disjoint-access parallelism.
Transparent read? (DSTM) To read an object, a transaction T takes a snapshot to see if the system is still in the same state; else T is aborted (or waits).
Yet another theorem Theorem (GK07): progress cannot be ensured with transparent reads
So far, progress applied to solo transactions (i.e., solo progress). Some might never commit. Can we ensure that all transactions eventually commit?
Yet another theorem (GKK06): solo progress and eventual progress are incompatible.
Contention management: if a transaction T wants to write an object O owned by another transaction, it calls a contention manager (various strategies).
System perspective (Scherer and Scott [CSJP 04]): exponential backoff; "Karma" (the transaction with the most work accomplished wins); various priority-inheritance schemes; … Some work well, but we can't prove anything!
Greedy contention manager. State: a priority (based on start time) and a waiting flag (set while waiting). Rule: wait if the other transaction has higher priority AND is not waiting; abort the other if it has lower priority OR is waiting.
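The Greedy rule fits in a few lines. A sketch under my own naming assumptions (`GreedyTxn`, `resolve`), where a smaller start time means higher priority:

```python
class GreedyTxn:
    def __init__(self, start_time):
        self.start = start_time    # smaller start = higher priority
        self.waiting = False       # set while the transaction waits
        self.live = True

def resolve(me, other):
    """Apply the Greedy policy on a conflict; return the decision."""
    other_higher = other.start < me.start
    if other_higher and not other.waiting:
        me.waiting = True          # other has higher priority AND
        return "wait"              # is not waiting: I wait
    other.live = False             # lower priority OR waiting: abort it
    return "abort-other"

old, young = GreedyTxn(1), GreedyTxn(2)
print(resolve(young, old))   # prints wait: old is older and not waiting
print(resolve(old, young))   # prints abort-other: young is now waiting
```

Because priority comes from start times and a waiter can always be aborted, no cycle of transactions can block each other forever, which is what makes the policy amenable to the competitive analysis on the next slides.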
Preliminary result. Compare the time to complete a transaction schedule for: the ideal off-line scheduler, which knows transactions, conflicts, and start times in advance; and the Greedy contention manager, which does not know anything.
Competitive ratio. Let s be the number of objects accessed by all transactions, and compare the time to commit all transactions. Greedy is O(s)-competitive with the off-line adversary. GHP'05: O(s²); AEST'06: O(s).
Many, many open problems. What programming language? LS'83, GCLR'93, KG'03, … What hardware support? K'86, HM'93, … What software implementation? ST'95, HMLS'03, RFF'06, … What benchmark? KGV'06, … What theory? … What algorithms? …
The topic is VERY HOT. http://www.cs.wisc.edu/trans-memory/biblio/index.html Sun, Intel, IBM, EU (VELOX). ISCA, OOPSLA, PODC, DISC, POPL, Transact. What about SPAA and Euro-Par?
The one slide to remember: transactions are conquering the parallel programming world. They look simple and familiar, and thus make the programmer happy. They are in fact very sophisticated, and thus should make YOU happy.
Classical database transactions: User 1 and User 2 → transaction server → database (on disk).
In-memory transactions: User 1 and User 2 → transaction server + database, on a fast processor.
Shared memory transactions: Threads 1-4 → transactional language over shared memory → Processors 1-3.