
1 Inherent limitations facilitate design & verification of concurrent programs. Hagit Attiya, Technion.

2-3 Concurrent Programs. The core challenge is synchronization: getting it right is hard, and making it efficient is even harder. Goal: a principled, automatic approach.

4 EXAMPLE I: VERIFYING LOCKING PROTOCOLS. Work with Ramalingam and Rinetzky (POPL 2010).

5 The Goal: Sequential Reductions. Verify concurrent data structures by pre-execution static analysis, e.g., a linked list with hand-over-hand locking: no memory leaks, shape (it's a list), serializability. Find sequential reductions: consider only sequential executions, but conclude that the properties hold in all executions.

6 Back-of-envelope estimate of the gain. Static analysis of a linked-list algorithm [Amit, Rinetzky, Reps, Sagiv, Yahav, CAV 2007], which verifies e.g. memory safety, sortedness, pointed-to by a variable, heap sharing:
One thread (sequential): 10 s, 3.6 MB
Two threads (interleaved): ~4 h, 886 MB
Three threads (interleaved): > 8 h (did not complete)

7 Serializability [Papadimitriou '79]. (Figure: each operation of an interleaved execution is matched, by ~, with an operation of a complete non-interleaved execution, as observed by the threads locally.)

8 Serializability gives a sequential reduction. The complete non-interleaved executions of concurrent code M are a small subset of all its executions. If M is serializable, then a local property φ holds in all executions of M iff φ holds in all complete non-interleaved executions. Easily derived from [Papadimitriou '79].

9 How do we know that M is serializable, without considering all executions?

10 Special (and common) case: disciplined programming with locks. Guard access to data with locks (lock & unlock); only one process holds a given lock at any time. Follow a locking protocol that guarantees conflict-serializability, e.g., two-phase locking (2PL) or tree locking (TL).

11 Two-phase locking [Papadimitriou '79]: a lock-acquire (grow) phase followed by a lock-release (shrink) phase, so no lock is acquired after some lock has been released. A minimal code sketch follows.
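A minimal C sketch of the grow/shrink discipline, using POSIX mutexes and a hypothetical account record; this is an illustration only, not code from the paper (and it ignores deadlock avoidance, which 2PL by itself does not provide):

#include <pthread.h>

/* Hypothetical record guarded by its own lock. */
typedef struct { pthread_mutex_t lock; int balance; } account_t;

/* Two-phase locked transfer: every acquire (grow phase) happens
   before any release (shrink phase), so no lock is taken after
   some lock has already been given up. */
void transfer(account_t *from, account_t *to, int amount) {
    pthread_mutex_lock(&from->lock);      /* grow */
    pthread_mutex_lock(&to->lock);        /* grow */
    from->balance -= amount;
    to->balance   += amount;
    pthread_mutex_unlock(&to->lock);      /* shrink */
    pthread_mutex_unlock(&from->lock);    /* shrink */
}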

12-13 Tree (hand-over-hand) locking [Kedem & Silberschatz '76] [Smadi '76] [Bayer & Schkolnick '77]: except for the first lock, acquire a lock only while holding the lock on its parent, and never re-acquire a lock after releasing it. A lock-coupling sketch follows.
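A minimal C sketch of hand-over-hand (lock-coupling) traversal over a singly linked list with a sentinel head node; node_t and contains are hypothetical names, not code from the analyzed benchmarks:

#include <pthread.h>
#include <stddef.h>

/* Hypothetical list node; each node carries its own lock. */
typedef struct node {
    pthread_mutex_t lock;
    int key;
    struct node *next;
} node_t;

/* Lock coupling: a node is locked only while its parent is still
   held, and a released lock is never re-acquired by this operation. */
int contains(node_t *head, int key) {
    pthread_mutex_lock(&head->lock);       /* first lock: the sentinel */
    node_t *prev = head;
    node_t *curr = head->next;
    while (curr != NULL) {
        pthread_mutex_lock(&curr->lock);   /* child locked while parent held */
        pthread_mutex_unlock(&prev->lock); /* now release the parent */
        if (curr->key == key) {
            pthread_mutex_unlock(&curr->lock);
            return 1;
        }
        prev = curr;
        curr = curr->next;
    }
    pthread_mutex_unlock(&prev->lock);
    return 0;
}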

14 Can we tell that M is serializable without considering all executions? Yes, for databases: a concurrency-control monitor ensures at run time that M follows the locking policy, so M is serializable. No, for code analysis: there is no central monitor. Example (not two-phase locked, but only in interleaved executions):
void p() { acquire(B); B = 0; release(B); int b = B; if (b) acquire(A); }
void q() { acquire(B); B = 1; release(B); }

15 Our goal: statically verify that M follows a locking policy. We target local conflict-serializable locking protocols, ones that depend only on a thread's local variables and the global variables it has locked, e.g., two-phase locking, tree locking, (dynamic) DAG locking, but not protocols that rely on a centralized concurrency-control monitor!

16 Thread-local properties can be expressed as properties of thread-local variables, e.g., no two processes are inside the critical section simultaneously. A thread-local property of an execution holds in every execution indistinguishable from it.

17 Our contribution, easy step: restrict attention to the complete non-interleaved executions of M. A local conflict-serializable locking policy (e.g., two-phase locking, tree locking, dynamic tree locking) is respected in all executions iff it is respected in all non-interleaved executions; a thread-local property holds in all executions iff it holds in all non-interleaved executions.

18 Our contribution, easy step (proof note): the reduction to complete non-interleaved executions of M is proved by considering a shortest execution violating the protocol, plus an indistinguishability argument.

19-21 Reduction to non-interleaved executions, proof idea: let σ be the shortest execution that does not follow LP, so σ = σ' (t,e) where the prefix σ' follows LP. Since LP guarantees conflict-serializability, there is a non-interleaved execution "equivalent" to σ', and appending (t,e) to it gives a non-interleaved execution "similar" to σ in which LP is violated.

22 Ni-reduction, proof sketch: there is a ni-execution σ'_ni that is "equivalent" to σ', hence there is a ni-execution that is "equivalent" to σ and in which LP is violated.

23 Ni-reduction, proof sketch (details): there is a ni-execution σ'_ni with the same conflicts as σ', and t can execute e also after σ'_ni. Write σ'_ni = σ₁ σₜ σ₂, where σₜ is the sub-execution by thread t; then t can execute e also after σ₁ σₜ. Now σ₁ σₜ (t,e) is a ni-execution and it follows the locking protocol. Since σ₁ σₜ (t,e) and σ' (t,e) = σ are conflict-equivalent, σ follows the locking protocol, a contradiction.

24 Further reduction, to the almost-complete non-interleaved executions of M: a local conflict-serializable locking policy is respected in all executions iff it is respected in all almost-complete non-interleaved executions.

25 Further reduction, a complication: we need to argue about termination.
int X = 0, Y = 0;
void p() { acquire(Y); int y = Y; release(Y); if (y != 0) { acquire(X); X = 3; release(X); } }
void q() { if (random(5) == 3) { acquire(Y); Y = 1; release(Y); while (true) nop; } }
Here q may set Y to 1 and then enter an infinite loop; p can then observe Y == 1 and violate 2PL by acquiring X after releasing Y. This cannot happen in complete non-interleaved executions, since q never runs to completion once it takes that branch.

26 Further reduction, termination: the sequential reduction itself can be used to verify termination. A terminating, local conflict-serializable locking policy is respected in all executions iff it is respected in all almost-complete non-interleaved executions.

27 Acni-reduction, proof ideas: start from a ni-execution (relying on the previous ni-reduction to get there) and create an equivalent completion that does not access variables accessed by later threads. This is not always possible, e.g., t₁:lock(v), t₁:lock(u), t₂:lock(u).

28 Implications for static analysis. Pessimistic analysis (over-approximate): analyze a module from every possible state. Semi-optimistic analysis: analyze a module only from states that occur after a sequence of modules ran one after the other (not to completion). Optimistic analysis (precise): analyze a module only from states that occur after a sequence of modules ran to completion, one after the other. The ni-reduction and the acni-reduction justify the more optimistic analyses.

29 Initial analysis results: shape analysis of hand-over-hand linked lists (does not verify sortedness of the list, and fails to verify linearizability in some cases), and shape analysis of hand-over-hand trees (for the first time).

30 What's next? Extend to other serializability protocols: shared (read) locks; non-locking, non-conflict-based serializability (e.g., using timestamps); optimistic protocols; aborted / failed methods.

31 EXAMPLE II: REQUIRED MEMORY ORDERINGS Work with Guerraoui, Hendler, Kuznetsov, Michael and Vechev (POPL 2011)

32 Relaxed memory models: out-of-order execution of memory accesses, to compensate for slow writes; the hardware may issue reads before earlier writes when they access different locations. Reordering may lead to inconsistency. (Figure: two CPUs with caches connected to memory over an interconnect.)

33 Read-after-write (RAW) reordering. Process P: Write(X,1); Read(Y). Process Q: Write(Y,1); Read(X). If each read is issued before the other process's write becomes visible, both P and Q read the initial values, an outcome no interleaving of the original code allows.

34 Avoiding the out-of-order execution: a read-after-write (RAW) fence. Process P: Write(X,1); FENCE; Read(Y). Process Q: Write(Y,1); FENCE; Read(X). The fence keeps each read from being issued before the preceding write. A C11 sketch follows.
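A C11 sketch of the fenced pattern above, with a sequentially consistent fence standing in for the RAW fence; thread_P and thread_Q are illustrative names:

#include <stdatomic.h>

atomic_int X, Y;   /* both initially 0 */

/* Thread P: the write-then-read pattern from the slide.
   Without the fence, the read of Y may overtake the buffered write
   of X, so both threads can return 0 (store buffering).
   The fence orders the write before the subsequent read (RAW). */
int thread_P(void) {
    atomic_store_explicit(&X, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);      /* RAW fence */
    return atomic_load_explicit(&Y, memory_order_relaxed);
}

/* Thread Q is symmetric: Write(Y,1); FENCE; Read(X). */
int thread_Q(void) {
    atomic_store_explicit(&Y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&X, memory_order_relaxed);
}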

35 Avoiding the out-of-order execution: atomic operations, i.e., an atomic-write-after-read (AWAR) such as CAS, TAS, Fetch&Add, …: atomic { read(Y) … write(X,1) }. RAW fences / AWARs are roughly 60× slower than (remote) memory accesses. A CAS-based sketch follows.
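A C11 sketch of an AWAR: a fetch-and-add built from compare-and-swap, so the read of the old value and the write of the new one form one indivisible step (illustrative, not taken from the paper):

#include <stdatomic.h>

/* Atomic-write-after-read (AWAR): the CAS reads the current value and
   writes the new one atomically; on failure, `old` is refreshed and
   the loop retries. */
int fetch_and_add(atomic_int *counter, int delta) {
    int old = atomic_load(counter);
    while (!atomic_compare_exchange_weak(counter, &old, old + delta)) {
        /* another thread changed the counter in between; retry */
    }
    return old;
}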

36 Our result: any concurrent program in a certain class must use RAW fences / AWARs. The class covers concurrent data types (queues, counters, hash tables, trees, …) with non-commutative operations and serializable, solo-terminating implementations, as well as mutual exclusion.

37 Non-commutative operations: operation A is non-commutative if there is an operation B such that A influences B and B influences A.

38 Example: a queue. enq(v) adds v to the end of the queue; deq() takes an item from the head. With Q = [1, 2]: the histories Q.deq():1; Q.deq():2 and Q.deq():2; Q.deq():1 show that the two deq() operations influence each other. The histories Q.enq(3):ok; Q.deq():1 and Q.deq():1; Q.enq(3):ok show that this enq() and deq() commute, so enq() is not non-commutative here. A toy sketch follows.
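A toy, purely sequential C sketch that replays these histories; queue_t, enq and deq are hypothetical stand-ins for a real queue:

#include <stdio.h>

/* Toy bounded queue, just enough to replay the slide's histories. */
typedef struct { int buf[8]; int head, tail; } queue_t;
void enq(queue_t *q, int v) { q->buf[q->tail++] = v; }
int  deq(queue_t *q)        { return q->buf[q->head++]; }

int main(void) {
    /* Q = [1,2]: whichever deq() runs first returns 1 and the other
       returns 2, so each deq() changes the other's result. */
    queue_t q = {{1, 2}, 0, 2};
    int first  = deq(&q);               /* 1 */
    int second = deq(&q);               /* 2 */

    /* enq(3) and deq() on a non-empty queue commute:
       deq() returns 1 in either order. */
    queue_t qa = {{1, 2}, 0, 2}, qb = {{1, 2}, 0, 2};
    enq(&qa, 3);
    int after_enq  = deq(&qa);          /* enq;deq -> 1 */
    int before_enq = deq(&qb);          /* deq;enq -> 1 */
    enq(&qb, 3);

    printf("%d %d | %d %d\n", first, second, after_enq, before_enq);
    return 0;
}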

39 Proof intuition, writing: if an operation does not write to shared memory, it does not influence anyone, so it would be commutative. (Figure: two deq() operations with no shared write do not influence each other.)

40 Proof intuition, reading: if an operation does not read shared memory, it is not influenced by anyone, so it would be commutative. (Figure: two deq() operations with no shared read do not influence each other.)

41 Proof intuition, RAW. (Figure: a deq() whose write is not followed by a RAW can miss a concurrent deq(), leading to a serialization that contradicts the operations' mutual influence.)

42 Mutual exclusion (mutex): two processes never hold the lock at the same time; deadlock-freedom: if a process calls Lock(), then some process acquires the lock. Lock() operations do not "commute"! Hence every successful Lock() incurs a RAW / AWAR. A test-and-set sketch follows.
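A C11 sketch of a lock whose successful acquire goes through an AWAR, here a test-and-set via atomic_exchange; spinlock_t, lock and unlock are illustrative names:

#include <stdatomic.h>
#include <stdbool.h>

/* Test-and-set spinlock: the exchange reads the old flag and writes
   `true` in one indivisible step, i.e., an AWAR. The result above says
   a deadlock-free mutex cannot avoid paying for such an AWAR (or a
   RAW fence) on the successful acquire path. */
typedef struct { atomic_bool held; } spinlock_t;

void lock(spinlock_t *l) {
    while (atomic_exchange(&l->held, true)) {
        /* spin until the previous holder releases */
    }
}

void unlock(spinlock_t *l) {
    atomic_store(&l->held, false);
}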

43 Who should care? Concurrent programmers: know when it is futile to try to avoid expensive synchronization. Hardware designers: motivation to lower the cost of specific synchronization constructs. API designers: the choice of API affects synchronization. Verification engineers: declare code incorrect when synchronization is missing. "…although I hope that these shortcomings will be addressed, I hasten to add that they are insignificant compared to the huge step forward that this paper represents…." -- Paul McKenney, Linux Weekly News, Jan 26, 2011

44 What else? Weaker operations, e.g., idempotent work stealing. Other patterns: read-after-read, write-after-write, barriers, across-thread orders. The cost of verifying adherence to a locking policy. (Semi-)automatic insertion of lock acquire / release commands or fences.

45 And beyond… other theorems that allow us to "cut corners" when designing and verifying concurrent applications.

46 Thank you!

