A Compositional Method for Verifying Software Transactional Memory Implementations
Serdar Tasiran
Koç University, Istanbul, Turkey
Thanks: Rustan Leino, Tim Harris, Shaz Qadeer, Mike Barnett, Dave Detlefs, Mike Magruder, Yossi Levanoni,...

The STM Verification Problem
Software transactional memory (STM):
  –Code blocks marked “atomic”
  –STM implementation guarantees atomic, serialized, non-blocking execution and composability
Complexity shifted to STM implementer
  –Complicated, tricky code: conflict detection, rollback, ordering,...
TM will be used widely
  –Correctness as critical as the rest of the computing platform: runtime, compiler, processor,...
Real STM implementations are ~10K lines
  –Interact with the runtime, OS, garbage collector,...
Goal: Devise a modular, re-usable method for proving correctness
  –Mechanically check the most error-prone parts

Approach
At the algorithm level, STMs are well understood
  –Key ideas in the correctness proof are known
Algorithm level: Large-grain atomic actions
  –Field write, read
  –Transaction commit, roll back
Example: Bartok STM, a write-in-place, lazy-invalidate STM
Idea/earlier work:
  –Do the algorithm-level proof once
  –Boil it down to properties the STM must satisfy: EW, VR, CU (a sufficient condition for correctness)
  –Check whether the STM implementation satisfies EW, VR, CU: formulate each as a sequential assertion check, verify with Boogie

Not so simple!
Implementation-level executions ≠ Algorithm-level executions
Problem 1: More variables
  –STM implementation variables: logs, per-object metadata,...
Problem 2: Finer-grain, smaller actions ⇒ more interleavings
  –Atomic action: one instruction in an STM function implementation
Reasoning at the implementation level is more difficult
  –Serializability proofs are messy to write and check
Approach: Define and prove a correspondence between algorithm- and implementation-level executions ⇒ the algorithm-level proof carries over to implementation-level executions.

Proof Approach Implementation-level execution Algorithm-level execution satisfying EW +VR + CU

Proof Approach Implementation-level execution “Coarse-atomic” execution Abstract Read operations Verify NOWS, VRS properties “Coarse-atomic” execution with serial undo’s Merge chains of STM internal state transitions Insert marker actions: “commit”, “undoLastLogEntry” Algorithm-level execution satisfying EW +VR + CU

Intuition for proof steps
  –Abstract some actions
  –Prove non-interference
  –Commute
  –Prove larger blocks atomic

Outline
OTFJ: “Our” Transactional Featherweight Java
  –Algorithm-level and implementation-level semantics
Correctness
  –Pure serializability
  –Algorithm-level proof: distill to three required properties
Relating implementation and algorithm levels
  –What to abstract and verify at the implementation level?
Discussion

“Our” Transactional Featherweight Java (OTFJ): Syntax
  P ::= 0 | P|P | t[e]
  L ::= class C { f_1, f_2, ..., f_n; M_1, M_2, ..., M_k }
  M ::= m(x_1, x_2, ..., x_p) { e; }
  s ::= v | s.f | s.m(s_1, ..., s_n) | s.f := s | new C() | lbl: onacid; s; commit | null
  e ::= s | s; s | spawn s
  v ::= r | v.f_i | v.m(v_1, ..., v_n) | v.f_i = v

OTFJ: Algorithm-level semantics

Algorithm-Level OTFJ Semantics
[Figure: interface between the OTFJ Program and the Abstract STM. Operations shown: Begin Transaction, Field write, Field read, OK2Commit? (two outcome branches), RollBackTransaction, Undo last log entry.]

OTFJ Semantics: (Transactional) Field Read
[Figure: program state and STM state evolve as (p_1, s_1) → (p_1, s_2) → (p_2, s_2)]
STM step, s_1 → s_2: Open4Read(r)
  –“read r.f_i” added to transaction read log
  –Abstract STM: This transition left unspecified
Program step, p_1 → p_2: read r.f_i

OTFJ Semantics: Field Write
[Figure: program state and STM state evolve as (p_1, s_1) → (p_1, s_2) → (p_2, s_2)]
STM step, s_1 → s_2: Open4Write(v)
  –“write (v.f_i, r_new, r_old)” added to transaction write/undo log
  –Abstract STM: This transition left unspecified
Program step, p_1 → p_2: v.f_i := r_new, guarded by OK2Write(v)

OTFJ Semantics: Transaction Commit
[Figure: program state and STM state evolve as (p_1, s_1) → (p_1, s_2) → (p_2, s_2)]
STM step, s_1 → s_2, taken when OK2Commit(s_1) holds
  –Logs appended to parent transaction’s logs, or discarded if top-level transaction
Program step, p_1 → p_2: Commit

OTFJ Semantics: Transaction Rollback
[Figure: STM state goes from s_1 through intermediate states to s_2; program state is p_1 before and after]
Taken when !OK2Commit(s_1)
  –Transaction rolled back using log entries
  –STM state updated to reflect each rolled-back log entry
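The rollback slide can be illustrated with a minimal Python sketch of a write-in-place transaction that keeps an undo log. This is a toy model under stated assumptions, not the Bartok STM: the class `Transaction`, the dictionary-based heap, and the method names are all hypothetical.

```python
_MISSING = object()  # sentinel: the field did not exist before the write


class Transaction:
    """Toy write-in-place transaction with an undo log (hypothetical model)."""

    def __init__(self, heap):
        self.heap = heap      # shared dict: (obj, field) -> value
        self.undo_log = []    # entries: (obj, field, new_value, old_value)

    def write_field(self, obj, field, new_value):
        # Log "write (v.f_i, r_new, r_old)" first, then update the heap in place.
        old_value = self.heap.get((obj, field), _MISSING)
        self.undo_log.append((obj, field, new_value, old_value))
        self.heap[(obj, field)] = new_value

    def undo_last_log_entry(self):
        # Restore the old value recorded in the most recent log entry.
        obj, field, _new_value, old_value = self.undo_log.pop()
        if old_value is _MISSING:
            del self.heap[(obj, field)]
        else:
            self.heap[(obj, field)] = old_value

    def roll_back(self):
        # Undo entries in reverse order until the log is empty.
        while self.undo_log:
            self.undo_last_log_entry()
```

Undoing in reverse log order matters when a transaction writes the same field twice: the oldest entry holds the pre-transaction value and must be restored last.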

Algorithm-level semantics: What is atomic?
Execution fragments considered atomic
  –STM: Open4Read → Prog: Field Read
  –STM: Open4Write → Prog: Field Write
  –STM: CommitTransaction → Prog: Commit
  –Rolling back entire transaction: UpdateSTMState → UndoLastLogEntry → UpdateSTMState → UndoLastLogEntry → ... → UpdateSTMState → UndoLastLogEntry → Rewind program to beginning of transaction

Correctness: Equivalence of Executions
Equivalence: Executions are equivalent if
  –Same set of threads
  –Same program end state
  –For each thread: same sequence of actions
Alternative view:
  –Swapping independent actions yields an equivalent execution
  –Dependent: access the same variable, and at least one is a write
Equivalence modulo undo’s
  –Remove all actions by rolled-back transactions
  –Is what’s left equivalent?
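The "alternative view" of equivalence can be sketched in Python. The `Action` record and the helper names are assumptions made for illustration; the dependence rule is the one stated on the slide (same variable, at least one write).

```python
from collections import namedtuple

# Hypothetical action record: kind is "read" or "write".
Action = namedtuple("Action", "thread tx kind var")


def dependent(a, b):
    """Two actions are dependent if they touch the same variable and one writes."""
    return a.var == b.var and "write" in (a.kind, b.kind)


def swap_if_independent(execution, i):
    """Return the execution with actions i and i+1 swapped, or None if not allowed."""
    a, b = execution[i], execution[i + 1]
    # Per-thread order must be preserved, so same-thread actions never swap.
    if a.thread == b.thread or dependent(a, b):
        return None
    return execution[:i] + [b, a] + execution[i + 2:]


def strip_rolled_back(execution, rolled_back):
    """Equivalence modulo undo's: drop all actions of rolled-back transactions."""
    return [a for a in execution if a.tx not in rolled_back]
```

Repeatedly applying `swap_if_independent` to an execution stays within its equivalence class, which is the mechanism the later "commute" steps of the proof rely on.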

Correctness: Serializability
Serial execution: Between two actions by transaction Tx, every action must belong to Tx or to a child of Tx
Conflict-free: No transaction is rolled back
Purely serial: Serial and conflict-free
Serializability
  –Is each execution equivalent to a serial execution?
Pure serializability
  –Is each execution equivalent modulo undo’s to a purely serial execution?
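The definition of a serial execution can be checked mechanically on a small trace. The sketch below is only an illustration: it represents an execution as a list of transaction ids (one per action), takes a hypothetical `parent` map for nesting, and treats any descendant of Tx as "a child of Tx", which is an assumption beyond the slide's wording.

```python
def is_serial(execution, parent=None):
    """Check that between two actions of Tx, all actions belong to Tx or a descendant."""
    parent = parent or {}

    def belongs_to(tx, ancestor):
        # Walk up the (hypothetical) nesting chain from tx.
        while tx is not None:
            if tx == ancestor:
                return True
            tx = parent.get(tx)
        return False

    first, last = {}, {}
    for i, tx in enumerate(execution):
        first.setdefault(tx, i)
        last[tx] = i
    return all(
        belongs_to(other, tx)
        for tx in first
        for other in execution[first[tx]:last[tx] + 1]
    )
```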

Algorithm-Level Serializability: Sufficient Conditions
1. Exclusive Writes (EW): No other Tx’ writes to obj between Open4Write(Tx, obj) and Commit(Tx)/Undo(Tx)
2. Valid Reads (VR): obj is not modified by another Tx’ between Open4Read(Tx, obj) and Commit(Tx)
3. Correct Undo’s (CU): The state of obj is the same at these two points: Open4Update(Tx, obj) and Undo(Tx)
Theorem: EW + VR + CU ==> Pure serializability
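As an illustration of what EW demands of an algorithm-level execution, here is a small Python trace checker. The event encoding (tuples of kind, transaction, object) is an assumption made for this sketch, not a format from the talk.

```python
def satisfies_ew(trace):
    """Check Exclusive Writes on a trace of (kind, tx, obj) events.

    kind is one of "open_write", "write", "commit", "undo";
    obj is None for "commit" and "undo" events.
    """
    open_spans = {}  # obj -> transactions currently inside a write span on obj
    for kind, tx, obj in trace:
        if kind == "open_write":
            open_spans.setdefault(obj, set()).add(tx)
        elif kind == "write":
            # A write violates EW if some other transaction's span on obj is open.
            if open_spans.get(obj, set()) - {tx}:
                return False
        elif kind in ("commit", "undo"):
            # Commit or undo closes all write spans of tx.
            for spans in open_spans.values():
                spans.discard(tx)
    return True
```

A checker like this only tests individual traces; the point of the method in the talk is to prove the property for all executions.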

Semantics: Implementation-level Executions
[Figure: a high-level execution whose steps are taken by threads th_1, th_2, ...; each high-level step corresponds to a sequence of implementation-level actions (1..m for one step, 1..n for the next).]

11 22 33 44 55 66 77 88 99  10  11 Implementation-level Atomic Actions

11 22 33

Atomic read The rest: Local variable accesses

Atomic read Atomic compare and swap The rest: Local variable accesses Desired: STM Methods Acting on Single Objects are Atomic

The need for abstraction: Actions do not commute
[Figure: a Read followed by a CAS in one thread, with a CAS from another thread interleaved]
If a CAS fails, can’t commute a Read or CAS across a CAS.

The need for abstraction: Actions do not commute
[Figure: the same interleaving with the Read replaced by NDRead]
NDRead: Non-deterministically
  –acts like a regular Read, or
  –reads an arbitrary value, causing a failed CAS
An NDRead leading to a failed CAS commutes with all actions
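The commutation problem can be reproduced with a tiny Python model of compare-and-swap on a single cell. Everything here (the `cas` helper, the operation table, the initial value 0) is a hypothetical setup chosen for illustration.

```python
def cas(cell, expected, new):
    """Compare-and-swap on a one-field cell; returns True on success."""
    if cell["value"] == expected:
        cell["value"] = new
        return True
    return False


# Hypothetical operations, each given as (expected, new).
OPS = {"A": (0, 1), "B": (0, 2), "C": (7, 9)}


def run(order):
    """Run the named CAS operations in the given order from initial value 0."""
    cell = {"value": 0}
    results = {name: cas(cell, *OPS[name]) for name in order}
    return results, cell["value"]
```

Running A then B differs from B then A in both the return values and the final state, so two CAS operations that can each succeed do not commute. Operation C fails in either position relative to A and leaves the cell untouched, which is the intuition behind abstracting to actions whose failing case commutes with everything.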

Every execution is equivalent to a coarse atomic one
  –Abstract read actions
  –Prove non-interference
  –Commute
  –Prove larger blocks atomic
[Figure: interleaved implementation-level actions regrouped into atomic method blocks: ValidateRead(Tx_1, obj_1), Open4Write(Tx_2, obj_2), ValidateRead(Tx_3, obj_3), ValidateRead(Tx_1, obj_4), Open4Write(Tx_2, obj_5), ValidateRead(Tx_3, obj_6)]

Proof Approach Implementation-level execution “Coarse-atomic” execution Abstract Read Operations Verify NOWS, VRS properties “Coarse-atomic” execution with serial undo’s Merge chains of STM internal state transitions Insert marker actions: “commit”, “undoLastLogEntry” Algorithm-level execution satisfying EW +VR + CU Boogie!

Non-Overlapping Write Spans (NOWS) ==> EW
[OpenForUpdate(Tx1, obj), Close*Obj(Tx1, obj)] does not overlap with [OpenForUpdate(Tx2, obj), Close*Obj(Tx2, obj)]
Approach:
  –Assume Tx has executed OpenForUpdate(Tx, obj) but not Close*Obj(Tx, obj)
  –ExclusiveOwner(Tx, obj): A formula that says obj metadata indicates that Tx is the exclusive write owner, and other good things
  –Prove (remember, all STM methods are atomic): Any possible method execution by another thread’s transaction Tx_bad leaves ExclusiveOwner(Tx, obj) unchanged

Checking NOWS with Boogie
  Havoc(obj’s state and metadata)
  OpenForUpdate(Tx_good, obj);
  assume(ExclusiveOwner(Tx_good, obj));
  assume(Tx_bad != Tx_good);
  OpenForUpdate(Tx_bad, obj);
  assert(ExclusiveOwner(Tx_good, obj));
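The shape of this check (havoc, assume, interfere, assert) can be mimicked by exhaustive enumeration over a toy model in Python. This is a sketch under strong assumptions: the `Obj` metadata (an owner and a version), the method bodies, and the finite version range are all invented for illustration and are not the Bartok STM or the talk's Spec# model. Boogie proves the assertion symbolically for all states; the loop below only enumerates a few.

```python
from itertools import product


class Obj:
    """Hypothetical per-object STM metadata."""

    def __init__(self, owner, version):
        self.owner, self.version = owner, version


def open_for_update(tx, obj):
    if obj.owner is None:   # models a successful CAS on the owner word
        obj.owner = tx
        return True
    return obj.owner == tx


def close_obj(tx, obj):
    if obj.owner == tx:     # only the owner may release and bump the version
        obj.owner = None
        obj.version += 1


def exclusive_owner(tx, obj):
    return obj.owner == tx


def check_nows(versions=range(3)):
    """Enumerate havoc'd states and interfering methods; return cases checked."""
    good, bad = "Tx_good", "Tx_bad"
    checked = 0
    for owner, version, method in product(
        (None, good, bad), versions, (open_for_update, close_obj)
    ):
        obj = Obj(owner, version)            # havoc obj's metadata
        open_for_update(good, obj)
        if not exclusive_owner(good, obj):   # assume(ExclusiveOwner(Tx_good, obj))
            continue
        method(bad, obj)                     # any atomic method run by Tx_bad
        assert exclusive_owner(good, obj)    # the NOWS proof obligation
        checked += 1
    return checked
```

States where the object starts out owned by Tx_bad are discarded by the assume step, exactly as `assume` prunes paths in the Boogie encoding.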

Valid Read Spans (VRS) ==> VR
[OpenForRead(Tx1, obj), (Successful) ValidateRead(Tx1, obj)] does not overlap with [OpenForUpdate(Tx2, obj), Close*Obj(Tx2, obj)]

Checking VRS with Boogie
  InterfereWith(Tx, obj);
  OpenForRead(Tx, obj);
  InterfereWith(Tx, obj);
  if (*) OpenForUpdate(Tx, obj);
  InterfereWith(Tx, obj);
  ValidateRead(Tx, obj);
  assert(interferedWith ==> Tx.invalid);

Checking VRS with Boogie
InterfereWith: Represents effects of (ε + Close) (Open + Close)* (Open + ε)
  InterfereWith(Transaction Tx, Object obj) {
    while (*) {
      Tx_bad = non-deterministically chosen transaction
      assume(Tx_bad != Tx);
      if (*) OpenForUpdate(Tx_bad, obj);
      if (*) Close*Obj(Tx_bad, obj);
    }
  }

Checking VRS with Boogie
  InterfereWith(Tx, obj);
  OpenForRead(Tx, obj);
  InterfereWith(Tx, obj);
  if (*) OpenForUpdate(Tx, obj);
  InterfereWith(Tx, obj);
  ValidateRead(Tx, obj);
  assert(interferedWith ==> Tx.invalid);

Checking VRS with Boogie
Challenge: Finding pre- and post-conditions for InterfereWith()
  –Boogie can infer loop invariants (using abstract interpretation) to automate this
  –But the inference had a few bugs
  InterfereWith(Transaction Tx, Object obj) {
    while (*) {
      Tx_bad = non-deterministically chosen transaction
      assume(Tx_bad != Tx);
      if (*) OpenForUpdate(Tx_bad, obj);
      if (*) Close*Obj(Tx_bad, obj);
    }
  }

Checking VRS with Boogie
Wrote a pre/post-condition pair and verified it by induction, again using Boogie but on straight-line code.
  –Simple post-condition: If obj was opened and closed by Tx_bad, the version number is bigger and obj (metadata) is quiescent. Otherwise, obj metadata has info related to Tx_bad in it.
  InterfereWith(Transaction Tx, Object obj)
    ensures (interferedWith ==> PostCondition(Tx, obj));
    ensures (!interferedWith ==> Unchanged(Tx, obj));
  {
    InterfereWith(Tx, obj);
    if (*) OpenForUpdate(Tx_bad, obj);
    if (*) Close*Obj(Tx_bad, obj);
  }

Checking VRS with Boogie: Bugs detected
Bugs in STM pseudocode showed up when checking this proof obligation with Boogie
  –Some interleavings had been overlooked in the pseudocode, including the one in the PLDI ’06 paper
    1. Tx_bad: opens for update
    2. Tx_good: opens for read
    3. Tx_bad: modifies object
    4. Tx_bad: closes updated object
    5. Tx_good: opens for update
    6. Tx_good: validates read (shouldn’t have)
  –The transaction should be invalidated, but it isn’t.
After putting in fixes, the check passed
  –It is possible to make errors in STM design at this level
Checks take ~5 minutes on my desktop
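The overlooked interleaving can be replayed on a toy model in Python. The model below is an assumption-laden sketch: the metadata layout, the read and update logs, and in particular the "buggy" and "fixed" validation rules are invented to reproduce the shape of the problem, and are not the actual Bartok STM code or the actual fix.

```python
class Obj:
    """Hypothetical per-object STM metadata."""

    def __init__(self):
        self.owner = None
        self.version = 0


class Tx:
    def __init__(self, name):
        self.name = name
        self.read_log = {}    # obj -> (owner, version) seen at open_for_read
        self.update_log = {}  # obj -> version seen at open_for_update
        self.invalid = False


def open_for_read(tx, obj):
    tx.read_log[obj] = (obj.owner, obj.version)


def open_for_update(tx, obj):
    if obj.owner is None:
        obj.owner = tx
        tx.update_log[obj] = obj.version


def close_obj(tx, obj):
    if obj.owner is tx:
        obj.owner = None
        obj.version += 1
        del tx.update_log[obj]


def validate_read(tx, obj, fixed):
    seen_owner, seen_version = tx.read_log[obj]
    if obj.owner is tx:
        # Buggy rule (assumed): "I own the object, so my read must be fine".
        # Fixed rule (assumed): also compare the snapshot with the version
        # recorded when this transaction opened the object for update.
        ok = (not fixed) or (
            seen_owner in (None, tx) and seen_version == tx.update_log[obj]
        )
    else:
        ok = obj.owner is None and (seen_owner, seen_version) == (None, obj.version)
    if not ok:
        tx.invalid = True


def replay(fixed):
    """Replay the six-step interleaving; return whether Tx_good was invalidated."""
    obj, good, bad = Obj(), Tx("Tx_good"), Tx("Tx_bad")
    open_for_update(bad, obj)    # 1. Tx_bad opens for update
    open_for_read(good, obj)     # 2. Tx_good opens for read
    close_obj(bad, obj)          # 3-4. Tx_bad modifies, then closes (version bump)
    open_for_update(good, obj)   # 5. Tx_good opens for update
    validate_read(good, obj, fixed)  # 6. Tx_good validates its read
    return good.invalid
```

With the buggy rule, `replay(False)` leaves Tx_good valid even though it was interfered with, which is the violation of `assert(interferedWith ==> Tx.invalid)`; with the stricter rule, `replay(True)` invalidates it.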

Proof Approach Implementation-level execution “Coarse-atomic” execution Abstract Read Operations Verify NOWS, VRS properties “Coarse-atomic” execution with serial undo’s Merge chains of STM internal state transitions Insert marker actions: “commit”, “undoLastLogEntry” Algorithm-level execution satisfying EW +VR + CU

Implementation → Algorithm-level: Committed Transactions
[Figure: timeline of Tx’s actions: ValidateRead(Tx, obj_1), ValidateRead(Tx, obj_2), ValidateRead(Tx, obj_3), ..., ValidateRead(Tx, obj_n), ..., Commit(Tx)]
Commute all of Tx’s actions here

Implementation → Algorithm Level: Rolled Back Transactions
[Figure: timeline of Tx’s actions: WriteField(Tx, obj_1.f_1), WriteField(Tx, obj_2.f_2), WriteField(Tx, obj_3.f_3), ..., WriteField(Tx, obj_n.f_n), ..., followed by Tx being rolled back; each marker on the rollback part of the timeline is an undoLastLogEntry action]
Commute all of Tx’s actions here

Need for abstraction
[Figure: timeline of WriteField(Tx, obj_1.f_1), WriteField(Tx, obj_2.f_2), WriteField(Tx, obj_3.f_3), ..., WriteField(Tx, obj_n.f_n), ..., with Read(Tx_other, obj_2.f_2) interleaved; label: Commute all of Tx’s actions here]
Replace Read(Tx_other, obj_2.f_2) with NDRead(Tx_other, obj_2.f_2)
NDRead: Non-Deterministic Read
  –Acts like a regular read, or
  –Reads an arbitrary value
    Only if part of a failing (rolled-back) transaction
    Commutes with any action
  –Only adds to behaviors ⇒ safe for modeling
Must allow algorithm-level reads to fail non-deterministically also
  –OK, because we are only proving safety

Proof Approach Implementation-level execution “Coarse-atomic” execution Abstract Read Operations Verify NOWS, VRS properties “Coarse-atomic” execution with serial undo’s Merge chains of STM internal state transitions Insert marker actions: “commit”, “undoLastLogEntry” Algorithm-level execution with serial undo’s NOWS ==> EW, VRS ==>VR, SU ==> CU Straightforward manual proofs

Discussion
Correctness proof for STMs involves quantification over
  –threads, transactions
  –all possible sequences of accesses in a transaction
  –objects
Almost all of this is taken care of in the manual part of the proof.
Initial focus: Do STM methods guarantee non-interference?
  –Most error-prone part
  –Checked mechanically
Checks with Boogie are computationally manageable
  –Simple sequential scenarios: one object, one transaction; the environment procedure InterfereWith models all other threads/transactions

Summary
Formalizing correctness:
  –OTFJ language, “pure serializability”
Proving correctness at high level:
  –Reduced algorithm-level proof of pure serializability to three sufficient conditions: EW + VR + CU
Relating implementation and algorithm levels:
  –Provided low-level semantics for OTFJ
  –Devised sequence of steps to transform implementation-level executions to algorithm-level executions
  –Correctness of most error-prone step checked with Boogie
Modeling tricks:
  –Identified abstractions in modeling required for the approach to work

Future Work
Mechanize more of the proof
Non-transactional accesses, buffered-write STMs
  –Privatization problem, etc.
  –Strong isolation for non-transactional accesses
  –What is the correctness criterion?
Formalizing relationship between model and implementation
  –What exactly is assumed of the GC and CLR?
  –How to establish correspondence between Spec# model and 10K lines of C++?
Weak memory models

With abstraction we have ND-CAS
[Figure: a Read followed by an ND-CAS]
ND-CAS: Non-Deterministic Compare-and-Swap
  –Fails non-deterministically
  –Can only succeed where a regular CAS can
A failing ND-CAS commutes with any action
ND-CAS introduces more behavior
  –OK because we are only proving “safety” properties

Proof Approach Implementation-level execution “Coarse-atomic” execution Abstract Read operations Verify NOWS, VRS properties “Coarse-atomic” execution with serial undo’s Merge chains of STM internal state transitions Insert marker actions: “commit”, “undoLastLogEntry” Algorithm-level execution satisfying EW +VR + CU Focus last summer
