Published by Gabrielle Howe. Modified over 6 years ago.

Bounded Model Checking of Concurrent Data Types on Relaxed Memory Models: A Case Study
Sebastian Burckhardt, Rajeev Alur, Milo M. K. Martin
Department of Computer and Information Science, University of Pennsylvania
CAV 2006, Seattle

The General Problem
[Diagram: software, multiprocessor, concurrent executions, bugs]
- concurrency libraries can help (e.g. Java JSR-166)
- but how to debug the libraries?

The Specific Problem
[Diagram: optimized implementations of concurrent datatypes, shared-memory multiprocessor with relaxed memory model, concurrent executions, bugs]
- case study: use SAT solver to find bugs

Case Study: Two-Lock Queue
- Algorithm published by M. Michael and M. Scott [PODC 1996]
- Singly linked list with head and tail pointers
- Dummy node at front
- Independent head and tail locks allow for concurrent enqueue() and dequeue()
- Race condition if queue is empty
[Diagram: linked list holding 1, 2, 3, with a head lock and a tail lock]
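The structure listed on this slide can be sketched in a few lines. This is a minimal illustration, not the code verified in the case study: the class names `Node` and `TwoLockQueue` and the use of `None` to signal an empty queue are assumptions. Python's interpreter also hides the relaxed-memory effects the talk is about, so the sketch only shows the locking discipline.

```python
import threading


class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = None


class TwoLockQueue:
    """Singly linked list with a dummy node at the front (illustrative names)."""

    def __init__(self):
        dummy = Node()
        self.head = dummy                  # always points at the dummy node
        self.tail = dummy                  # points at the last node
        self.head_lock = threading.Lock()  # guards dequeue()
        self.tail_lock = threading.Lock()  # guards enqueue()

    def enqueue(self, value):
        node = Node(value)
        with self.tail_lock:               # only the tail lock is taken
            self.tail.next = node
            self.tail = node

    def dequeue(self):
        with self.head_lock:               # only the head lock is taken
            first = self.head.next
            if first is None:
                return None                # assumed convention for "empty"
            value = first.value
            self.head = first              # first becomes the new dummy
            return value
```

When the queue is empty, head and tail refer to the same dummy node, so enqueue() writes the `next` field that dequeue() reads while holding a different lock. That is the race the last bullet refers to.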

Case Study: Our Correctness Criterion
- The client program observes:
  - the ordering of operation calls within each thread
  - the argument and return values of the operations
- The code is correct if and only if all executions are observationally equivalent to some serial execution (def. serial: interleaved at operation boundaries only)
- We assume serial executions are correct (can be verified by conventional sequential methods)
Example:
  thread 1: enqueue(1), enqueue(2)
  thread 2: enqueue(3), dequeue() -> 1
  thread 3: dequeue() -> 3, dequeue() -> 2
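For a small observation like the three-thread example, the criterion can be checked by brute force: search for an interleaving at operation boundaries that reproduces every observed return value on a sequential FIFO queue. The sketch below is only an illustration of the definition, not the SAT-based method of the talk. The function name and the `("enq", v)` / `("deq", v)` tuple format are assumptions.

```python
def has_serial_witness(threads):
    """threads: one list of operations per thread, in program order.
    Each operation is ("enq", value) or ("deq", observed_return_value);
    an observed value of None means "queue was empty" (assumed convention)."""

    def search(positions, queue):
        if all(p == len(t) for p, t in zip(positions, threads)):
            return True                    # every operation has been placed
        for i, thread in enumerate(threads):
            p = positions[i]
            if p == len(thread):
                continue
            kind, value = thread[p]
            advanced = positions[:i] + (p + 1,) + positions[i + 1:]
            if kind == "enq":
                next_queue = queue + (value,)
            elif queue and queue[0] == value:
                next_queue = queue[1:]
            elif not queue and value is None:
                next_queue = queue
            else:
                continue                   # return value contradicts FIFO order here
            if search(advanced, next_queue):
                return True
        return False

    return search((0,) * len(threads), ())


# The observation from the slide.
SLIDE_EXAMPLE = [
    [("enq", 1), ("enq", 2)],
    [("enq", 3), ("deq", 1)],
    [("deq", 3), ("deq", 2)],
]
```

`has_serial_witness(SLIDE_EXAMPLE)` is true; one witness is enqueue(1), enqueue(3), dequeue() -> 1, dequeue() -> 3, enqueue(2), dequeue() -> 2.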

Finer Interleavings = More Executions; Reordered Instructions = More Executions
- serial executions: threads interleave the operations (operations are atomic) (operations are in-order)
- sequentially consistent executions: threads interleave the instructions (instructions are atomic) (instructions are in-order)
- relaxed executions: hardware makes performance-motivated compromises (stores may be non-atomic) (loads/stores may be out-of-order)
[Diagram: Serial, SC, and Relaxed execution sets]

Case Study: Relaxed Memory Models
Example:
  thread 1: x = 1; y = 2
  thread 2: print y; print x
  output: 2 0
- output not consistent with any interleaved execution!
- can be the result of out-of-order stores
- can be the result of out-of-order loads
- improves performance (more choices for processor)
Q: Why doesn't everything break?
A: Relaxations are transparent to normal programs
- uniprocessor semantics are preserved
- library code for lock/unlock contains memory ordering fences
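The claim about the output "2 0" can be checked by enumerating every interleaving of the two threads. The sketch below assumes x and y start at 0 (implied by the printed 0) and models relaxation crudely, by allowing each thread's instructions to run in any order. Real memory models are more subtle, so this is an illustration only.

```python
from itertools import permutations

STORES = [("x", 1), ("y", 2)]   # thread 1, in program order
LOADS = ["y", "x"]              # thread 2, in program order


def interleaving_outcomes(stores, loads):
    """Return all (printed y, printed x) pairs over every interleaving
    that keeps each list in the order given."""
    results = set()

    def run(i, j, mem, seen):
        if i == len(stores) and j == len(loads):
            results.add((seen["y"], seen["x"]))
            return
        if i < len(stores):                      # thread 1 performs its next store
            var, val = stores[i]
            run(i + 1, j, {**mem, var: val}, seen)
        if j < len(loads):                       # thread 2 performs its next load
            var = loads[j]
            run(i, j + 1, mem, {**seen, var: mem[var]})

    run(0, 0, {"x": 0, "y": 0}, {})              # assumed initial values
    return results


sc = interleaving_outcomes(STORES, LOADS)

# Crude relaxation (an assumption): either thread may reorder its accesses.
relaxed = set()
for s in permutations(STORES):
    for l in permutations(LOADS):
        relaxed |= interleaving_outcomes(list(s), list(l))
```

Here `sc` is `{(0, 0), (0, 1), (2, 1)}`, so (2, 0) never appears under in-order interleaving, while `relaxed` does contain (2, 0).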

Which Memory Model?
- Memory models are platform dependent
- We use a conservative approximation, "Relaxed", to capture common effects
- Once code is correct for Relaxed, it is correct for all models
- See paper for formal spec of Relaxed
[Diagram: SC, TSO, PSO, RMO, 390, PPC, Alpha, Relaxed]

Halftime Overview
- General motivation
- Case study parameters
  - Two-lock queue implementation
  - Correctness criterion
  - Relaxed memory models
- Our verification method
  - Symbolic tests
  - SAT encoding
- Results
  - Bugs found
- Evaluation & Conclusion
[Progress markers: done, coming up]

Our Verification Method
[Diagram: (1) symbolic test, (2) implementation code, (3) commit points, (5) encoder, SAT solver; the result is either pass or a (4) counterexample]

How To Bound Executions
- Verify individual symbolic tests:
  - finite number of operations
  - nondeterministic instruction order
  - nondeterministic input values
- Example (this is the smallest one in our test suite):
  thread 1: enqueue(X)
  thread 2: dequeue() -> Y
- User creates suite of tests of increasing size
[Diagram label: 1]

Why symbolic test programs?
1) Avoid undecidability by making everything finite:
  - State is unbounded (dynamic memory allocation)... is bounded for individual test
  - Sequential consistency is undecidable... is decidable for individual test
2) Gives us finite instruction sequence to work with:
  - State space too large for interleaved system model... can directly encode value flow between instructions
  - Memory model specified by axioms... can directly encode ordering axioms on instructions

Implementation code
- we hand-translated Michael & Scott's code (above) into a low-level representation that uses explicit loads and stores
- we added code for dynamic memory allocation and locks
[Diagram label: 2]

Commit points
- designate where the operation commits logically
- given the order of commit points, we can construct a serial witness execution
- eliminates the existential quantifier in "for all executions there exists an equivalent serial execution"
[Diagram label: 3]

Counterexample Trace
  thread 1: enqueue(1)
  thread 2: dequeue() -> 0
[Trace diagram: memory accesses numbered 1 to 14]
- commit point order (3 < 6) indicates that enqueue precedes dequeue, so we would expect dequeue() -> 1
- incorrect value (0) of queue element gets read (7) before correct value (1) is written (11).
[Diagram label: 4]
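The reasoning behind this trace (order the operations by commit point, replay them serially, compare return values) can be sketched as follows. The function name and the tuple format are assumptions, and the real method performs this comparison inside the SAT formula rather than by replaying a concrete trace.

```python
from collections import deque


def first_mismatch(commit_order):
    """commit_order: operations sorted by commit point, each either
    ("enq", value) or ("deq", observed_return_value).
    Replays them on a sequential FIFO queue and returns the index of the
    first dequeue whose observed value differs, or None if all agree."""
    queue = deque()
    for index, (kind, value) in enumerate(commit_order):
        if kind == "enq":
            queue.append(value)
        else:
            expected = queue.popleft() if queue else None  # None = empty (assumed)
            if expected != value:
                return index
    return None


# The trace on this slide: enqueue(1) commits first, yet dequeue() returned 0.
TRACE = [("enq", 1), ("deq", 0)]
```

`first_mismatch(TRACE)` returns 1, flagging the dequeue. The same call on `[("enq", 1), ("deq", 1)]` returns `None`.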

Encoding
Given:
- symbolic test T(A, B)
- memory model Y
- implementation code & commit point specifications
Example test:
  thread 1: enqueue(A)
  thread 2: dequeue() -> B
Encoding:
- First step: encode concurrent executions of T on Y as solutions to CNF formula $\Phi_Y(A, B, X)$ (aux vars X)
- Second step: encode counterexamples as solutions to
  $\Phi_Y(A, B, X) \land \Phi_{Atomic}(A', B', X') \land (A = A') \land (\text{commit point orders match}) \land \big((B \neq B') \lor (\text{some operations commit out of order})\big)$
[Diagram label: 5]

Encoding Detail: Obtain Symbolic Instruction Stream
- Finite instruction sequence for each thread
- Only loads, stores, moves, and fences
- Each register is assigned exactly once
- Control flow represented by predicates

Encoding Detail: Memory Order
Example: two threads:
  thread 1: s1 (store), s2 (store)
  thread 2: l1 (load), l2 (load)
Encoding variables, $O(n^2)$:
- Use bool vars for relative order (x < y) of memory accesses
- Use bitvector variables $A_x$ and $D_x$ for address and data values associated with memory access x
Encode constraints, $O(n^3)$:
- encode transitivity of memory order
- encode ordering axioms of the memory model. Example (for SC): $(s_1 < s_2) \land (l_1 < l_2)$
- encode value flow: loaded value must match last value stored to same address. Example: value must flow from s1 to l1 under following conditions:
  $\big( (s_1 < l_1) \land (A_{s_1} = A_{l_1}) \land ( (s_2 < s_1) \lor (l_1 < s_2) \lor (A_{s_2} \neq A_{l_1}) ) \big) \Rightarrow (D_{s_1} = D_{l_1})$
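The value-flow condition on this slide can be tried out on the four-access example by enumerating memory orders explicitly instead of handing them to a SAT solver. In this sketch all four accesses use the same address so that the "intervening store" part of the condition matters. The addresses, data values and the initial value 0 are assumptions chosen for illustration.

```python
from itertools import permutations

ADDR = {"s1": "x", "s2": "x", "l1": "x", "l2": "x"}   # assumed addresses
DATA = {"s1": 1, "s2": 2}                              # assumed store data
STORES, LOADS = ["s1", "s2"], ["l1", "l2"]
SC_AXIOMS = [("s1", "s2"), ("l1", "l2")]               # (s1<s2) and (l1<l2)


def flows(pos, store, load):
    """Value-flow condition: `store` is the last store to the load's
    address that precedes `load` in the memory order `pos`."""
    return (
        pos[store] < pos[load]
        and ADDR[store] == ADDR[load]
        and all(
            pos[other] < pos[store]
            or pos[load] < pos[other]
            or ADDR[other] != ADDR[load]
            for other in STORES
            if other != store
        )
    )


def value_flow_outcomes(axioms):
    """Loaded values (l1, l2) over all memory orders satisfying the axioms."""
    results = set()
    for order in permutations(STORES + LOADS):
        pos = {access: i for i, access in enumerate(order)}
        if not all(pos[a] < pos[b] for a, b in axioms):
            continue
        values = []
        for load in LOADS:
            sources = [DATA[s] for s in STORES if flows(pos, s, load)]
            values.append(sources[0] if sources else 0)  # 0 = assumed initial value
        results.add(tuple(values))
    return results
```

With `SC_AXIOMS` the outcome (2, 1), where l1 sees the newer store and l2 the older one, is excluded. With no axioms at all it becomes possible.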

Encoding Detail: The combined formula
[Diagram: thread-local formulas and a communication formula, connected through input values, output values, intermediate values, and memory order variables]

So what did we learn in the case study?
- General motivation
- Case study parameters
  - Two-lock queue implementation
  - Correctness criterion
  - Relaxed memory models
- Our verification method
  - Symbolic tests
  - SAT encoding
- Results
  - Bugs found
- Evaluation & Conclusion
[Progress markers: done, coming up]

Results: 5 code problems found
- 3 were mistakes we made:
  - first commit point guess was wrong
  - incorrect/insufficient fences in lock/unlock and alloc/free
- 2 were caused by missing fences in queue implementation (not fault of authors... were assuming SC multiprocessor):
  - store-store fence
  - load-load fence

Results: Scalability
[Graph: tests in our suite (unsatisfiable instances only); y-axis: runtime in seconds; x-axis: # accesses (loads/stores) in test]
- Fast on small tests, slow on long tests
- Not sensitive to # threads
- All 5 problems were found on smallest 2 tests... all under 1 sec

Conclusion
PROs:
- quickly finds subtle bugs
- supports relaxed memory models
- counterexample traces
- catches broad range of bugs (not limited to deadlocks or data races)
- is more automatic than deductive methods
CONs / FUTURE WORK & CHALLENGES:
- not truly scalable (though scalable enough to be useful)
- not fully automatic
- does not solve full problem (bounded instances, commit points)
We would recommend this method to designers and implementors of concurrent data types.


Ordering/Atomicity Relaxations
The following 2 examples illustrate the main effects (1. ordering relaxation / 2. atomicity relaxation). Where necessary, a programmer can prevent these effects by inserting fence instructions.
EXAMPLE 1: store, load may execute out of order
  initially A=B=0
  processor 1: store A, 1; load B, 0
  processor 2: store B, 1; load A, 0
  [Figure: pink numbers = memory order]
EXAMPLE 2: stores are buffered locally before effect is global
  initially A=B=0
  processor 1: store A, 1; load A, reg; store reg, B
  processor 2: load B, 1; load A, 0
  [Figure: store split into local / remote components, numbered in memory order]
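Example 1 can be reproduced with a toy simulation in which each processor has a FIFO store buffer that drains to memory at an arbitrary later time. This buffer model is an assumption made for illustration (it resembles TSO-style hardware) and is not the "Relaxed" model defined in the paper. The function name and instruction tuples are likewise hypothetical.

```python
def explore(programs, buffered):
    """Enumerate final register values over all schedules.
    programs: per-processor list of ("store", addr, value) or
    ("load", addr, register). With buffered=False stores hit memory at once."""
    results = set()

    def step(pcs, buffers, mem, regs):
        progressed = False
        for p, prog in enumerate(programs):
            if buffers[p]:                       # drain oldest buffered store
                (addr, val), rest = buffers[p][0], buffers[p][1:]
                drained = buffers[:p] + (rest,) + buffers[p + 1:]
                step(pcs, drained, {**mem, addr: val}, regs)
                progressed = True
            if pcs[p] < len(prog):               # execute next instruction
                op, addr, arg = prog[pcs[p]]
                npcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
                if op == "store" and buffered:
                    grown = buffers[:p] + (buffers[p] + ((addr, arg),),) + buffers[p + 1:]
                    step(npcs, grown, mem, regs)
                elif op == "store":
                    step(npcs, buffers, {**mem, addr: arg}, regs)
                else:
                    # A load first checks its own buffer (newest entry wins).
                    local = [v for a, v in buffers[p] if a == addr]
                    val = local[-1] if local else mem[addr]
                    step(npcs, buffers, mem, {**regs, arg: val})
                progressed = True
        if not progressed:                       # all done, all buffers empty
            results.add(tuple(sorted(regs.items())))

    n = len(programs)
    step((0,) * n, ((),) * n, {"A": 0, "B": 0}, {})
    return results


EXAMPLE_1 = [
    [("store", "A", 1), ("load", "B", "r1")],    # processor 1
    [("store", "B", 1), ("load", "A", "r2")],    # processor 2
]
```

With `buffered=False` the result where both loads return 0 never occurs. With `buffered=True` it does, because each load can run while the other processor's store is still sitting in its buffer.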

- What code? Data type implementations optimized for concurrent execution (concurrency libraries)
- What machines? Common shared-memory multiprocessors (e.g. PPC, Sparc, Alpha)
- What bugs? Bugs caused by concurrency (we assume code runs fine if single-threaded)

Encoding Concurrent Executions
Example instruction stream (labels x1..x4 mark the memory accesses):
  x1: load a[0], R1
  x2: store R1, y
  x3: load y, R2
      move R2+1, R3
  x4: store 1, a[R3]
Variables, $O(n^2)$:
- bitvectors R1, R2, R3 for intermediate values
- boolean variables $M_{ij}$ to represent memory order $x_i < x_j$ (for $i < j$)
Constraints, $O(n^3)$:
- memory order is transitive: $\bigwedge_{i<j<k} (M_{ij} \land M_{jk}) \Rightarrow M_{ik}$
- loads get latest value stored to same address
- memory order must respect memory model axioms and fences (e.g. sequential consistency requires $M_{12} \land M_{34}$)
- thread-local computations connect values (e.g. R3 = R2 + 1)
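The memory-order part of this encoding can be sketched as a clause generator. One boolean per pair i < j gives the quadratic number of variables, and each triple contributes the transitivity clause from the slide. The second clause per triple, which rules out the reverse cycle, is an addition made here so that every satisfying assignment is a total order; the slide does not show it. Function names and the DIMACS-style integer literals are assumptions, and the tiny model counter stands in for a real SAT solver.

```python
from itertools import combinations, product


def order_encoding(n):
    """Return (var, clauses): var[(i, j)] is the variable number for
    'access i precedes access j' (i < j); clauses are lists of signed ints."""
    var = {pair: idx + 1 for idx, pair in enumerate(combinations(range(n), 2))}
    clauses = []
    for i, j, k in combinations(range(n), 3):
        ij, jk, ik = var[(i, j)], var[(j, k)], var[(i, k)]
        clauses.append([-ij, -jk, ik])   # (M_ij and M_jk) implies M_ik
        clauses.append([ij, jk, -ik])    # added here: forbid the cycle k < j < i < k
    return var, clauses


def count_models(num_vars, clauses):
    """Brute-force model count; only sensible for very small encodings."""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in c) for c in clauses):
            count += 1
    return count


var, clauses = order_encoding(4)
# Ordering axioms as unit clauses, e.g. M_12 and M_34 (0-based pairs here).
sc_clauses = clauses + [[var[(0, 1)]], [var[(2, 3)]]]
```

For four accesses this yields 6 variables and 8 transitivity clauses. The unconstrained formula has 24 models, one per total order of the accesses, and adding the two unit clauses leaves 6.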
