
On Sequentializing Concurrent Programs. Gennaro Parlato, University of Southampton, UK. UPMARC 7th Summer School on Multicore Computing, June 8-10, 2015.


1 On Sequentializing Concurrent Programs. Gennaro Parlato, University of Southampton, UK. UPMARC 7th Summer School on Multicore Computing, June 8-10, 2015

2 Concurrency for better performance Clock rates are stalling, but Moore’s Law is still alive… It is no longer feasible to increase the speed of individual processors; additional gates are turned into (caches and) multiple cores ⇒ For better performance programs must be concurrent! ⇒ Concurrency has become an important aspect for many areas of computer science: algorithms, data structures, programming languages, software engineering, testing, verification, …

3 Writing concurrent programs is DIFFICULT
Programmers have to guarantee correctness of the sequential execution of each individual thread under nondeterministic interferences from other threads (schedules)
Rare schedules result in errors that are difficult to find, reproduce, and repair
Developers / testers can spend weeks chasing a single bug ⇒ huge productivity problem
[Figure: threads T1, T2, …, TN interacting through a communication mechanism]

4 Writing concurrent programs is DIFFICULT
//shared variable
int n=0;
int P(void) {
  int tmp, i=1;
  while (i<=10) {
    tmp = n;
    n = tmp + 1;
    i++;
  }
}
int main (void) {
  id1 = thread_create(P);
  id2 = thread_create(P);
  join( id1 );
  join( id2 );
  printf("The value of n is %d", n);
}
The value of n is ???
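The question on this slide can be explored with a small deterministic simulation. The sketch below is an illustration only, not part of the original deck: it models the two copies of P() as step machines in which `tmp = n` and `n = tmp + 1` are separate atomic steps (an assumption about the granularity of interleaving), and the function name `final_n_round_robin` and the fixed-quantum scheduler are hypothetical.

```c
#include <assert.h>

/* Locals of one simulated copy of P(), plus a tiny program counter. */
typedef struct {
    int tmp; /* local copy of n */
    int i;   /* loop counter, 1..10 */
    int pc;  /* 0: next step is "tmp = n"; 1: next step is "n = tmp + 1" */
} thread_state;

static int n; /* the shared variable */

/* Execute one atomic step of a thread; return 0 once the thread has finished. */
static int step(thread_state *t) {
    if (t->i > 10) return 0;
    if (t->pc == 0) {
        t->tmp = n;
        t->pc = 1;
    } else {
        n = t->tmp + 1;
        t->i++;
        t->pc = 0;
    }
    return 1;
}

/* Alternate between the two threads, giving each `quantum` atomic steps per
 * turn (a hypothetical scheduler), and return the final value of n. */
int final_n_round_robin(int quantum) {
    thread_state t[2] = { {0, 1, 0}, {0, 1, 0} };
    int alive = 1;
    assert(quantum > 0);
    n = 0;
    while (alive) {
        alive = 0;
        for (int cur = 0; cur < 2; cur++)
            for (int s = 0; s < quantum; s++)
                if (step(&t[cur])) alive = 1;
    }
    return n;
}
```

With a quantum of 20 each thread runs its whole loop undisturbed and the result is 20; with a quantum of 1 the threads read and write in lockstep, every pair of increments collapses into one, and the result is 10. Other schedules give other values, which is why the slide leaves the answer open.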

5 Testing Testing remains the most used (and often the only known) paradigm in industry... … but is ineffective for concurrent programs: large number of schedules makes scaling-up difficult non-deterministic nature of scheduling makes repeatability difficult ⇒ needs to be complemented by automated analyses that handle schedules symbolically

6 Verification approach develop practical but theoretically well-founded symbolic verification techniques based on the idea of Sequentialization

7

8 Sequentialization: motivations Building verification tools for full-fledged concurrent languages is difficult and expensive... … but scalable verification techniques exist for sequential languages: abstraction, SAT/SMT techniques (e.g., bounded model checking), … ⇒ Can we leverage these?

9 Sequentialization as a code-to-code translation
Code-to-code translation from multithreaded recursive programs to sequential programs that preserves reachability
[Figure: a concurrent program (threads T1, T2, …, TN over shared variables) and an “equivalent” sequential program with nondeterminism]
Use existing automatic verification techniques designed for sequential programs to analyze concurrent programs

10 From concurrent to sequential
Always possible, but can be inefficient:
-simulate the global behavior (track all locals of each thread)
-current techniques do not work
What do we want?
-avoid the extreme blow-up
-track at any point only the locals of one thread
What we want is not always possible, but it is possible if we restrict behaviors
Bug-finding

11 Sequentialization: advantages Keep focus on the concurrency aspects of programs, delegate sequential reasoning to an existing analysis tool code-to-code translation is much easier to implement than a full-fledged analysis tool simplifies experimentation with different approaches can be designed to target multiple backends for sequential program analysis

12 Concurrent (shared-memory) Programs
Formed of sequential programs T1, …, TN (each possibly with recursive function calls)
[Figure: threads T1, T2, …, TN over shared variables]
Each program Ti can read and write shared variables
We assume sequential consistency (SC): writes are immediately visible to all the other programs
An execution is an interleaving of the executions of each Ti

13 Anatomy of an execution
[Figure: an execution of T1 and T2 drawn as a sequence of configurations (local state, shared state), e.g. (l, s_1), (l, s_2), (l_1’, s_1), (l_2’, s_2)]

14 Keep It Simple and Sequential Sequentialization

15 A first sequentialization: KISS
KISS: Keep It Simple and Sequential [Qadeer-Wu, PLDI’04]
Under-approximation (subset of interleavings)
Thread creation → function call
At context-switches, either:
-the active thread is terminated, or
-a not yet scheduled thread is started (by calling its main function)
When a thread is terminated, either:
-the thread that has called it is resumed (if any), or
-a not yet scheduled thread is started

16 KISS schedules
[Figure: three threads T1, T2, T3 with configurations (l_1,s_1), (l_1,s_3), (l_2,s_1), (l_3,s_2), (l_4,s_2), (l_5,s_3), and one timeline per scheduling]
Scheduling 1: 1. start T1 2. start T2 3. terminate T2 4. start T3 5. terminate T3 6. resume T1
Scheduling 2: 1. start T1 2. start T2 3. start T3 4. terminate T3 5. resume T2 6. terminate T2 7. resume T1
Scheduling 3: 1. start T1 2. start T2 3. terminate T2 4. resume T1 5. start T3 6. terminate T3 7. resume T1

17 More on KISS
Allows dynamic thread creation in the form of asynchronous calls
Bounds the number of threads that have been created but not started yet
-the scheduler nondeterministically starts a thread from this set, or
-resumes the last suspended thread (if any)
State space: no cross product
Context-switches:
-does allow an unbounded number of context-switches
-does not allow even a bounded number of context-switches between any two threads (for more than 1 interaction)

18 Bounded Context-Switching (CS) is essential Switching between threads is allowed only a bounded number of times [Qadeer-Rehof, TACAS’05] Systematic bounded CS is useful for bug hunting: most concurrency related bugs manifest themselves within few CS [Musuvathi-Qadeer, PLDI’07] Efficient sequentializations for bounded CS Eager approach [Lal-Reps, CAV’08] Lazy approach [La Torre-Madhusudan-Parlato, CAV’09]

19 LR Sequentialization [ Lal-Reps, CAV’08 ]

20 LR sequentialization: Bounded Round-Robin schedules
[Figure: threads T1, T2, …, T_{N-1}, TN scheduled in round-robin order over round 1, round 2, round 3, …, round k]
Bounded Round-Robin captures bounded context-switches
Schedule: T2 T3 T4 T1 T3 T2 T1; minimal number of rounds ???
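The question on this slide can be answered mechanically. The sketch below is an added illustration, not from the original deck: within one round the threads run in the fixed order T1, …, TN, so a new round is needed exactly when the next context belongs to a thread with a smaller index than the previous one. The function name `min_rounds` is hypothetical, and consecutive entries for the same thread are assumed to be a single context.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal number of round-robin rounds needed to cover a schedule.
 * sched[i] is the index (1..N) of the thread that owns the i-th context. */
int min_rounds(const int *sched, size_t len) {
    int rounds;
    assert(sched != NULL || len == 0);
    if (len == 0) return 0;
    rounds = 1;
    for (size_t i = 1; i < len; i++) {
        /* Moving back to a lower-numbered thread cannot happen inside one
         * round-robin round, so it starts the next round. */
        if (sched[i] < sched[i - 1]) rounds++;
    }
    return rounds;
}
```

For the schedule on the slide, T2 T3 T4 | T1 T3 | T2 | T1, the function counts three descents and therefore four rounds.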

21 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion; this computes local states l_1, …, l_k and global states b_1, …, b_k
[Figure: T1 over three rounds: ends round 1 in (l_1,b_1), resumes from (l_1,a_2) and ends round 2 in (l_2,b_2), resumes from (l_2,a_3) and ends round 3 in (l_3,b_3); T2 and T3 not yet run]

22 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2
[Figure: as on slide 21, with b_1, b_2, b_3 handed to T2]

23 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2; we can dismiss the locals of T1
[Figure: only a_2, a_3 and b_1, b_2, b_3 are retained]

24 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2
4. Execute T2 to completion; dismiss the locals of T2 and dismiss b_1, …, b_k
[Figure: T2 runs from (l_0’,b_1) to (l_1’,c_1), from (l_1’,b_2) to (l_2’,c_2), and from (l_2’,b_3) to c_3]

25 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2
4. Execute T2 to completion
[Figure: values a_2, a_3; b_1, b_2, b_3; c_1, c_2, c_3]

26 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2
4. Execute T2 to completion
5. Pass c_1, …, c_k to T3
6. Execute T3 to completion; dismiss the locals of T3 and dismiss c_1, …, c_k
[Figure: T3 runs from (l_0’’,c_1) to (l_1’,d_1), from (l_1’,c_2) to (l_1’,d_2), and from (l_1’,c_3) to d_3]

27 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2
4. Execute T2 to completion
5. Pass c_1, …, c_k to T3
6. Execute T3 to completion
[Figure: values a_2, a_3; b_1, b_2, b_3; c_1, c_2, c_3; d_1, d_2, d_3]

28 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2
4. Execute T2 to completion
5. Pass c_1, …, c_k to T3
6. Execute T3 to completion
7. It is a computation iff d_i = a_{i+1} for all i ∈ [1,k-1]
[Figure: values a_2, a_3; b_1, b_2, b_3; c_1, c_2, c_3; d_1, d_2, d_3]

29 LR sequentialization: simulation
Sequential program (k rounds)
1. Guess a_2, …, a_k
2. Execute T1 to completion
3. Pass b_1, …, b_k to T2
4. Execute T2 to completion
5. Pass c_1, …, c_k to T3
6. Execute T3 to completion
7. It is a computation iff d_i = a_{i+1} for all i ∈ [1,k-1]
8. Report an error if a bug occurs during the simulation
[Figure: values a_2, a_3; b_1, b_2, b_3; c_1, c_2, c_3; d_1, d_2, d_3]
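The guess-and-check scheme in the steps above can be tried on a toy instance. The sketch below is an added illustration under stated assumptions, not the Lal-Reps translation itself: it uses two rounds (indexed 0 and 1 in the code rather than 1 and 2), two hypothetical one-statement threads over a single shared integer, and it replaces nondeterministic choice by exhaustive loops over a small guess domain. The names `lr_reachable_finals`, `t1`, `t2`, `K`, and `MAXV` are all invented for the example.

```c
#include <assert.h>

#define K 2    /* number of rounds (assumed) */
#define MAXV 8 /* guesses range over 0..MAXV-1 (assumed small domain) */

/* Hypothetical thread bodies: one statement each, executed in round r on
 * the copy of the shared variable that belongs to that round. */
static void t1(int g[K], int r) { g[r] = g[r] + 1; }
static void t2(int g[K], int r) { g[r] = g[r] * 2; }

/* Returns a bitmask of final shared values (bit v set means v is reached)
 * and counts the simulations discarded by the consistency check. */
unsigned lr_reachable_finals(int init, int *pruned) {
    unsigned finals = 0;
    *pruned = 0;
    for (int a2 = 0; a2 < MAXV; a2++)         /* step 1: guess the start of the last round */
        for (int r1 = 0; r1 < K; r1++)        /* round in which T1's statement runs */
            for (int r2 = 0; r2 < K; r2++) {  /* round in which T2's statement runs */
                int g[K] = { init, a2 };
                t1(g, r1);                    /* run T1 to completion */
                t2(g, r2);                    /* T2 continues on what T1 left */
                if (g[0] != a2) {             /* step 7: end of round 0 must equal the guess */
                    (*pruned)++;
                    continue;
                }
                assert(g[K - 1] >= 0 && g[K - 1] < 32);
                finals |= 1u << g[K - 1];
            }
    return finals;
}
```

Starting from 1, the only consistent simulations end in 3 (T2 before T1) or 4 (T1 before T2), which matches the real interleavings. Of the 32 simulations, 28 are thrown away by the check, and several of them ran the threads on guessed values that no real execution produces, which is the behaviour the later slides call "eager".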

30 LR seq. as a code-to-code translation for C programs + Pthread CSeq website: users.ecs.soton.ac.uk/gp4/cseq/ [Fischer-Inverso-Parlato, ASE’13]

31 LR sequentialization as a code-to-code translation
[Figure: sequentialization (a code-to-code translation) translates a concurrent program with threads T1, T2, …, TN into an “equivalent” sequential program with nondeterminism, made of main() and the simulation functions F1(), F2(), …, FN()]

32 considers round-robin schedules with k rounds
-thread → function, run to completion
global memory copy for each round
-scalar → array
context switch → round counter++
first thread starts with nondeterministic memory contents
-other threads continue with content left by predecessor
[Figure: grid of memory copies S_{r,i} for rounds r = 0, …, k-1 and threads i = 1, …, N]

33 LR sequentialization as a code-to-code translation
considers round-robin schedules with k rounds
-thread → function, run to completion
global memory copy for each round
-scalar → array
context switch → round counter++
first thread starts with nondeterministic memory contents
-other threads continue with content left by predecessor
checker prunes away inconsistent simulations
-assume( S_{i+1,0} == S_{i,N} );
-requires second set of memory copies
-errors can only be checked at end of simulation
requires explicit error checks
[Figure: grid of memory copies S_{r,i} for rounds r = 0, …, k-1 and threads i = 1, …, N]

34 LR sequentialization as a code-to-code translation

Concurrent program:

//shared vars
type g1 g1; type g2 g2; …
//thread functions
t(){
  type x1 x1; type x2 x2; …
  stmt 1; stmt 2; …
}
…
main(){ … }

Sequentialized program:

//shared vars
type g1 g1[K]; type g2 g2[K]; …
uint round=1; bool ret=0; //aux vars
// context-switch simulation
cs() {
  unsigned int j;
  j = nondet();
  assume(round + j < K);
  round += j;
  if (round==K-1 && nondet()) ret=1;
}
//thread functions
t(){
  type x1 x1; type x2 x2; …
  cs(); if (ret) return; stmt 1 [round];
  cs(); if (ret) return; stmt 2 [round];
  …
}
…
main_thread(){ … }
main(){ … } //next slide

35 LR sequentialization as a code-to-code translation
main(){
  type g1 _g1[K]; type g2 _g2[K]; …
  // first thread starts with non-deterministic memory contents
  for (i=1; i<K; i++){
    _g1[i] = g1[i] = nondet();
    _g2[i] = g2[i] = nondet();
    …
  }
  // thread simulations
  t[0] = main_thread; born[0] = ACTIVE;
  for (i=0; i<N; i++){
    if (born[i] > NO_ACTIVE){ ret=0; round = born[i]; t[i](); }
  }
  // consistency check
  for (i=0; i<K-1; i++){
    assume(_g1[i+1] == g1[i]);
    assume(_g2[i+1] == g2[i]);
    …
  }
  // error detection
  assert(err == 0);
}

36 Implementations of variants of LR schema (SMT-based) Corral (SMT-based analysis for Boogie programs) –[ Lal–Qadeer–Lahiri, CAV’12 ] –[ Lal–Qadeer, FSE’14 ] CSeq (code-to-code translation for C + PThread) –[ Fischer–Inverso–Parlato, ASE’13 ] Rek (for Real-time Embedded Software Systems) –[ Chaki–Gurfinkel–Strichman, FMCAD’11 ] Storm: implementation for C programs –[ Lahiri–Qadeer–Rakamaric, CAV’09 ] –[Rakamaric, ICSE’10] General eager translation representing thread interactions using bounded DAGs [Bouajjani-Emmi-Parlato, SAS’11]

37 Lazy Sequentialization (Lazy Approach) [ La Torre-Madhusudan-Parlato, CAV’09 ]

38 LR sequentialization is “eager”
[Figure: T1, T2, T3 with the guessed round-start values a_2, a_3 and the computed values b_1, b_2]
a_2 is guessed, so a_2 may be unreachable
EAGER [Lal-Reps, CAV’08]

39 LR sequentialization does not preserve assertions
// shared variables
bool blocked=true; int x=0, y=0;
void thread1() {
  while (blocked);
  x = x/y;
  if (x%2==1) ERROR;
}
void thread2() {
  x=12; y=2;
  //unblock thread1
  blocked=false;
}
Inv: y != 0
[Figure: the execution moves from blocked=true to blocked=false; an eager guess such as blocked=false, x=13, y=0 is unreachable yet makes thread1 divide by zero]

40 A lazy transformation is desirable A lazy sequential program explores only reachable states of the concurrent program Why is it desirable? In model-checking it can drastically reduce the explored state space Better invariants for deductive verification / abstract interpretation We now illustrate a lazy transformation: [La Torre-Madhusudan-Parlato, CAV’09]

41 Lazy transformation: main idea
-Execute T1
-Context-switch: store s_1 and abort
-Execute T2 from s_1
-Store s_2 and abort
[Figure: T1 runs from (l_0,s_0) to (l_1,s_1), then "store s_1 & abort"; T2 runs from (l’_1,s_1) to (l’_2,s_2), then "store s_2 & abort"]

42 Lazy transformation: main idea
-Re-execute T1 till it reaches s_1. It may reach a new local state! Anyway, it is correct!
-Restart from global s_2 and compute s_3
[Figure: as on slide 41; the re-execution of T1 reaches (l’’_1,s_1), continues from (l’’_1,s_2), then "store s_3 & abort"]

43 Lazy transformation: main idea
-Switch to T2
-Execute till it reaches s_2
-Continue computation from global s_3
[Figure: as on slide 42; the re-execution of T2 reaches (l’’’_1,s_2) and continues from (l’’’_1,s_3)]

44 Lazy transformation: main idea
[Figure: T1 and T2 alternate; the shared states s_1, s_2, s_3, s_4, s_5 are stored at successive context switches until the end, and each re-execution replays the previously stored states]

45 Lazy transformation: features Explores only reachable states Preserves invariants across the translation Tracks local state of one thread at any time Tracks values of shared variables at context switches (s 1, s 2, …, s k ) Requires recomputation of local states
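The store, abort, and re-execute pattern can be seen on a toy instance. The sketch below is an added illustration under stated assumptions, not the translation from the CAV’09 paper: it uses two rounds, two hypothetical threads over one shared integer (T1 reads into a local and then writes, T2 doubles the value), and it replaces nondeterministic context-switch points by exhaustive loops. The names `lazy_reachable_finals`, `t1_run`, and `t2_run` are invented for the example.

```c
#include <assert.h>

/* Hypothetical thread 1: step 0 is "x = g", step 1 is "g = x + 1".
 * Runs steps from..to-1 on the given shared value and local. */
static void t1_run(int *g, int *x, int from, int to) {
    for (int pc = from; pc < to; pc++) {
        if (pc == 0) *x = *g;
        else         *g = *x + 1;
    }
}

/* Hypothetical thread 2: a single step "g = g * 2". */
static void t2_run(int *g, int from, int to) {
    for (int pc = from; pc < to; pc++) *g = *g * 2;
}

/* Bitmask of final shared values over all two-round schedules, where c1 and
 * c2 are the numbers of steps T1 and T2 execute in the first round. */
unsigned lazy_reachable_finals(int init) {
    unsigned finals = 0;
    for (int c1 = 0; c1 <= 2; c1++) {
        for (int c2 = 0; c2 <= 1; c2++) {
            int g = init, x = 0, s1, s2, s3;
            t1_run(&g, &x, 0, c1); s1 = g;   /* first round, T1: store s1 and abort */
            t2_run(&g, 0, c2);     s2 = g;   /* first round, T2 from s1: store s2 and abort */
            g = init; x = 0;
            t1_run(&g, &x, 0, c1);           /* re-execute T1 to recompute its locals */
            assert(g == s1);                 /* the replay reaches s1 again */
            g = s2;
            t1_run(&g, &x, c1, 2); s3 = g;   /* continue T1 from global s2 */
            g = s1;
            t2_run(&g, 0, c2);               /* re-execute T2 up to s2 */
            assert(g == s2);
            g = s3;
            t2_run(&g, c2, 1);               /* continue T2 from global s3 */
            assert(g >= 0 && g < 32);
            finals |= 1u << g;
        }
    }
    return finals;
}
```

Starting from 1 the finals are 2, 3, and 4, where 2 comes from T2 running between T1's read and write. No value is ever guessed, so every shared state the simulation touches is one that a real execution produces; the price is that each thread's first-round prefix is executed twice.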

46 Lazy sequentialization as a code-to-code translation
[Figure: sequentialization (a code-to-code translation) translates a concurrent program with threads T1, T2, …, TN into an “equivalent” sequential program with nondeterminism, made of main() and the simulation functions F1(), F2(), …, FN()]
-Guess scheduling
-Orchestrate calls to threads (F_i)
-Nondet jump to next context where this thread is active
-At last context-switch, store shared state, abort, and return to main

47 Lazy translation for Concurrent Boolean programs
Concurrent Boolean programs → Boolean programs
We have implemented an eager and a lazy translator for concurrent Boolean programs
Download: http://www.cs.uiuc.edu/~madhu/getafix/cbp2bp
Experiments: Windows NT Bluetooth driver. For each configuration, the Y/N flags and the measurements are listed for 1, 2, 3, 4, 5, 6 context switches, in the order they appear in the original table (trailing eager entries are missing where the source gives fewer than six).
1-adder 1-stopper
  eager: N N N N N N; 0.1 0.3 43.3 73.6 930.0 -
  lazy: N N N N N N; 0.1 0.2 1.4 5.5 20.2 66.8
2-adders 1-stopper
  eager: N N N Y Y Y; 0.2 0.9 135.9 1601.0 -
  lazy: N N N Y Y Y; 0.1 0.8 6.3 2.6 18.0 122.9
1-adder 2-stoppers
  eager: N N Y Y Y Y; 0.1 0.7 70.1 597.2 -
  lazy: N N Y Y Y Y; 0.1 0.9 0.4 2.9 14.0 66.1
2-adders 2-stoppers
  eager: N N Y Y Y Y; 0.2 1.6 177.6 out of mem.
  lazy: N N Y Y Y Y; 0.1 2.0 0.8 7.5 66.5 535.9
Backend sequential analysis: GetAFix [La Torre-Madhusudan-Parlato, PLDI’09]
-BDD-based analysis: it stores summaries, so recomputations do not cause multiple thread explorations
Lazy outperforms Eager

48 More Sequentializations

49 Lazy translation for Concurrent Boolean programs
Eager and Lazy sequentializations can be extended to parameterized programs: [La Torre-Madhusudan-Parlato, CAV’10, FIT’12]
[Figure: threads T1, T2, …, Tm with interfaces in_1, in_2, in_3 and out_1, out_2, out_3]
Correctness of abstractions of several Linux device drivers

50 Lazy translation for Concurrent Boolean programs Eager and Lazy sequentializations can be extended to parameterized programs: [La Torre-Madhusudan-Parlato, CAV’10, FIT’12] Delay-bounded scheduling [Emmi-Qadeer-Rakamaric, POPL’11] -Programs with asynchronous calls (creating tasks) -Each task is executed to completion (no interleaving with other tasks) -Sequentialization is according to a DFS scheduler of tasks -When dispatched, a task can be delayed to next round –the total number of delays in a task-creation tree is bounded by k –total number of explored rounds is k+1 –The beginning of each round is guessed (eager)

51 Lazy translation for Concurrent Boolean programs Eager and Lazy sequentializations can be extended to parameterized programs: [La Torre-Madhusudan-Parlato, CAV’10, FIT’12] Delay-bounded scheduling [Emmi-Qadeer-Rakamaric, POPL’11] General sequentialization [Bouajjani-Emmi-Parlato, SAS’11] -Programs with asynchronous calls -Tasks can be interleaved with other ones -Sequentialization based on –DAGs of contexts –Composition and compression operations –Bound on the size of the DAGs –Generalizes the k-round eager and the delay-bounded scheduling sequentializations
[Figure: a DAG of contexts with nodes a, b, c, d, e]

52 More Sequentializations Eager and Lazy sequentializations can be extended to parameterized programs: [La Torre-Madhusudan-Parlato, CAV’10, FIT’12] Delay-bounded scheduling [Emmi-Qadeer-Rakamaric, POPL’11] General sequentialization [Bouajjani-Emmi-Parlato, SAS’11] Scope-bounded sequentialization [La Torre-Napoli-Parlato, FSTTCS’12] Budget-bounded sequentialization [Abdulla-Atig-Rezine-Stenman, FMCAD’12] Sequentialization for proving correctness [Garg-Madhusudan, TACAS’11] Bounded-phase sequentialization of message passing programs [Bouajjani-Emmi, TACAS’12] Eager sequentialization for periodic programs [Chaki-Gurfinkel-Sinha, FMCAD’14], [Chaki-Gurfinkel-Strichman, FMCAD’13] …

53 Conclusions

54 Sequentialization is an effective approach to analyze concurrent programs
–Fast prototyping
–Re-use of mature technologies (tools designed for sequential programs)
–Code-to-code translation
Presented translations:
–keep track only of the local state of the current thread (no cross product)
–keep track of a finite number of copies of shared states
–thread creation is implemented with calls
KISS is a lazy sequentialization that allows an unbounded number of context-switches, but does not allow even a bounded number of context-switches between any two threads
Bounded context-switches:
-Eager translations require guessing the values of the shared variables and may explore unreachable states
-Lazy translations preserve the invariants but introduce many recursive calls (thread re-computations)

55 Tomorrow’s lecture Sequentializations for Bounded Model Checking backends: Lazy-CSeq [ Inverso-Tomasco-Fischer-La Torre-Parlato, CAV’14 ] MU-CSeq [ Tomasco-Inverso-Fischer-La Torre-Parlato, TACAS’15 ] Framework for developing sequentializations of concurrent C programs: users.ecs.soton.ac.uk/gp4/cseq/

