SymDiff: A differential program verifier

Shuvendu Lahiri
Research in Software Engineering (RiSE), Microsoft Research, Redmond, WA, USA
Workshop on Program Equivalence (April 2016)

What is SymDiff?
A platform for leveraging and extending program verification to reason about “program differences”.
Contributors: C. Hawblitzel, K. McMillan, O. Strichman, Z. Rakamaric, S. He
Interns: R. Sharma, M. Kawaguchi, H. Rebelo, R. Sinha, N. Partush, A. Gyori,…

Differential verification
Verifying properties of program differences instead of the program itself.
Motivations
  No specs!
  Proving assertions statically is harder (program-specific invariants, environment modeling, ..)
Applications
  Program evolution
  Compilers (equivalence preserving, approximate)
  Comparing independent implementations
Notes: Approximate compilers are a new class of compilers that sacrifice equivalence for performance. Loop perforation, for example, skips every other iteration. This is acceptable in domains where precision does not matter that much (image and video processing), where the new output only has to stay related to the old one, e.g. Output_new = δ * Output_old. An example of independent implementations is two implementations of SSL with no structural similarities (Dawson Engler).

SymDiff architecture
Language agnostic: relies on translators from C/C#/Java/x86 to Boogie (bpl).
[Figure: P1.bpl and P2.bpl, together with a diff spec, feed the product construction, which produces P1P2.bpl. Invariant inference turns it into P1P2.bpl + invs, which is checked by the Boogie program verifier + Z3.]
A bug is a failure of a differential property such as equivalence, DAC, relative termination, etc.
Notes: The spec lives in a separate file. The tool proves a mutual summary of f and f', one per pair of functions. If f' was split into two functions you can still specify it, because the spec is an expression over the inputs/outputs of functions from both sides.

Language
Subset of Boogie (an intermediate verification language) [FMCO’05]
Commands
  x := E
  assert E
  assume E
  S; T
  goto L1, L2, …, Ln              // non-deterministic jump to labels
  call x, y := Foo(e1, e2, ..)    // procedure call
Loops are encoded as tail-recursive procedures.
Arrays can be encoded using the SMT theory of arrays (sel/upd):
  x[e]        →  sel(x, e)
  x[y] := z   →  x := upd(x, y, z)
  x == y      →  ∀ i. sel(x, i) == sel(y, i)    (extensional arrays)
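The sel/upd encoding can be illustrated with a small executable model. This is only a sketch: the Python functions `sel`, `upd` and `ext_equal` are illustrative names, maps are assumed to be total with a default value of 0, and extensional equality is checked over a finite index set rather than proved for all indices as an SMT solver would.

```python
def sel(a, i, default=0):
    """Read index i of map a (total map: missing keys read as default)."""
    return a.get(i, default)

def upd(a, i, v):
    """Return a NEW map equal to a except that index i holds v."""
    b = dict(a)
    b[i] = v
    return b

def ext_equal(a, b, indices, default=0):
    """Extensional equality restricted to a finite set of indices."""
    return all(sel(a, i, default) == sel(b, i, default) for i in indices)

# x[3] := 7 becomes x := upd(x, 3, 7)
x0 = {}
x1 = upd(x0, 3, 7)

# Two different update orders give extensionally equal maps.
m1 = upd(upd({}, 1, 5), 2, 6)
m2 = upd(upd({}, 2, 6), 1, 5)
```

Because `upd` returns a fresh map, the old value `x0` is still available after the update, which mirrors how each assignment in the verification condition introduces a new incarnation of the array variable.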

Modeling imperative programs/heaps
Various translators to Boogie: C (HAVOC/SMACK/VCC/..), Java (Joogie/..), C# (BCT)
E.g., C heap modeling in HAVOC [CHLQ, POPL’09]
  A pointer is represented as an integer (int)
  One heap map per scalar/pointer structure field and pointer type
    struct A { int f; int g; } x;
    Mem_A_f : [int]int
    Mem_A_g : [int]int
Simple example
  C code:  x->f = 1;
  Boogie:  Mem_A_f[x + Offset(f,A)] := 1;
Notes: There is one map for each field of the structure; A is the name of the structure.
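A rough executable picture of the field-per-map heap model is given here. It is a sketch under assumptions: the byte offsets in `OFFSET` and the address `1000` are hypothetical, and plain Python dictionaries stand in for the Boogie maps `Mem_A_f` and `Mem_A_g`.

```python
# Assumed layout of struct A { int f; int g; } (hypothetical offsets).
OFFSET = {("f", "A"): 0, ("g", "A"): 4}

def offset(field, struct):
    """Counterpart of Offset(field, struct) in the Boogie translation."""
    return OFFSET[(field, struct)]

def upd(mem, addr, value):
    """Functional map update: returns a new map."""
    new_mem = dict(mem)
    new_mem[addr] = value
    return new_mem

def sel(mem, addr):
    """Map read; unwritten addresses read as 0 in this sketch."""
    return mem.get(addr, 0)

# One map per field of struct A.
Mem_A_f = {}
Mem_A_g = {}

x = 1000  # a pointer is just an integer (hypothetical address)

# C: x->f = 1;   Boogie: Mem_A_f[x + Offset(f,A)] := 1;
Mem_A_f = upd(Mem_A_f, x + offset("f", "A"), 1)
```

Since each field has its own map, the write to `f` cannot change anything read through `Mem_A_g`, so no aliasing reasoning between the two fields is needed.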

Differential specs: mutual summaries
  void F1(int x1){ if(x1 < 100){ g1 := g1 + x1; F1(x1 + 1); }}
  void F2(int x2){ if(x2 < 100){ g2 := g2 + 2*x2; F2(x2 + 1); }}
  MS(F1, F2): (x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1' <= g2'
A mutual summary is a specification over the I/O vocabulary of (F1, F2):
  Inputs: parameters, globals (g)
  Outputs: return values, next state of globals (g')
[Hawblitzel, Kawaguchi, Lahiri, Rebelo CADE’13]
Notes: This slide shows how the specification is written; the MS(F1, F2) line is the formal definition.
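To make the meaning of MS(F1, F2) concrete, the two procedures can be executed and the summary checked on sample inputs. This is a bounded test, not a proof, and the Python encoding (globals passed in and returned explicitly, helper name `ms_holds`) is an assumption made for the sketch.

```python
def F1(x1, g1):
    """Returns g1', the final value of global g1."""
    if x1 < 100:
        return F1(x1 + 1, g1 + x1)
    return g1

def F2(x2, g2):
    """Returns g2', the final value of global g2."""
    if x2 < 100:
        return F2(x2 + 1, g2 + 2 * x2)
    return g2

def ms_holds(x1, g1, x2, g2):
    """(x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1' <= g2' on one input pair."""
    pre = (x1 == x2) and (g1 <= g2) and (x1 >= 0)
    if not pre:
        return True  # implication holds vacuously
    return F1(x1, g1) <= F2(x2, g2)

# Bounded check over a small grid of inputs.
all_ok = all(
    ms_holds(x, g1, x, g2)
    for x in range(0, 105)
    for g1 in range(-3, 4)
    for g2 in range(-3, 4)
)
```

The conjunct `x1 >= 0` matters: starting both procedures at `x = -200` with `g1 = g2 = 0` yields `g1' = -15150` and `g2' = -30300`, so `g1' <= g2'` fails without it.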

And now... verification.
Differential verification ⇒ single-program verification
  Leverage existing verification machinery: VC generation, SMT solvers
  Invariant inference to infer intermediate specifications
A novel product (P1 x P2) construction [FSE’13]
  Allows interprocedural reasoning
  Synchronizes at procedure boundaries only
  Can map one procedure to many procedures
Notes: Symmetric products, by contrast, rely on similarity of the CFGs.

The product program
  proc f1(x1): r1 modifies g1 { s1; L1: w1 := call h1(e1); t1 }    (instrument calls)
  proc f2(x2): r2 modifies g2 { s2; L2: w2 := call h2(e2); t2 }    (instrument calls)
  Product f1_f2: replay, constrain, restore
Notes: Suppose we have two procedures f1 and f2 that call procedures h1 and h2. We can compose them to obtain a joint procedure for f1 and f2. The details are in the paper, but the most important part of this composition is that the joint procedure of f1 and f2 calls the joint procedure of h1 and h2. That second call, to h1_h2, is only required for the proof: it may carry a specification, which is a postcondition of h1_h2. The result is the product f1_f2, which can be handed to any invariant-generation tool. This transformation helps us prove the following result.

Property of the product
Let p1_p2 be the product of (p1, p2).
Theorem: If S_p1_p2 is a summary of p1_p2, then it is a mutual summary of (p1, p2).
Aids in differential specification
  A specification of the summary S_p1_p2 (e.g. partial equivalence) can be added as a postcondition of p1_p2
More importantly, aids differential invariant inference
  Can perform analysis on the program P1xP2 to infer sound mutual summaries of (P1, P2)
Notes: This also lets us infer the summaries of intermediate procedures.

Automatic differential invariant inference
Performing invariant inference on the product program
Experience with Duality (infers invariants)
  Diverges, e.g., ((x1 = 0 ∧ x2 = 1) ∨ (x1 = 1 ∧ x2 = 2) ∨ …) instead of (x1 < x2)
Current approach is based on predicate abstraction
  Infer a Boolean combination of predicates, or
  Houdini: conjunction over a predefined set of predicates
Automatically provide all simple differential predicates:
  x1 ⋈ x2, where x1 ∈ p1, x2 ∈ p2, ⋈ ∈ {=, ≤, ≥, ⇒, …}
Notes: With Houdini, you have to provide the predicates.
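The Houdini idea (start from all candidate predicates and repeatedly drop those that are not preserved, until the remaining conjunction is inductive) can be sketched as follows. This is not how SymDiff implements it: the real tool discharges each check with Boogie and Z3, whereas this sketch substitutes exhaustive enumeration over a small grid of states. The product loop in `step`, the bound `MAX`, and the function name `houdini` are all hypothetical.

```python
MAX = 4  # hypothetical wrap-around bound for the second program's counter

def step(state):
    """One iteration of a hypothetical product loop over (x1, x2)."""
    x1, x2 = state
    x2_next = 0 if x2 + 1 >= MAX else x2 + 1
    return [(x1 + 1, x2_next)]

def houdini(candidates, init_states, states, step):
    """Return the largest subset of candidates whose conjunction holds
    initially and is preserved by step on every enumerated state."""
    kept = {name: pred for name, pred in candidates.items()
            if all(pred(s) for s in init_states)}
    changed = True
    while changed:
        changed = False
        for s in states:
            if not all(pred(s) for pred in kept.values()):
                continue  # s violates the current conjunction
            for t in step(s):
                falsified = [n for n, pred in kept.items() if not pred(t)]
                for name in falsified:
                    del kept[name]
                    changed = True
    return kept

# Simple differential predicates x1 ⋈ x2 for ⋈ in {=, ≤, ≥}.
candidates = {
    "x1 == x2": lambda s: s[0] == s[1],
    "x1 <= x2": lambda s: s[0] <= s[1],
    "x2 <= x1": lambda s: s[1] <= s[0],
}
grid = [(a, b) for a in range(10) for b in range(10)]
result = houdini(candidates, [(0, 0)], grid, step)
```

On this example only `x2 <= x1` survives: the step from `(3, 3)` to `(4, 0)` falsifies the other two candidates, and the remaining predicate is preserved on every enumerated state.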

Applications (1 / 3) Equivalence checking
Compiler translation validation [CADE’13]
Cross-version compiler validation by comparing binaries [FSE’13]
Notes: Translation from binary to Boogie. Semantic slicing. Taint: things that changed.

Example (equivalence checking)
  f1(n) { if (n == 0) { return 1; } else { return n * f1(n - 1); } }
  main(n) { return f1(n); }

  f2(n, a) { if (n == 0) { return a; } else { return f2(n - 1, a * n); } }
  main(n) { return f2(n, 1); }

Spec: MS(main1, main2) = (n1 == n2 ⇒ r1' == r2')
MS(f1, f2): (n1 == n2 ⇒ a2*r1' == r2')
  Involves non-trivial diff specs
  The user only provides the predicate (a2*r1' == r2')
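The mutual summary for the two factorial variants can be sanity-checked by running both versions. This is a bounded test on non-negative `n` only (both functions diverge on negative inputs), and the Python transcription is an assumption of the sketch, not SymDiff input.

```python
def f1(n):
    """Plain recursive factorial."""
    if n == 0:
        return 1
    return n * f1(n - 1)

def f2(n, a):
    """Accumulator-passing (tail-recursive) factorial."""
    if n == 0:
        return a
    return f2(n - 1, a * n)

def main1(n):
    return f1(n)

def main2(n):
    return f2(n, 1)

# MS(f1, f2): n1 == n2 ==> a2 * r1' == r2'
ms_f = all(a * f1(n) == f2(n, a) for n in range(20) for a in range(-3, 4))

# MS(main1, main2): n1 == n2 ==> r1' == r2'
ms_main = all(main1(n) == main2(n) for n in range(20))
```

The summary for `main` follows from the summary for `f1`/`f2` by instantiating `a2 = 1`, which is why the intermediate predicate `a2*r1' == r2'` is the only thing the user has to supply.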

Applications (2 / 3) Differential assertion checking [FSE’13]

Differential Assertion Checking (DAC)
Lahiri et al. FSE’13; Joshi, Lahiri, Lal POPL’12
Correctness → Relative correctness
An input that can* satisfy p cannot fail p'. Note: this is an asymmetric check.
How?
  Replace assert A with ok := ok && A;
  Write a mutual summary: MS(f1, f2) = ((i1 == i2 && ok1') ==> ok2')
Notes: The original formulation was (i1 == i2 && g1 == g2) ==> (ok1 == ok2). Here we instead treat globals as part of the inputs.
* Because of nondeterminism.
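The `ok` instrumentation and the asymmetry of the check can be illustrated with a toy pair of procedures. Everything here is hypothetical: `div_old` and `div_new` are made-up versions of a routine with an assertion `d != 0`, and `dac_holds` tests the mutual summary on a finite set of inputs instead of proving it.

```python
def div_old(n, d):
    """Old version: 'assert d != 0' rewritten as ok := ok && (d != 0)."""
    ok = True
    ok = ok and (d != 0)
    return ok  # ok' after the call

def div_new(n, d):
    """Patched version: returns early when d == 0, so the assert is not reached."""
    ok = True
    if d == 0:
        return ok
    ok = ok and (d != 0)
    return ok

def dac_holds(f1, f2, inputs):
    """MS(f1, f2) = (i1 == i2 && ok1') ==> ok2', tested on the given inputs."""
    return all((not f1(*i)) or f2(*i) for i in inputs)

inputs = [(n, d) for n in range(-2, 3) for d in range(-2, 3)]
forward = dac_holds(div_old, div_new, inputs)   # new relative to old
backward = dac_holds(div_new, div_old, inputs)  # old relative to new
```

The forward check succeeds while the backward check fails on `d == 0`, which shows why the property is stated in one direction only.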

DAC application: verifying bug fixes
Does a fix inadvertently introduce new bugs?
Verisec suite: “snippets of open source programs which contain buffer overflow vulnerabilities, and corresponding patched versions.”
  Relative memory safety (e.g. buffer overflow) checking
  Snippets from apache, madwifi, sendmail, etc.
  Verified several bug fixes using automatic differential invariant inference
Notes: First, let us talk about verifying bug fixes. We want to know whether a bug fix can inadvertently introduce new errors. For this case study we use the Verisec benchmark suite, which has buggy and patched versions of snippets of open source software. Since the bugs are buffer overflow errors, we validate relative buffer overflows in the patched version w.r.t. the buggy version. Our tool automatically proves the relative correctness of these snippets, ensuring that no new buffer overflow vulnerability was introduced by the fix. For more details please refer to the paper; to give an idea of these patches, I will show an example.

Example: DAC
  int main_buggy() {
    …
    fb = 0;
    while ((c1 = read()) != EOF) {
      fbuf[fb] = c1;            // buffer overflow: fb is never bounded
      fb++;
    }
  }

  int main_patched() {
    …
    fb = 0;
    while ((c1 = read()) != EOF) {
      fbuf[fb] = c1;
      fb++;
      if (fb >= MAX) fb = 0;    // the fix: reset fb at the bound
    }
  }

Differential loop invariant: fb2 ≤ fb1
Can verify (relative) memory safety automatically, without manual preconditions.
Notes: This is the buggy version of one of the benchmarks in the Verisec suite. The variable fb increases beyond its bound and can overflow fbuf. In the patched version, fb is reset to zero when it reaches the bound. Houdini is able to infer that fb of the patched version is always less than or equal to fb of the buggy version, and uses this invariant to automatically prove the relative buffer overflow specification.
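The differential invariant and the relative safety property can be observed by running both loops in lockstep on the same input. This is a simulation, not a proof: `MAX = 8` is an assumed buffer size, the input stream is a plain string, and the bounds check on each write is modeled with the `ok` flags used in differential assertion checking.

```python
MAX = 8  # assumed size of fbuf

def run_both(stream):
    """Run the buggy and patched loops in lockstep on the same input.
    Returns (ok1, ok2, inv): the two safety flags and whether the
    differential invariant fb2 <= fb1 held after every iteration."""
    fb1 = fb2 = 0
    ok1 = ok2 = True
    inv = True
    for _ in stream:
        # buggy version: write fbuf[fb1], then fb1++
        ok1 = ok1 and (0 <= fb1 < MAX)
        fb1 += 1
        # patched version: write fbuf[fb2], fb2++, reset at the bound
        ok2 = ok2 and (0 <= fb2 < MAX)
        fb2 += 1
        if fb2 >= MAX:
            fb2 = 0
        inv = inv and (fb2 <= fb1)
    return ok1, ok2, inv

results = [run_both("a" * n) for n in range(30)]
```

For inputs longer than `MAX` characters the buggy version's flag `ok1` becomes false while `ok2` stays true, so the implication `ok1' ==> ok2'` holds on every run.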

Applications (3/3) Current research:
Verifying approximate transformations
  Preserve assertions, termination and accuracy [w/ Rakamaric, He]
Semantic change impact analysis
  Injecting change semantics into a dataflow-based taint analysis [w/ Partush, Gyori]
Notes: Translation from binary to Boogie. Semantic slicing. Taint: things that changed.
