Presentation is loading. Please wait.

Presentation is loading. Please wait.

Automated Formal Verification of Software Randal E. Bryant Carnegie Mellon University.

Similar presentations


Presentation on theme: "Automated Formal Verification of Software Randal E. Bryant Carnegie Mellon University."— Presentation transcript:

1 Automated Formal Verification of Software http://www.cs.cmu.edu/~bryant Randal E. Bryant Carnegie Mellon University

2 – 2 – The Dream … and the Reality Programs Should be Formally Specified and Proved 1960s: Logical foundations of programs Floyd, Hoare, Dijkstra, Scott … 1970s: Attempts to automate proof process Based on automated theorem provers Intensive human effort required, even for small programs 1980s, 90s: General disillusionment Hard to fully describe what program should do Software is far to complex to verify completely

3 – 3 – Changing Software Environment Early IT Systems Expert users Understood and expected failures Modern Environment More demanding users Nontechnical people less tolerant of errors Safety critical systems Computers control everything from pacemakers to stock exchanges Hostile operating conditions Systems under constant attack by malicious forces Software errors make systems vulnerable

4 – 4 – Use of Formal Verification for Hardware Environment Higher need for first-time success Simulation speeds limit amount of testing achievable Equivalence Checkers Prove that low-level implementation matches high-level description Routinely performed on all designs Component Verification Floating-point units, memory arrays, … Common practice for large manufacturers Model Checking Automatically check specific properties of system

5 – 5 – Temporal Logic Model Checking Verify Reactive Systems Construct state machine representation of concurrent system Nondeterminism expresses range of possible behaviors Express desired behavior as formula in temporal logic Determine whether or not property holds Traffic Light Controller Design Traffic Light Controller Design “It is never possible to have a green light for both N-S and E-W.” Model Checker True False + Counterexample

6 – 6 – Hardware Modeling Example Multiprocessor Memory System Coherent Caching Can have multiple read-only copies of word Writing requires getting exclusive copy Interface Cluster #2Cluster #3 Interface Mem.Cache Global Bus Cluster #1 Bus Proc. Arbitrary reads & writes

7 – 7 – Symbolic Model Checking Example K. McMillan, E. Clarke (CMU) J. Schwalbe (Encore Computer) Encore Gigamax Cache System Distributed memory multiprocessor Cache system to improve access time Complex hardware and synchronization protocol.Verification Create “simplified” finite state model of system (10 9 states!) Verify properties about set of reachable states Bug Detected Sequence of 13 bus events leading to deadlock With random simulations, would require  2 years to generate failing case. In real system, would yield MTBF < 1 day.

8 – 8 – Boolean Inference Engines Core Verification Technology Satisfiability Find one solution, or prove none exist More advanced operations Determine all solutions Solve quantified formula Boolean Formula Boolean Formula SAT Checker Satisying solution Unsatisfiable (+ proof) Boolean Formula Boolean Formula BDD Construction All possible solutions

9 – 9 – Recent Progress in SAT Solving

10 – 10 – BDDs: Conceptual Basis Truth TableDecision Tree Vertex represents decision Follow green (dashed) line for value 0 Follow red (solid) line for value 1 Function value determined by leaf value.

11 – 11 – Example BDD Initial GraphReduced Graph Canonical representation of Boolean function  For given variable ordering Two functions equivalent if and only if graphs isomorphic Can be tested in linear time Desirable property: simplest form is canonical. (x 1  x 2 )  x 3

12 – 12 – Representing Complex Functions Functions Add 4-bit words a and b Get 4-bit sum S Carry output bit Cout Shared Representation Graph with multiple roots 31 nodes for 4-bit adder 571 nodes for 64-bit adder Linear growth!

13 – 13 – Model Checking State of Art Symbolic Model Checkers Use combinations of SAT checking and BDDs to represent and manipulate sets of possible system states Can verify systems with over 10 20 states Far too large to represent states explicitlyApplications Standard tool for cache & bus protocol designers Requires sophisticated users Large companies have own tool and support group Some commercial versions available

14 – 14 – Automatic Verification of Software Challenge Apply techniques devised for verifying hardware to softwareStrategy Minimize human effort Do not expect detailed specifications Automated inference techniques Partial verification Only verify limited properties of system Example: Program invulnerable to buffer overflow attacks

15 – 15 – Example: Program Equivalence Checking Do these functions produce identical results? Strategy Represent and reason about bit-level program behavior int abs(int x) { int mask = x>>31; return (x ^ mask) + ~mask + 1; } int test_abs(int x) { return (x < 0) ? -x : x; }

16 – 16 – Bit-Level Program Verification View computer word as 32 separate bit values Each output becomes Boolean function of inputs abs x0x0 x1x1 x2x2 x 31 y0y0 y1y1 y2y2 y 31 x0x0 x1x1 x2x2 x 31 yiyi abs i

17 – 17 – Program Verification Do these functions produce identical results? int bitOr(int x, int y) { return ~(~x & ~y); } int test_bitOr(int x, int y) { return x | y; } y x v1 =~x v2 =~y v3 =v1 & v2 v4 =~v3 v5 =x | y t =v4 == v5 Straight-Line Evaluation

18 – 18 – Symbolic Execution (3-bit word size) x x2x2 10 x1x1 10 x0x0 10 y y2y2 10 y1y1 10 y0y0 10 v1 =~x x2x2 01 x1x1 01 x0x0 01 v2 =~y y2y2 01 y1y1 01 y0y0 01

19 – 19 – Symbolic Execution (cont.) v3 =v1 & v2 x2x2 y2y2 01 x1x1 y1y1 01 x0x0 y0y0 01 v4 =~v3 x2x2 y2y2 10 x1x1 y1y1 10 x0x0 y0y0 10 v5 =x | y x2x2 y2y2 10 x1x1 y1y1 10 x0x0 y0y0 10 t =v4 == v5 1

20 – 20 – Counterexample Generation Find values of x & y for which these programs produce different results int bitOr(int x, int y) { return ~(~x & ~y); } int bitXor(int x, int y) { return x ^ y; } y x v1 =~x v2 =~y v3 =v1 & v2 v4 =~v3 v5 =x ^ y t =v4 == v5 Straight-Line Evaluation

21 – 21 – Symbolic Execution v4 =~v3 x2x2 y2y2 10 x1x1 y1y1 10 x0x0 y0y0 10 v5 =x ^ y x2x2 y2y2 10 y2y2 x1x1 y1y1 10 y1y1 x0x0 y0y0 10 y0y0 t =v4 == v5 x2x2 y2y2 x1x1 y1y1 x0x0 y0y0 01 x = 111 y = 001

22 – 22 – Performance: Good int addXY(int x, int y) { return x+y; } int addYX(int x, int y) { return y+x; }

23 – 23 – Performance: Bad int mulXY(int x, int y) { return x*y; } int mulYX(int x, int y) { return y*x; }

24 – 24 – Why Is Multiplication Slow? Multiplication function intractable for BDDs Exponential growth, regardless of variable orderingBitsAddMult421155 84114560 Multiplication-4 Add-4 Node Counts

25 – 25 – What if Multiplication were Easy? Determine case where factorK(x,y) != zero(x,y) Would enable us to factor numbers int factorK(int x, int y) { int K = XXXX...X; int rangeOK = 1 < x && x <= y; int factorOK = x*y == K; return rangeOK && factorOK; } int zero(int x, int y) { return 0; }

26 – 26 – Evaluation of Bit-Level Checking Application BDD version used on student programming examples Similar capabilities provided by CBMC Clarke, Kroening, Lerda, TACAS ’04 Uses SAT solverStrengths Provides 100% guarantee of correctness Performance very good for simple arithmetic functionsWeaknesses Important integer functions have exponential blowup Not practical for programs that build and operate on large data structures

27 – 27 – Model Checking Software Program is Hard to Model as Finite-State Machine Large number of large data words means lots of bits Although “finite”, bound is very large Recursion requires stack Conceptually unbounded Creating Abstraction Focus on control structure of program Minimal modeling of data representations & operations

28 – 28 – Microsoft SLAM Project http://research.microsoft.com/slam/Application Windows device drivers Check adherence to Windows driver guidelines

29 – 29 – Device Driver Environment Device Drivers Software to support external devices Part of trusted software base Has kernel-level privileges Can easily cause system to crash Provided by device manufacturer As object code Trusted Software Driver #2 Driver #1 Operating System Kernel Device #1 Device #2

30 – 30 – SLAM Operation Abstraction / Refinement Extract Boolean program from C code Like C program, except only Boolean data types Model check to find typical driver bugs Refine model if necessary

31 – 31 – Code Verification Example Adapted by Tom Ball from PCI device driver code do { lock(v); old = new; if (test()) { unlock(v); new++; } } while (new != old); unlock(v); Properties to Check Cannot unlock v unless locked Cannot lock v unless unlocked Must exit code with v unlocked

32 – 32 – Model as Boolean Program Programming Model Retain original control flow Abstract data to be just Boolean values Initial abstraction No data. Branches occur arbitrarily do { lock(); if (*) { unlock(); } } while (*); unlock(); Apparent bug: May call lock twice Apparent bug: May call unlock twice

33 – 33 – Boolean Program Details States encode program point + Variable values do { lock(); NOP if (*) { unlock(); NOP } } while (*); unlock(); !L L L ERR L !L do { lock(v); old = new; if (test()) { unlock(v); new++; } } while (new != old); unlock(v); !L L L: Locked ERR Original Program

34 – 34 – Refining Abstraction Keep generating refinements until find bug or prove doesn’t exist do { lock(); E = true; if (*) { unlock(); E = E?false:*; } } while (!E); unlock(); do { lock(v); old = new; if (test()) { unlock(v); new++; } } while (new != old); unlock(v); L: Locked E: old == new E !L E L E !L E L !E !L E !L !E !L E L ERR Original Program !E !L !E L

35 – 35 – Windows Success Story “Things like even software verification, this has been the Holy Grail of computer science for many decades but now in some very key areas, for example, driver verification we're building tools that can do actual proof about the software and how it works in order to guarantee the reliability” -- Bill Gates, WinHec Conference, 2002 Available in Windows Driver Development Kit Project moved from Microsoft Research into Windows Development Group

36 – 36 – Why the Recent Successes in Software Verification? Need for Improved Software Quality Worldwide network of hackers creates hostile testing environment Improved Technology Program analysis + hardware verification tools Loss of Ambition Bug finding rather than complete verification Vs. earlier attitude: must do complete verification of entire program

37 – 37 – Some Research Challenges Understanding Memory References In C, hard to determine whether two pointers refer to same object Pointer aliasing Real programs construct and manipulate complex data structures Current verification techniques do best with control-intensive applications Working with Large Code Base Runtime libraries Even simple programs build upon extensive library code »Especially Java

38 – 38 – Application Challenge Cannot Apply Directly to Full Scale Design Verify smaller subsystems Verify abstracted versions of full system Must understand system & tool to do effectively System Size Degree of Concurrency Challenging Systems to Design Model checking Capacity


Download ppt "Automated Formal Verification of Software Randal E. Bryant Carnegie Mellon University."

Similar presentations


Ads by Google