Modular Data Structure Verification Viktor Kuncak Supervisor: Martin Rinard Committee members: Arvind, Daniel Jackson.


1 Modular Data Structure Verification Viktor Kuncak Supervisor: Martin Rinard Committee members: Arvind, Daniel Jackson

2 Program analysis and verification Discover/verify properties of software systems Practical relevance: programmer productivity –performance: compiler optimizations –reliability: discovering and preventing errors –maintainability: understanding code Ultimate impact: –make it easier to produce working software –create more sophisticated systems

3 Spectrum of analysis techniques Broad research area, many dimensions –bug finding versus bug prevention –control-intensive versus data-intensive systems –generic versus application-specific properties Original ideal was automated full verification Reality: verify partial correctness properties – success story: type systems – active area: temporal properties (typestate) trend: towards more complex properties

4 My research verifying properties of data structures

5 Data structure consistency properties next prev next prev root acyclicity of next x.next.prev == x right left graph is a tree shape not given by types, but by structural properties; may change over time unbounded number of objects, dynamically allocated right left class Node { Node f1, f2; } elements are sorted
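As a rough illustration of two of the properties on this slide (acyclicity of next, and x.next.prev == x), the sketch below checks them at run time on one concrete heap. This is only a dynamic check for a single state, not the static proof for all executions that the talk aims at. The names DllNode, DllInvariants and build are hypothetical and do not come from the slides.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical node type; the slide itself only shows `class Node { Node f1, f2; }`.
final class DllNode {
    DllNode next, prev;
}

final class DllInvariants {
    // True when following `next` from root never revisits a node.
    static boolean nextIsAcyclic(DllNode root) {
        Map<DllNode, Boolean> seen = new IdentityHashMap<>();
        for (DllNode n = root; n != null; n = n.next) {
            if (seen.put(n, Boolean.TRUE) != null) return false;
        }
        return true;
    }

    // True when x.next.prev == x for every reachable x that has a successor.
    static boolean prevInvertsNext(DllNode root) {
        if (!nextIsAcyclic(root)) return false; // avoid walking a cycle forever
        for (DllNode n = root; n != null; n = n.next) {
            if (n.next != null && n.next.prev != n) return false;
        }
        return true;
    }

    // Builds a well-formed doubly-linked list with `length` nodes (helper for experiments).
    static DllNode build(int length) {
        DllNode head = null, tail = null;
        for (int i = 0; i < length; i++) {
            DllNode n = new DllNode();
            if (head == null) { head = n; } else { tail.next = n; n.prev = tail; }
            tail = n;
        }
        return head;
    }
}
```

A checker like this only reports on the heap it is given; the verifier described in the talk instead proves that every reachable heap satisfies the property.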

6 Data structure consistency properties next first 3 size value of size field is the number of stored objects table key value node is stored in the bucket given by the hash of node’s key hashCode dynamically allocated arrays numerical quantities Examples of internal data structure consistency properties instances do not share array

7 External data structure consistency Simple Library System: Book loanedTo Person, with multiplicities [0..4] and [0..1] If a book is loaned to a person, then – book is in the catalog – person is registered with library A person can borrow at most 4 books at a time A book can be loaned to at most one person at a time External consistency properties: - correlate different data structures - global - meaningful to users of the system - capture design constraints (object models) - inconsistency can lead to policy violations They rely on internal consistency to be even meaningful

8 Both static and dynamic properties Invariant properties: talk about single state –data structure invariants hold State change properties: correlate multiple states –operations have the expected effect add operation inserts element into a set removal removes all elements with a given key operations have no unintended side effects –expected sequencing of operations can remove only after adding elems to data structure

9 Goal Prove data structure properties –for all program executions (sound) –both internal and external consistency –both invariant and state change properties –both implementation and use of data structures also absence of run-time errors with high level of automation

10 Proving data structure properties
Java source code of a program:
... proc remove(x : Node) { Node p=x.prev; n=x.next; if (p!=null) p.next = n; else root = n; if (n!=null) n.prev = p; } ...
data structure properties (Isabelle/HOL): (x,y) ∈ r → x ∈ A ∧ y ∈ B [diagram: relation r between sets A and B]
automated verifier reports: program satisfies the properties, or error in program (or property)

11 Challenges in verifying consistency complex heterogeneous data structures, in the context of application; developer-defined properties precision no single approach will work communication with developers scalability

12 Contributions: Jahob verification system Field constraint analysis front end, verification condition generator Boolean Algebra with Presburger Arithmetic (BAPA) decision procedure dispatcher parsing, type checking, intermediate forms, variable dependencies modular verification methodology method to deploy multiple reasoning techniques Translation to first-order logic three complementary reasoning techniques splitting proof obligations, dispatching each result 1 2 3 4 5 program property

13 Library system example
If a person has borrowed a book, then – book is in the catalog – person is registered with library
On the concrete data structures: ∀ b ∀ p. (∃ i < M. ∃ n. (loanedTable[i], n) ∈ next* ∧ n.key = b ∧ n.value = p) → (bookTreeRoot, b) ∈ (left ∪ right)* ∧ (personListRoot, p) ∈ next*
Isolate data structure complexity into separate Java classes (TreeCollection implementation, Map implementation, ListCollection implementation)
Then verify: 1. properties hold for simplified system w/ sets and relations 2. classes correctly implement sets and relations
On sets and relations (Book, Person, loanedTo): ∀ b ∀ p. (b, p) ∈ loanedTo → b ∈ Book ∧ p ∈ Person

14 1. Verifying high-level properties
Property over Book, Person, loanedTo: ∀ b ∀ p. (b, p) ∈ loanedTo → b ∈ Book ∧ p ∈ Person
class LibrarySystem {
  ListCollection person; TreeCollection book; Map loanedTo;
  public void decomissionBook(Book b) { books.remove(b); loanedTo.remove(b); }
}
class TreeCollection { specvar tcontent :: obj set; public void remove(Object x) ensures tcontent = old tcontent – {x}
class Map { specvar mcontent ::(obj*obj)set; public void remove(Object key) ensures mcontent = old mcontent – {(k,v).k=key}
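To make the high-level property concrete, here is a minimal executable sketch of the simplified system with sets and relations. It models books and persons as string sets and loanedTo as a map; these representations, and the names LibraryModel, invariantHolds and decommissionBookBuggy, are assumptions for illustration and are not part of Jahob or the slides. The buggy variant shows why the loan record must be removed together with the book.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class LibraryModel {
    final Set<String> books = new HashSet<>();
    final Set<String> persons = new HashSet<>();
    // loanedTo as a map from book to borrower, so a book has at most one borrower.
    final Map<String, String> loanedTo = new HashMap<>();

    // Invariant from the slide: every (b, p) in loanedTo has b in books and p in persons.
    boolean invariantHolds() {
        for (Map.Entry<String, String> e : loanedTo.entrySet()) {
            if (!books.contains(e.getKey()) || !persons.contains(e.getValue())) return false;
        }
        return true;
    }

    // Mirrors the slide: remove the book from the catalog and from loanedTo.
    void decommissionBook(String b) {
        books.remove(b);
        loanedTo.remove(b);
    }

    // Deliberately incomplete variant that forgets the loan record.
    void decommissionBookBuggy(String b) {
        books.remove(b);
    }
}
```

Running invariantHolds after each operation is only a test on one state; the verification condition shown later in the talk establishes the same fact for all states satisfying the precondition.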

15 2. Verifying Map implementation
class Map { // Implemented as a hash table
  private AssocList[] table;
  public specvar mcontent ::“(obj*obj) set”;
  invariant contentDef: mcontent = {(k,v). ∃ i ≤ M. (k,v) ∈ table[i].acontent}
  invariant correctBucket: ∀ k v. ∀ i ≤ M. (k,v) ∈ table[i].acontent → hash k M = i
  public void remove(Object key)
    requires key ≠ null
    ensures mcontent = old mcontent – {(k,v).k=key}
  {
    int hash = compute_hash(key);
    table[hash] = AssocList.removeAll(key, table[hash]);
    mcontent := old mcontent – {(k,v).k=key}
  }
  ...

16 2. Verifying association list implementation
class AssocList { // Functional linked list
  private Object key, data;
  private AssocList next;
  specvar acontent ::(obj*obj) set;
  invariant contentDef2: this ≠ null → acontent = {(key,data)} ∪ next.acontent
  static AssocList removeAll(Object key, AssocList list)
    requires key ≠ null
    modifies content
    ensures result.acontent = list.acontent – {(k,v).k=key}
  {
    if (list==null) return null;
    if (key==list.key) return removeAll(key,list.next);
    else return cons(list.key,list.data, removeAll(key,list.next));
  }
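The association list above can be approximated in plain Java without Jahob annotations. In the sketch below the specification variable acontent is replaced by an ordinary method content that computes the set of pairs, so the ensures clause can be tested rather than proved. The class name FunAssocList and the content helper are hypothetical; keys are compared with == as on the slide.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FunAssocList {
    final Object key, data;
    final FunAssocList next;

    private FunAssocList(Object key, Object data, FunAssocList next) {
        this.key = key;
        this.data = data;
        this.next = next;
    }

    // Prepends a pair; existing cells are shared, never mutated.
    static FunAssocList cons(Object key, Object data, FunAssocList tail) {
        return new FunAssocList(key, data, tail);
    }

    // Returns a list without any pair whose key is reference-equal to `key`.
    static FunAssocList removeAll(Object key, FunAssocList list) {
        if (list == null) return null;
        FunAssocList rest = removeAll(key, list.next);
        return key == list.key ? rest : cons(list.key, list.data, rest);
    }

    // Executable stand-in for the specification variable acontent.
    static Set<List<Object>> content(FunAssocList list) {
        Set<List<Object>> pairs = new HashSet<>();
        for (FunAssocList n = list; n != null; n = n.next) {
            pairs.add(Arrays.asList(n.key, n.data));
        }
        return pairs;
    }
}
```

Because the list is functional, removeAll leaves its argument untouched, which is what lets the postcondition refer to list.acontent without an old qualifier.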

17 Modular verification summary TreeCollection implementation Map implementation ListCollection implementation Association list implementation Library example Key benefits of modular verification – each individual verification task simpler – verification results for collections and maps are reusable (repositories of verified data structures)

18 Jahob verification system Field constraint analysis front end, verification condition generator Boolean Algebra with Presburger Arithmetic (BAPA) decision procedure dispatcher parsing, type checking, intermediate forms, variable dependencies splitting proof obligations, dispatching each result modular verification methodology method to deploy multiple reasoning techniques Translation to first-order logic three complementary reasoning techniques

19 front end, verification condition generator Reducing verification to validity of formulas annotated code verification condition formula validity checker valid Verification condition (VC) – a logical formula saying: “If precondition holds at entry, then postcondition holds in the final state, invariants are preserved, and there are no run-time errors” invalid program satisfies properties error in program or property !

20 Formula validity checking in Jahob Field constraint analysis front end, verification condition generator Boolean Algebra with Presburger Arithmetic (BAPA) decision procedure dispatcher parsing, type checking, intermediate forms, variable dependencies splitting proof obligations, dispatching each result, approximating HOL formulas modular verification methodology method to deploy multiple reasoning techniques Translation to first-order logic formula validity checker

21 What do verification conditions look like?
annotated code:
invariant ∀ b. b ∈ books → b ≠ null
invariant ∀ b p. (b,p) ∈ loanedTo → b ∈ books ∧ p ∈ persons
public void decomissionBook(Book b1) requires b1 ∈ books { books.remove(b1); loanedTo.remove(b1); }
verification condition - an Isabelle formula:
(∀ b. b ∈ books → b ≠ null) ∧ (∀ b p. (b,p) ∈ loanedTo → b ∈ books ∧ p ∈ persons) ∧ b1 ∈ books → (books1 = books - {b1} → b1 ≠ null ∧ (loanedTo1 = loanedTo - {(b,p).b=b1} → (∀ b. b ∈ books1 → b ≠ null) ∧ (∀ b p. (b,p) ∈ loanedTo1 → b ∈ books1 ∧ p ∈ persons)))

22 Interactively proving VCs in Isabelle
lemma verificationCondition: “(∀ b. b ∈ books → b ≠ null) ∧ (∀ b p. (b,p) ∈ loanedTo → b ∈ books ∧ p ∈ persons) ∧ b1 ∈ books → (books1 = books - {b1} → b1 ≠ null ∧ (loanedTo1 = loanedTo - {(b,p).b=b1} → (∀ b. b ∈ books1 → b ≠ null) ∧ (∀ b p. (b,p) ∈ loanedTo1 → b ∈ books1 ∧ p ∈ persons)))”
apply (rule_tac impI) apply (rule_tac conjI) apply (rule_tac impI) ... done
Interactive = user supplies proof script
Isabelle checks manually supplied proof
Automation limited for larger formulas

23 Can we check VCs with more automation?
verification condition - an Isabelle formula:
(∀ b. b ∈ books → b ≠ null) ∧ (∀ b p. (b,p) ∈ loanedTo → b ∈ books ∧ p ∈ persons) ∧ b1 ∈ books → (books1 = books - {b1} → b1 ≠ null ∧ (loanedTo1 = loanedTo - {(b,p).b=b1} → (∀ b. b ∈ books1 → b ≠ null) ∧ (∀ b p. (b,p) ∈ loanedTo1 → b ∈ books1 ∧ p ∈ persons)))
splitting into conjuncts gives sequents 1 2 3 4 of the form A1 ∧ ... ∧ An → G
Sequent3: (∀ b. b ∈ books → b ≠ null) ∧ (∀ b p. (b,p) ∈ loanedTo → b ∈ books ∧ p ∈ persons) ∧ b1 ∈ books ∧ books1 = books - {b1} ∧ loanedTo1 = loanedTo - {(b,p).b=b1} ∧ (b0,p0) ∈ loanedTo1 → b0 ∈ books1
[Diagram: sequents S1, S2, Sequent3, S4 are dispatched to multiple reasoning techniques (Reasoning Technique 1, RT 2, RT 3, RT 4); Sequent3 is reported valid]
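The splitting step can be sketched on a toy formula language with atoms, conjunction and implication. This is not Jahob's implementation; it is a minimal sketch, assuming that splitting pushes premises of implications into the assumption list and emits one sequent per conjunct of the goal. All names (VcSplitter, Fml, split) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class VcSplitter {
    // Minimal formula syntax: atoms, conjunction, implication.
    static abstract class Fml {}
    static final class Atom extends Fml {
        final String name;
        Atom(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }
    static final class And extends Fml {
        final Fml left, right;
        And(Fml left, Fml right) { this.left = left; this.right = right; }
        @Override public String toString() { return "(" + left + " & " + right + ")"; }
    }
    static final class Imp extends Fml {
        final Fml premise, conclusion;
        Imp(Fml premise, Fml conclusion) { this.premise = premise; this.conclusion = conclusion; }
        @Override public String toString() { return "(" + premise + " -> " + conclusion + ")"; }
    }

    static Fml atom(String n) { return new Atom(n); }
    static Fml and(Fml l, Fml r) { return new And(l, r); }
    static Fml imp(Fml p, Fml c) { return new Imp(p, c); }

    // Splits a VC into sequents, each rendered as "A1, ..., An |- G".
    static List<String> split(Fml vc) {
        List<String> out = new ArrayList<String>();
        split(new ArrayList<Fml>(), vc, out);
        return out;
    }

    private static void split(List<Fml> hyps, Fml goal, List<String> out) {
        if (goal instanceof And) {
            // A conjunction in the goal yields one proof obligation per conjunct.
            split(hyps, ((And) goal).left, out);
            split(hyps, ((And) goal).right, out);
        } else if (goal instanceof Imp) {
            // The premise of an implication becomes additional assumptions.
            List<Fml> extended = new ArrayList<Fml>(hyps);
            addConjuncts(((Imp) goal).premise, extended);
            split(extended, ((Imp) goal).conclusion, out);
        } else {
            out.add(render(hyps) + " |- " + goal);
        }
    }

    private static void addConjuncts(Fml f, List<Fml> into) {
        if (f instanceof And) {
            addConjuncts(((And) f).left, into);
            addConjuncts(((And) f).right, into);
        } else {
            into.add(f);
        }
    }

    private static String render(List<Fml> hyps) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hyps.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(hyps.get(i));
        }
        return sb.toString();
    }
}
```

Each resulting sequent is small enough to be attempted independently, which is what allows different sequents to go to different reasoning techniques.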

24 Constructing a reasoning technique How can a specialized technique accept Isabelle formulas? sequent - an Isabelle formula A1 ∧ ... ∧ An → G belongs to an undecidable class; specialized algorithm expects as input e.g. formula in a decidable class (or otherwise “easier” class). Jahob reasoning technique: formula approximation soundly approximates formula with a simpler formula A1’ ∧ A3 → G’, which the specialized algorithm reports valid

25 Range of sound approximations
Worst: a(F) = False (useless)
Best: a(F) = if “F is valid” then True else False (impossible)
General idea of our approximations:
a(F) = a_1(simplify(F))
a_p(F1 ∧ F2) = a_p(F1) ∧ a_p(F2)
a_p(F1 ∨ F2) = a_p(F1) ∨ a_p(F2)
a_p(¬F) = ¬ a_{¬p}(F)
a_p(goodF) = translation of goodF
a_1(badF) = False
a_0(badF) = True
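The polarity-based rules can be tried out on propositional formulas. The sketch below is an illustration under simplifying assumptions: atoms are either "good" (translatable) or "bad" (not expressible in the target logic), both are treated as Boolean variables for evaluation, and the simplify step is omitted. The class Approx and its methods are hypothetical names. The isSound method checks by brute force that a_1(F) implies F under every assignment, which is the soundness property the slide relies on.

```java
final class Approx {
    enum Kind { GOOD, BAD, AND, OR, NOT, TRUE, FALSE }

    static final class F {
        final Kind kind;
        final int var;       // variable index for GOOD and BAD atoms, -1 otherwise
        final F left, right; // NOT uses only left
        F(Kind kind, int var, F left, F right) {
            this.kind = kind; this.var = var; this.left = left; this.right = right;
        }
    }

    static final F TRUE = new F(Kind.TRUE, -1, null, null);
    static final F FALSE = new F(Kind.FALSE, -1, null, null);

    static F good(int v) { return new F(Kind.GOOD, v, null, null); }
    static F bad(int v) { return new F(Kind.BAD, v, null, null); }
    static F and(F l, F r) { return new F(Kind.AND, -1, l, r); }
    static F or(F l, F r) { return new F(Kind.OR, -1, l, r); }
    static F not(F f) { return new F(Kind.NOT, -1, f, null); }

    // a_p(F): positive == true corresponds to a_1, positive == false to a_0.
    static F approx(boolean positive, F f) {
        switch (f.kind) {
            case AND: return and(approx(positive, f.left), approx(positive, f.right));
            case OR:  return or(approx(positive, f.left), approx(positive, f.right));
            case NOT: return not(approx(!positive, f.left)); // negation flips polarity
            case BAD: return positive ? FALSE : TRUE;
            default:  return f; // GOOD atoms and constants translate to themselves
        }
    }

    static boolean eval(F f, boolean[] env) {
        switch (f.kind) {
            case GOOD:
            case BAD:  return env[f.var];
            case AND:  return eval(f.left, env) && eval(f.right, env);
            case OR:   return eval(f.left, env) || eval(f.right, env);
            case NOT:  return !eval(f.left, env);
            case TRUE: return true;
            default:   return false; // FALSE
        }
    }

    static boolean containsBad(F f) {
        if (f == null) return false;
        return f.kind == Kind.BAD || containsBad(f.left) || containsBad(f.right);
    }

    // Brute force: a_1(F) implies F for every assignment to n variables.
    static boolean isSound(F f, int n) {
        F a = approx(true, f);
        for (int bits = 0; bits < (1 << n); bits++) {
            boolean[] env = new boolean[n];
            for (int i = 0; i < n; i++) env[i] = ((bits >> i) & 1) == 1;
            if (eval(a, env) && !eval(f, env)) return false;
        }
        return true;
    }
}
```

Since a_1(F) implies F pointwise, validity of the approximation implies validity of the original formula; the converse does not hold in general, so a failed proof of a_1(F) says nothing about F.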

26 Jahob verification system front end, verification condition generator decision procedure dispatcher parsing, type checking, intermediate forms, variable dependencies modular verification methodology method to deploy multiple reasoning techniques three complementary reasoning techniques splitting proof obligations, dispatching each result Isabelle Field constraint analysis Boolean Algebra with Presburger Arithmetic (BAPA) Translation to first-order logic first-order theorem prover MONA decision procedure Presburger Arithmetic decision procedure w/ Charles Bouillaguet w/ Thomas Wies

27 Translation to first-order logic Motivation: FOL provers effective, fully automated –decades of research in resolution, paramodulation –solved open problems (e.g. axiomatization of BAs) Approach: approximate HOL by FOL –substitute, beta-reduce definitions –sets and relations become predicates –flattening, function updates –eliminate tuples –linear arithmetic axioms –approximate otherwise: avoid full encoding (using combinators S, K, or encoding set theory)

28 Encoding types Translated formulas have two types: obj,int Input to resolution-based provers is untyped! Standard solution: types as unary predicates –makes formulas larger, provers much slower Faster solution: omit them! –not sound in general Theorem: Omitting types is sound if –sorts are disjoint, and –sorts have equal cardinality Orders of magnitude speedup

29 Results obtained using first-order provers Instantiable set and relation implementations: –Hash table (120 sec) –Association list (12 sec) –Functional sorted binary search tree (178 sec) –Imperative list (18 sec) Library example (20 sec)

30 Hash table insertion
public void add (Object key, Object value) ...
{
  int hash = compute_hash(key);
  table[hash] = AssocList.cons(key,value, table[hash]);
  mcontent := (old mcontent) ∪ {(key,value)}
  if (size > (4 *table.length)/5) rehash (table.length + table.length);
}
public void rehash (int m) ... ensures “mcontent = old mcontent”
{ AssocList[] t = table; init(m); rehash_aux (0,t); }
private void rehash_aux (int i, AssocList[] t) ...
{ addAll (t[i]); if (j < t.length) rehash_aux (j,t); }
public addAll (AssocList[] pairs) ...
{
  AssocList lst = pairs;
  while inv “...” (!AssocList.is_nil(lst)) {
    Pair p = AssocList.getOne(lst);
    lst = AssocList.remove(p.key, p.value, lst);
    add (p.key, p.value);
  }
}

31 Verifying imperative lists
private Node first;
private ghost specvar con :: obj set;
public specvar lcontent :: obj set;
vardefs lcontent = first.con;
invariant this ≠ null → con = {data} ∪ next.con & ¬ data ∈ next.con;
public void remove(Object x) modifies lcontent ensures lcontent = old lcontent – {x}
[Diagram: list first → 1 → 2 → 3 → 4 linked by next, with con fields {1,2,3,4}, {2,3,4}, {3,4}, {4}; x=3]
Loop searching for 3 must also remove 3 from preceding con fields
During search, invariant defining con temporarily violated
What we really want is something that can express reachability

32 Jahob verification system Field constraint analysis front end, verification condition generator Boolean Algebra with Presburger Arithmetic (BAPA) decision procedure dispatcher parsing, type checking, intermediate forms, variable dependencies modular verification methodology method to deploy multiple reasoning techniques Translation to first-order logic three complementary reasoning techniques first-order theorem prover MONA decision procedure Presburger Arithmetic decision procedure splitting proof obligations, dispatching each result

33 Imperative list using reachability
private static Node first;
public static specvar content :: obj set
vardefs content=={x. x ≠ null ∧ (first,x) ∈ {(a,b). b=next a}* }
invariant tree [next]
invariant ∀ x y. prev x = y → next y = x (almost)
public void remove(Object x) requires n ∈ content modifies content ensures content = old content – {x}
{ if (n==first) root = root.next else n.prev.next = n.next; if (n.next != null) n.next.prev = n.prev; n.next = null; n.prev = null; }
content is dependent variable – no need to update it in remove
reachability expressed directly – not using induction

34 Proving formulas with reachability Reachability properties in trees are decidable –Monadic Second-Order Logic over Trees –existing MONA decision procedure constructs a tree automaton for each formula, checks emptiness of the language of automaton Using simple MONA approximation: can analyze list, tree implementations (left, right fields), but not doubly-linked lists or trees with parent pointers

35 Field constraint analysis Enables reasoning about non-tree fields Can handle broader class of data structures –doubly-linked lists, trees with parent pointers –skip lists [Diagram: next fields form the tree backbone; prev fields are constrained fields] Constrained fields satisfy constraint invariant: ∀ x y. prev y = x → next x = y

36 Elimination of constrained fields field constraint analysis (VMCAI'06): VC1(next,prev) is translated to VC2(next), which is passed to MONA; substitute (prev a = b) with (next b = a) valid: soundness invalid: completeness (for useful class including preservation of field constraints) [Diagram: next fields form the tree backbone; prev fields are constrained fields] Constrained fields satisfy constraint invariant: ∀ x y. prev y = x → next x = y

37 Field constraints: a comparison [Diagram: next fields form the tree backbone; nextSub fields are constrained fields] Constrained fields satisfy constraint invariant: ∀ x y. nextSub x = y → next^+ x y Previous approaches – constraining formula must be deterministic We allow arbitrary constraint formulas – fields need not be uniquely given by backbone

38 Field constraint analysis results Results within Jahob –lists –trees with parent pointer (insertion) –two-level skip list Proved sound and complete* High automation level –no need for specification variable updates Symbolic shape analysis (Thomas Wies) –infers loop invariants

39 Jahob verification system Field constraint analysis front end, verification condition generator Boolean Algebra with Presburger Arithmetic (BAPA) decision procedure dispatcher parsing, type checking, intermediate forms, variable dependencies modular verification methodology method to deploy multiple reasoning techniques Translation to first-order logic three complementary reasoning techniques first-order theorem prover MONA decision procedure Presburger Arithmetic decision procedure splitting proof obligations, dispatching each result

40 BAPA: Sets with cardinality bounds Imposing constraints on abstract content: card(content) = size, 2 card(circulatedBooks) ≤ card(books) [Diagram: list object with first, next and size = 3] size field is consistent with the number of stored objects

41 Boolean Algebra with Presburger Arithmetic Not widely known, but natural extension of BAs I gave first complexity bound (CADE'05, JAR) –quantifier elimination algorithm (as in LICS’03)
S ::= V | S1 ∪ S2 | S1 ∩ S2 | S1 \ S2
T ::= k | C | T1 + T2 | T1 – T2 | C · T | card(S)
A ::= S1 = S2 | S1 ⊆ S2 | T1 = T2 | T1 < T2
F ::= A | F1 ∧ F2 | F1 ∨ F2 | ¬F | ∃S.F | ∃k.F

42 From BAPA to PA If A,B are disjoint, then |A ∪ B| = |A| + |B| Make them disjoint: Venn diagram Reduce set vars to integer vars For quantifiers, use quantifier elimination Preserves alternations ⇒ complexity same as for PA [Venn diagram of sets x, y, z with numbered regions; for example, one region has cardinality |x^c ∩ y ∩ z^c|]
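The reduction from set variables to integer variables can be illustrated on concrete finite sets. In the sketch below each Venn region is identified by a bitmask saying which sets contain it, and the cardinality of any set expression becomes a sum of the integer sizes of the regions it covers. The actual decision procedure works symbolically with one integer variable per region; computing the sizes from given sets, and the names VennRegions, regionSizes, cardOfSet and cardOfUnion, are assumptions made for illustration.

```java
import java.util.List;
import java.util.Set;

final class VennRegions {
    // sizes[mask] = number of universe elements lying in exactly the sets whose bit is set in mask.
    static int[] regionSizes(List<Set<Integer>> sets, Set<Integer> universe) {
        int[] sizes = new int[1 << sets.size()];
        for (int e : universe) {
            int mask = 0;
            for (int i = 0; i < sets.size(); i++) {
                if (sets.get(i).contains(e)) mask |= 1 << i;
            }
            sizes[mask]++;
        }
        return sizes;
    }

    // |S_i| as the sum of all disjoint regions contained in S_i.
    static int cardOfSet(int[] sizes, int i) {
        int total = 0;
        for (int mask = 0; mask < sizes.length; mask++) {
            if ((mask & (1 << i)) != 0) total += sizes[mask];
        }
        return total;
    }

    // |S_i ∪ S_j| as the sum of all regions contained in S_i or S_j.
    static int cardOfUnion(int[] sizes, int i, int j) {
        int total = 0;
        for (int mask = 0; mask < sizes.length; mask++) {
            if ((mask & ((1 << i) | (1 << j))) != 0) total += sizes[mask];
        }
        return total;
    }
}
```

With n set variables there are 2^n regions, which is the source of the exponential number of integer variables mentioned for the quantifier-free case.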

43 Quantifier-free BAPA Previous technique gives NEXPTIME We show it can be done in PSPACE: –analyze resulting integer linear equations exponentially many variables polynomially many equations ⇒ small model property: solutions singly exponential –guess sizes of sets –use alternating PTIME algorithm to check them Real-valued relaxation is NP-complete |x| + |y| = |x\y| + |x\y| + 1 - satisfiable in relaxation

44 Summary of BAPA results Application within Jahob –verified updates to size field –library example: at most ½ books in circulation Observations –clarified that problem is not undecidable (!) –first formalization of algorithm –showed complexity identical to PA –QFBAPA bound from NEXPTIME to PSPACE –QFBAPA fragments in P (with Bruno Marnette) –real-value version of QFBAPA is NP-complete

45 Jahob verification system Field constraint analysis Boolean Algebra with Presburger Arithmetic (BAPA) decision procedure dispatcher parsing, type checking, intermediate forms, variable dependencies modular verification methodology method to deploy multiple reasoning techniques Translation to first-order logic three complementary reasoning techniques first-order theorem prover MONA decision procedure Presburger Arithmetic decision procedure splitting proof obligations, dispatching each result front end, verification condition generator

46 there is more to Jahob verification system Field constraint analysis Boolean Algebra with Presburger Arithmetic (BAPA) decision procedure dispatcher Translation to first-order logic first-order theorem prover MONA decision procedure Presburger Arithmetic decision procedure symbolic shape analysis syntactic loop invariant inference Karen Zee Thomas Wies Isabelle Coq CVC Lite Omega w/ Thomas Wies w/ Charles Bouillaguet Huu Hai Nguyen Charles Karen, Thomas front end, verification condition generator

47 Synergy of reasoning techniques Map implementation ListCollection implementation Association list implementation Library example Translation to first-order logic BAPA Field constraint analysis

48 How Jahob addresses challenges complex heterogeneous data structures, in the context of application; developer-defined properties precision no single approach will work communication with developers scalability modular verification multiple reasoning techniques Isabelle as specification language reduce verification to formulas in logic

49 Verified data structures Lists implementing sets and relations Trees implementing sets and relations List with a cursor (simplified iterator) Hash table Two-level skip list Insertion sort Library benchmark In progress: small game; part of file system

50 Future work Case studies Methodology for encapsulation Inference of specifications, specialized analyses New specification annotations and their power Finer-grained combination techniques Executing and under-approximating formulas –counterexamples for formulas (FSE’05) –testing, run-time checking of specifications –efficient execution of declarative specifications design appropriate specification language

51 Related work Program verification systems –King ’70, Deutsch’73, Suzuki’73, Nelson’81, Guttag, Horning’93 –Good, Akers, Smith ’86: Gypsy –Jones’86: VDM –Abrial, Lee, Neilson, Scharbach, Soerensen’91: B method –Owre, Shankar, Rushby, Stringer-Calvert: PVS –Ahrendt, Baar, Beckert, Giese, Habermalz, Haehnle, Menzel, Schmitt’00: KeY –Foulger, King’01: SPARK Ada –Flanagan, Leino, Lillibridge, Nelson, Saxe, Stata‘02: ESC/Java –Marche, Paulin-Mohring, Urbain’03: Krakatoa –Breunesse, Poll’05: model fields in JML –Barnett, DeLine, Jacobs, Fähndrich, Leino, Schulte, Venter’05: Spec# –Leino, Mueller’06: model fields in Spec#

52 Related work Shape analysis –Larus, Hilfinger’88: detecting conflicts in memory accesses –Hendren, Nicolau ’90: parallelization, connection analysis –Chase, Wegman, Zadeck’90: allocation-site model –Klarlund, Schwartzbach’93: graph types –Deutsch ’94: symbolic bounds on paths –Fradet, Metayer ’97: graph-grammars –Sagiv, Reps, Wilhelm ’99: 3-valued framework –Lev-Ami, Sagiv ’00: TVLA implementation –Moeller, Schwartzbach ’01: PALE based on MONA –Yorsh, Reps, Sagiv ’04: assume/guarantee reasoning for 3VL –McPeak, Necula ’05: local pointer properties –Rugina, Hackett’05: region-based –Lee, Yang, Yi’05: combining three-valued and grammar-based –separation logic

53 Related work Type systems –Freeman, Pfenning ’91: refinement types –Xi, Pfenning ’99: dependent ML, Xi: ATS –Nguyen, David, Qin, Chin’06: size, shape, bag properties Bug finding –Jackson, Vaziri ’00; Dennis, Chang, Jackson’06: finding errors using constraint solving –Xie, Aiken ’05: Saturn – low-level errors –Evans ’94: LCLint –Boyapati, Khurshid, Marinov ’02: imperative specifications –Sen, Marinov, Agha: symbolic execution and random testing

54 Related work Decision procedures and theorem provers –Barrett, Berezin’04: CVC Lite –Detlefs, Nelson, Saxe’03: Simplify –Ball, Lahiri, Musuvathi ’05: Zap –Thatcher, Wright’68: MSOL over finite trees –Klarlund, Moeller, Schwartzbach’00: MONA –Yorsh, Rabinovich, Sagiv, Meyer, Bouajjani’06: reachability logic –BAPA: Feferman, Vaught’59; Zarba’04,’05 –Voronkov’95: Vampire, Weidenbach’01: Spass, Schulz’02: E –Gordon’85: HOL, Pfenning’91: LF, Coquand, Huet’85: Coq –Constable, Allen, Bromley, Cleaveland, Cremer, Harper, Howe, Knoblock, Mendler, Panangaden, Sasaki, Smith’86: NuPRL –Kaufmann, Manolios, Moore ’00: ACL2 –Nipkow, Paulson, Wenzel’02: Isabelle –Translations: Meng, Paulson’06

55 What I did Designed, built (with colleagues) Jahob: a new data structure verification system Modular verification w/ specification variables Addressed a key technical problem: proving validity of expressive formulas Combination technique: split, approximate, decide Three reasoning techniques –translation to first-order logic –field constraint analysis –Boolean Algebra with Presburger Arithmetic Verified: lists, hash tables, trees, client examples

56 Bottom line Can have verified data structures –individual data structures –correlated uses of multiple data structures

