Automatic Verification of Pointer Programs using Grammar-based Shape Analysis Hongseok Yang Seoul National University (Joint Work with Oukseh Lee and Kwangkeun Yi)


Automatic Verification of Pointer Programs using Grammar-based Shape Analysis Hongseok Yang Seoul National University (Joint Work with Oukseh Lee and Kwangkeun Yi)

Automatic Verification of Pointer Programs
Inference of program invariants:
- crucial for automatic verification.
- Difficulty: unboundedly many new heap cells.
    h := nil;
    while (*) {
      x := new(nil, nil);
      if (h = nil) { h := x; }
      else { x->next := h; h->prev := x; h := x; }
    }
[Figure: h pointing to nil, then to doubly linked lists of increasing length.]
Need to "summarize" heap cells.

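Goal: Precise "High-level" Invariants
[Figure: the list reachable from h is summarized by a nonterminal dlist(p), defined by a grammar rule of the form dlist(p) ::= nil | (a cell c whose fields refer to p and dlist(c)).]
Existing technology: Shape analysis [SaReWi96,99].
Idea: Use a grammar to find a good abstraction of each heap object (i.e., cells and their pointers).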

Demo Binomial heap construction: (all pointers to nil are omitted.) Our immediate goal was to handle the binomial heap construction algorithm.

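Structure of Our Analysis
[Figure: analysis of a loop while(B){ C1; C2; C3; C4; }. Abstract execution of the commands takes place in $D^\#$; Normalize maps $D^\#$ to $N_k^\#$; Embed maps $N_k^\#$ back into $D^\#$.]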

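Abstract Domain $D^\#$: Grammar
$D^\# = \mathcal{P}_f(\mathrm{Graph}) \times \mathrm{Grammar} + \{\top\}$
A grammar $R$ is a set of rules of the following form:
- $\alpha(x) = \mathrm{nil} \mid \bigcirc \mid \dots$, where each $\bigcirc$ is a cell whose two fields point to $V_1$ and $V_2$, with $V_1, V_2 \in \{\mathrm{nil}, \mathrm{self}, x, \alpha(\_), \alpha(\mathrm{self}), \alpha(\mathrm{nil}), \alpha(x)\}$
Examples:
- $\mathrm{tree}(x) = \mathrm{nil} \mid \bigcirc$, a cell with both fields pointing to $\mathrm{tree}(\_)$
- $\mathrm{dList}(x) = \mathrm{nil} \mid \bigcirc$, a cell with fields pointing to $x$ and $\mathrm{dList}(\mathrm{self})$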
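One possible way to picture such a grammar as data is sketched below in Python. This is an illustration under stated assumptions, not the representation used by the analysis: the names `Target`, `nt`, `GRAMMAR`, `ALLOWED_ARGS`, and `well_formed` are hypothetical, and a cell case is encoded as a pair of field targets.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Target:
    """A field target V: nil, self, x, or a nonterminal applied to an argument."""
    kind: str                          # "nil", "self", "x", or "nt"
    nonterminal: Optional[str] = None  # set only when kind == "nt"
    arg: Optional[str] = None          # "_", "self", "nil", or "x" when kind == "nt"

NIL, SELF, X = Target("nil"), Target("self"), Target("x")

def nt(name, arg):
    """Nonterminal `name` applied to `arg`, e.g. nt("tree", "_") for tree(_)."""
    return Target("nt", name, arg)

# Each rule maps a nonterminal to its cases: the string "nil" or a cell (V1, V2).
GRAMMAR = {
    "tree": frozenset({"nil", (nt("tree", "_"), nt("tree", "_"))}),
    "dList": frozenset({"nil", (X, nt("dList", "self"))}),
}

ALLOWED_ARGS = {"_", "self", "nil", "x"}

def well_formed(grammar):
    """Every nonterminal used in a cell case is defined and has an allowed argument."""
    for cases in grammar.values():
        for case in cases:
            if case == "nil":
                continue
            for target in case:
                if target.kind == "nt" and (
                    target.nonterminal not in grammar
                    or target.arg not in ALLOWED_ARGS
                ):
                    return False
    return True
```

The frozen dataclass makes targets hashable, so cases can be stored in sets and compared structurally, which is convenient when rules are later merged.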

Abstract Domain $D^\#$: Shape Graph
$D^\# = \mathcal{P}_f(\mathrm{Graph}) \times \mathrm{Grammar} + \{\top\}$
Shape graph: Each "node" is concrete ("a"), annotated with nil ("d"), or annotated with a nonterminal ("c" and "b").
An element (S,G) in $D^\#$ is called an abstract state.
[Figure: stack variables x and y point to heap nodes a and b:dList(a); the fields of a point to c:tree(_) and d:nil.]

Normalized Abstract Domain $N_k^\#$
Idealized version of normalization:
1. Group nodes according to heap objects;
2. Compute the best grammar that describes each group;
3. Ensure that each shape graph doesn't use more than k nodes.
Example: [Figure: normalize turns shape graphs rooted at x, with concrete nodes a and b and nil-annotated nodes c, d, e, into graphs where x points to a:nil or a:tree(_), together with a rule for tree.]
$N_k^\#$ ($\subseteq D^\#$) consists of normalized abstract states.
- The actual definition of $N_k^\#$ is not algorithmic.

Definition of Analysis
Analysis of programs without loops:
- Forward analysis $[\![C]\!] : D^\# \to D^\#$
- Case pruning $[\![B]\!] : D^\# \to D^\#$
$[\![\mathrm{while}\ B\ C]\!]\, A = \mathrm{Fix}_{A \sqsubseteq} F = \bigsqcup_n F^n(\mathrm{normalize}(A))$
- $F : N_k^\# \to N_k^\#$
- $F = \lambda A'.\ \mathrm{normalize}(A' \sqcup [\![B]\!]([\![C]\!] A'))$
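The loop rule can be read as a plain fixpoint iteration. The Python sketch below is a toy instance, not the analysis itself: the abstract domain is a set of list lengths capped at a bound `K`, and the names `fix`, `analyze_loop`, `normalize`, `body`, `prune`, and `join` are all hypothetical. It applies pruning before the loop body, which is an assumption about the intended order of composition.

```python
def fix(f, start):
    """Iterate f from start until the value stops changing."""
    current = start
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt

def analyze_loop(a, body, prune, join, normalize):
    """Compute the limit of F^n(normalize(a)), with F = normalize(s join body(prune(s)))."""
    step = lambda s: normalize(join(s, body(prune(s))))
    return fix(step, normalize(a))

# Toy finite domain (an assumption of this sketch): sets of list lengths,
# where every length above K is summarized by the single token MANY.
K = 2
MANY = "many"

def normalize(s):
    return frozenset(n if n == MANY or n <= K else MANY for n in s)

def body(s):
    # The loop body adds one cell to the list.
    return frozenset(MANY if n == MANY else n + 1 for n in s)

def join(s, t):
    return s | t

def prune(s):
    # The loop condition is the nondeterministic (*), so nothing is pruned.
    return s

inv = analyze_loop(frozenset({0}), body, prune, join, normalize)
```

Because the normalized domain has only finitely many elements and each step can only add elements, the iteration stabilizes after a bounded number of rounds.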

Doubly Linked List Construction
    h := nil;
    while (*) {
      var x;
      x := new;
      if (h = nil) {
        h := x;
      } else {
        x->next := h;
        h->prev := x;
        h := x
      }
    }
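To make the target property concrete, the following Python sketch runs the construction loop for a fixed number of iterations and checks the doubly-linked-list shape of the resulting heap. The names `Cell`, `build`, `is_dlist`, and `length` are hypothetical, and the nondeterministic condition `(*)` is replaced by an explicit iteration count.

```python
class Cell:
    """A heap cell with two pointer fields, as created by `new`."""
    def __init__(self):
        self.prev = None
        self.next = None

def build(iterations):
    """Run the loop body `iterations` times and return the head pointer h."""
    h = None                      # h := nil
    for _ in range(iterations):   # stands in for while (*)
        x = Cell()                # x := new
        if h is None:
            h = x
        else:
            x.next = h
            h.prev = x
            h = x
    return h

def is_dlist(h):
    """Check that h heads a nil-terminated doubly linked list."""
    prev, cur = None, h
    while cur is not None:
        if cur.prev is not prev:
            return False
        prev, cur = cur, cur.next
    return True

def length(h):
    """Count the cells reachable from h through next."""
    n = 0
    while h is not None:
        n += 1
        h = h.next
    return n
```

Such a test only covers finitely many list lengths. The point of the analysis is to establish the shape property for every number of iterations at once.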

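Inferred Loop Invariant
Inferred abstract state (i.e., shape-graph set and grammar):
- Shape graph: h points to $a : \alpha(\_)$
- $\alpha(x) = \mathrm{nil} \mid \bigcirc$, a cell with prev pointing to nil and next pointing to $\beta(\mathrm{self})$
- $\beta(x) = \mathrm{nil} \mid \bigcirc$, a cell with prev pointing to $x$ and next pointing to $\beta(\mathrm{self})$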

3rd Iteration Step
Abstract state $A_2$ after the 2nd iteration:
[Figure: shape graph with h pointing to $a : \alpha(\_)$, and the grammar for $\alpha$ and $\beta$ reached after two iterations.]
Inferred invariant $A$: $A = \mathrm{normalize}(A_2 \sqcup [\![\mathrm{LoopBody}]\!](A_2))$
[Figure: shape graph with h pointing to $a : \alpha(\_)$, and the grammar of the inferred loop invariant.]

Computation of $A_2 \sqcup [\![\mathrm{LoopBody}]\!](A_2)$
1. Unroll $\alpha$.
2. Prune cases.
3. "Execute".
4. Join results.
5. Collect "garbage".
[Figure: abstract execution of x:=new and of the true and false branches of if(h=nil)... on $A_2$, showing the shape graphs obtained in each branch.]

Normalization 1: Identify Heap Objects
Identify data structures, and express them by nonterminals.
[Figure, built up over four frames: in each shape graph of the joined state, the concrete cells reachable from h are folded, one heap object at a time, into fresh nonterminals whose rules have a single cell case, until h points to a single node annotated with a nonterminal.]

Normalization 2: Unify Similar Shape Graphs
Roughly, two shape graphs are similar iff they coincide except for the nonterminals they use.
[Figure: the similar shape graphs, each with h pointing to one node annotated with a nonterminal, are unified into a single graph, and the rules of the unified nonterminals are combined into one rule with several cases.]
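The similarity test can be sketched in Python as follows. This is a simplified reading, not the definition used by the analysis: a shape graph is encoded as a pair of dictionaries, node names are assumed to match already, and only the nonterminal name (not its argument) is ignored. The names `erase` and `similar` and the encoding of node values are hypothetical.

```python
def erase(graph):
    """Drop nonterminal names from a shape graph, keeping everything else.

    A graph is (stack, heap): stack maps variables to nodes, and heap maps a
    node to ("nil",), ("cell", n1, n2), or ("nt", name, arg).
    """
    stack, heap = graph
    erased = {}
    for node, value in heap.items():
        if value[0] == "nt":
            erased[node] = ("nt", value[2])  # forget the name, keep the argument
        else:
            erased[node] = value
    return (stack, erased)

def similar(g1, g2):
    """Two graphs are similar if they coincide once nonterminal names are erased."""
    return erase(g1) == erase(g2)

# Example graphs: h points to a node annotated with different nonterminals.
g1 = ({"h": "a"}, {"a": ("nt", "gamma", "_")})
g2 = ({"h": "a"}, {"a": ("nt", "epsilon", "_")})
# Here h points to a concrete cell instead.
g3 = ({"h": "a"}, {"a": ("cell", "b", "c"), "b": ("nil",), "c": ("nil",)})
```

With this encoding, `similar(g1, g2)` holds while `similar(g1, g3)` does not, which matches the intuition that identifying heap objects first makes more graphs similar.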

Normalization 3: Collect Garbage
Eliminate the definitions of unused nonterminals from the grammar.
[Figure: after unification, the nonterminals that no longer occur in the shape graph or in the remaining rules are marked as not used, and their definitions are removed.]
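Removing unused definitions amounts to a reachability computation over the grammar. The Python sketch below assumes a lightweight encoding chosen for this illustration: a case is the string "nil" or a pair of field targets, and a nonterminal target is the tuple ("nt", name, arg). The function name `collect_garbage` and the rule names are hypothetical.

```python
def collect_garbage(grammar, roots):
    """Keep only the rules reachable from the nonterminals named in `roots`.

    `roots` are the nonterminals that occur in the shape graphs.
    """
    live, work = set(), list(roots)
    while work:
        name = work.pop()
        if name in live or name not in grammar:
            continue
        live.add(name)
        for case in grammar[name]:
            if case == "nil":
                continue
            for target in case:
                if isinstance(target, tuple) and target[0] == "nt":
                    work.append(target[1])
    return {name: rule for name, rule in grammar.items() if name in live}

# Example: gamma is defined but not reachable from alpha.
G = {
    "alpha": {"nil", ("nil", ("nt", "beta", "self"))},
    "beta": {"nil", ("x", ("nt", "beta", "self"))},
    "gamma": {("nil", "nil")},
}
cleaned = collect_garbage(G, {"alpha"})
```

Rules that are only mentioned by other dead rules are dropped as well, since liveness is propagated from the shape graphs only.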

Normalization 4: Simplify the Grammar
Regard $\beta(x)$ and nil as the same. Combine "same" cases and "same" definitions.
[Figure, built up over four frames: cases that become the "same" under this identification are merged, then definitions that have become the "same" are merged, leaving a grammar with one rule per remaining nonterminal for the shape graph in which h points to $a : \alpha(\_)$.]

Summary
1. "Execute" the loop body abstractly: $[\![\mathrm{LoopBody}]\!] A_2$
2. Join the old and new values: $A_2 \sqcup [\![\mathrm{LoopBody}]\!] A_2$
3. Normalize the obtained abstract state:
   1. For each shape graph, identify heap objects and express them using nonterminals.
   2. Unify similar shape graphs.
   3. Remove the definitions of unused nonterminals.
   4. Simplify the grammar.

Correctness
The meaning of each abstract state (G,R) is given by an assertion "trans(G,R)" in separation logic.
Correctness theorem: If $[\![C]\!](G,R) = (G',R')$, then $\{\mathrm{trans}(G,R)\}\ C\ \{\mathrm{trans}(G',R')\}$ is derivable in separation logic.
Termination: Since the domain $N_k^\#$ is finite, the analysis terminates.

Conclusion
Presented an analysis that infers loop invariants of complex pointer programs.
The key idea is to use a grammar to describe the structure of a heap object (i.e., data structure).
Future work:
1. Develop a systematic reusable framework.
2. Handle data structures with more extensive sharing.
   - dags and trees with linked leaf nodes, etc.
3. Prove a property that relates the input and output states.
   - SW recovers link fields to their original values.

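Inferred Loop Invariant
Inferred shape-graph set and grammar:
- Shape graph: h points to $a : \alpha(\_)$
- $\alpha(x) = \mathrm{nil} \mid \bigcirc$, a cell with prev pointing to nil and next pointing to $\beta(\mathrm{self})$
- $\beta(x) = \mathrm{nil} \mid \bigcirc$, a cell with prev pointing to $x$ and next pointing to $\beta(\mathrm{self})$
Representation by an assertion:
$\mathrm{letrec}\ \alpha(a,x) = (\mathrm{emp} \wedge a = \mathrm{nil}) \vee \exists b.\ (a \mapsto \mathrm{nil}, b) * \beta(b,a)$
$\qquad\ \ \beta(a,x) = (\mathrm{emp} \wedge a = \mathrm{nil}) \vee \exists b.\ (a \mapsto x, b) * \beta(b,a)$
$\mathrm{in}\ \exists a.\ h = a \wedge \forall x.\ \alpha(a,x)$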

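Abstract Domain $D^\#$
$D^\# = \mathcal{P}_f(\mathrm{Graph}) \times \mathrm{Grammar} + \{\top\}$
Shape graph: Each "node" can be concrete ("a"), annotated with nil ("d"), or annotated with a nonterminal ("c" and "b").
[Figure: stack variables x and y point to heap nodes a and b:dList(a); the fields of a point to c:tree(_) and d:nil.]
Semantics by separation-logic assertions:
$\exists a\,b\,c\,d.\ (x = a \wedge y = b) \wedge ((\forall y.\ \mathrm{tree}(c,y)) * (a \mapsto c,d) * (d = \mathrm{nil} \wedge \mathrm{emp}) * \mathrm{dList}(b,a))$
Formal definition:
- $\mathrm{Graph} = (\mathrm{Var} \to_{fin} \mathrm{SymL}) \times (\mathrm{SymL} \to_{fin} \mathrm{Val})$
- $\mathrm{Val} = \{\mathrm{nil}, \langle a,b \rangle, \alpha(a), \alpha() \mid a,b \in \mathrm{SymL},\ \alpha \in \mathrm{NonTerm}\}$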

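Grammar
A grammar $R$ is a set of rules of the following form:
- $\alpha(x) = \mathrm{nil} \mid \bigcirc \mid \dots$, where each $\bigcirc$ is a cell whose two fields point to $V_1$ and $V_2$, with $V_1, V_2 \in \{\mathrm{nil}, \mathrm{self}, x, \alpha(\_), \alpha(\mathrm{self}), \alpha(\mathrm{nil})\}$
Examples:
- $\mathrm{tree}(x) = \mathrm{nil} \mid \bigcirc$, a cell with both fields pointing to $\mathrm{tree}(\_)$
- $\mathrm{dList}(x) = \mathrm{nil} \mid \bigcirc$, a cell with fields pointing to $x$ and $\mathrm{dList}(\mathrm{self})$
Semantics by separation-logic assertions:
$\mathrm{tree}(c,x) = (c = \mathrm{nil} \wedge \mathrm{emp}) \vee \exists l\,r.\ (c \mapsto l,r) * (\forall y.\ \mathrm{tree}(l,y)) * (\forall y.\ \mathrm{tree}(r,y))$
$\mathrm{dList}(c,p) = (c = \mathrm{nil} \wedge \mathrm{emp}) \vee \exists n.\ (c \mapsto p,n) * \mathrm{dList}(n,c)$
Formal definition:
- $\mathrm{Grammar} = \mathrm{NonTerm} \to_{fin} \mathcal{P}_{nf}(\{\mathrm{nil}\} + \mathrm{Case} \times \mathrm{Case})$
- $\mathrm{Case} = \{\mathrm{nil}, \mathrm{self}, \mathrm{arg}, \alpha(\_), \alpha(\mathrm{arg}), \alpha(\mathrm{self}) \mid \alpha \in \mathrm{NonTerm}\}$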
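The two example predicates can be checked against a concrete heap with a few lines of Python. This is a sketch under simplifying assumptions: the heap is a dictionary from addresses to field pairs, nil is `None`, and each checker returns the set of cells it owns (or `None` on failure) so that the separating conjunction can be approximated by requiring that no cell is visited twice. The function names `dlist` and `tree` are hypothetical.

```python
def dlist(heap, c, p):
    """Return the cells owned by dList(c, p), or None if the predicate fails."""
    owned = set()
    while c is not None:
        if c in owned or c not in heap:
            return None           # cycle or dangling pointer
        first, second = heap[c]
        if first != p:
            return None           # the first field must point to the previous cell
        owned.add(c)
        p, c = c, second          # continue with dList(second, c)
    return owned

def tree(heap, c, seen=None):
    """Return the cells owned by tree(c), or None if the predicate fails."""
    seen = set() if seen is None else seen
    if c is None:
        return seen
    if c in seen or c not in heap:
        return None               # sharing, cycle, or dangling pointer
    seen.add(c)
    left, right = heap[c]
    if tree(heap, left, seen) is None:
        return None
    return tree(heap, right, seen)

# Example heaps: a three-cell doubly linked list and a three-cell tree.
list_heap = {1: (None, 2), 2: (1, 3), 3: (2, None)}
tree_heap = {1: (2, 3), 2: (None, None), 3: (None, None)}
dag_heap = {1: (2, 2), 2: (None, None)}
```

For a whole-heap assertion one would additionally compare the returned set with the set of all allocated cells, since separation logic assertions describe the heap exactly.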

Normalized Abstract Domain $N^\#$
$N^\#$ consists of abstract states (G,R) in $D^\#$ s.t.
1. All "data structures" are expressed by nonterminals.
2. All "similar" shape graphs and rules are merged.
[Figure: example shape graphs over x and y, with nodes such as a, c:nil, d:nil, e:nil, and b annotated with a nonterminal, together with the grammar rules they use.]