Learning Symbolic Interfaces of Software Components Zvonimir Rakamarić.


1 Learning Symbolic Interfaces of Software Components Zvonimir Rakamarić

2 This Work
- Published at the Static Analysis Symposium (SAS) 2012
- Joint work with Dimitra Giannakopoulou (NASA) and Vishwanath Raman (CMU/NASA)

3 Introduction

4 Motivating Example
    class Example {
        private static int x = 0;
        private static int y = 0;
        public static void init(int p, int q) { x = p; y = q; }
        public static void a() { if (x == 0) y = 10; else y = 11; }
        public static void b() { if (y == 10) assert false; }
    }
- init can be called unconditionally
- a can be called unconditionally
- b can be called after init only when y != 10
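The three observations on this slide can be exercised with concrete values. The following is a minimal sketch, not part of the original talk: the class name `MotivatingDriver` and its helper methods are hypothetical, and `assert false` is replaced with an explicit exception (an assumption) so the failure does not depend on running the JVM with `-ea`.

```java
// Hypothetical driver; the component body is copied from the slide.
class MotivatingDriver {
    private static int x = 0;
    private static int y = 0;

    static void init(int p, int q) { x = p; y = q; }

    static void a() { if (x == 0) y = 10; else y = 11; }

    // Stands in for `assert false` (assumption: a thrown exception counts as the error).
    static void b() {
        if (y == 10) throw new IllegalStateException("b() called while y == 10");
    }

    /** Runs init(p, q); b() and reports whether the sequence completed without error. */
    static boolean initThenB(int p, int q) {
        try {
            init(p, q);
            b();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Runs init(p, q); a(); b() and reports whether the sequence completed without error. */
    static boolean initThenAThenB(int p, int q) {
        try {
            init(p, q);
            a();
            b();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(initThenB(0, 5));       // legal: y is 5 when b() runs
        System.out.println(initThenB(0, 10));      // illegal: y is 10 when b() runs
        System.out.println(initThenAThenB(0, 5));  // illegal: a() sets y to 10 because x == 0
        System.out.println(initThenAThenB(1, 10)); // legal: a() sets y to 11 because x != 0
    }
}
```

Whether `b` is legal depends on the values passed to `init`, which is the situation a purely sequence-based interface cannot describe.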

5 Goal
- Learn temporal interfaces of software components
  - Legal and illegal sequences of method calls, defined as an automaton
- Why?
  - Documentation
  - Reverse engineering
  - Model-based testing
  - Regression testing
  - Compositional verification
  - …

6 Limitations of Prior Approaches
- Since method b in Example cannot be called unconditionally after init, prior approaches either
  - consider calling b after init an error, regardless of the values of the parameters it depends on, or
  - expect init to be manually partitioned

7 Our Contribution class Example {... }

8 Background

9 Symbolic Execution
- Key idea: execute programs using symbolic input values instead of concrete data
- Concrete vs. symbolic
  - Concrete execution: the program takes only one path, determined by the input values
  - Symbolic execution: the program can take any feasible path (coverage!)
- Limited by the power of the constraint solver
- Scalability issues when faced with a large (exponential) number of paths: path explosion

10 Symbolic Program State
- Symbolic values of program variables
- Path condition (PC)
  - Logical formula over symbolic inputs
  - Accumulates the constraints that inputs have to satisfy for the particular path to be executed
  - If a path is feasible, its PC is satisfiable
- Program location

11 Symbolic Execution Tree
- Characterizes the execution paths constructed during symbolic execution
- Nodes are symbolic program states
- Edges are labeled with program transitions

12 Example
    int x, y;
    if (x > y) {
        x = x + y;
        y = x - y;
        x = x - y;
        if (x > y)
            assert false;
    }

13 [Symbolic execution tree for the example on the previous slide; each node shows the symbolic values and the path condition]
x:X, y:Y  PC: true
  [x > y is true]  x:X, y:Y  PC: X>Y
    after x = x + y:  x:X+Y, y:Y  PC: X>Y
    after y = x - y:  x:X+Y, y:X  PC: X>Y
    after x = x - y:  x:Y, y:X  PC: X>Y
      [x > y is true]  x:Y, y:X  PC: X>Y ∧ Y>X  (UNSAT, so assert false is unreachable)
      [x > y is false]  x:Y, y:X  PC: X>Y ∧ Y<=X  (SAT)
  [x > y is false]  x:X, y:Y  PC: X<=Y  (SAT)
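The infeasibility of the assertion path can be cross-checked without a constraint solver. The sketch below is not symbolic execution: it substitutes exhaustive concrete enumeration over a small input range (an assumption made for brevity) and records which paths are actually reached. The class and method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Brute-force cross-check of the symbolic execution tree for the swap example.
class SwapPaths {
    /** Executes the slide's program on concrete inputs and names the path taken. */
    static String path(int x, int y) {
        if (x > y) {
            x = x + y;
            y = x - y;
            x = x - y; // x and y are now swapped
            if (x > y) {
                return "X>Y && Y>X (assert false)";
            }
            return "X>Y && Y<=X";
        }
        return "X<=Y";
    }

    /** Collects every path reached for inputs in [-bound, bound] x [-bound, bound]. */
    static Set<String> reachedPaths(int bound) {
        Set<String> reached = new TreeSet<>();
        for (int x = -bound; x <= bound; x++) {
            for (int y = -bound; y <= bound; y++) {
                reached.add(path(x, y));
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        // The assertion path is never reported, consistent with its UNSAT path condition.
        System.out.println(reachedPaths(20));
    }
}
```

Enumeration only covers the sampled range, whereas the UNSAT verdict of a solver covers all inputs; that difference is the reason symbolic execution is used in the first place.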

14 Active Automata Learning
- D. Angluin, 1987: “Learning Regular Sets from Queries and Counterexamples”
- The algorithm is called L*
- L* learns an unknown regular language U (over alphabet Σ) and produces a minimal DFA A such that L(A) = U
- Complexity of the original algorithm is O(|Σ| · |A|^3)

15 Active Automata Learning cont.
- The L* learner communicates with a teacher using two types of queries
  - Membership queries: should word w be included in L(A)?
    - Expected answer: yes/no
  - Equivalence queries: here is a conjectured DFA A; is L(A) = U?
    - Expected answer: yes, or no plus a counterexample
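The two query types can be written down as a small interface. This is a hedged sketch rather than the API of any L* library: `Teacher`, `EvenAsTeacher`, and their methods are hypothetical names, the conjecture is passed as a predicate standing in for DFA acceptance, and the equivalence check only inspects words up to a fixed length, so it is approximate.

```java
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical teacher interface mirroring the two L* query types.
interface Teacher {
    /** Membership query: is word w in the unknown language U? */
    boolean isMember(String w);

    /** Equivalence query: empty if no mismatch is found, otherwise a counterexample word. */
    Optional<String> findCounterexample(Predicate<String> conjecture);
}

// Toy teacher for U = words over {a, b} containing an even number of a's.
// Assumption: equivalence is checked only on words of length <= maxLen.
class EvenAsTeacher implements Teacher {
    private final int maxLen;

    EvenAsTeacher(int maxLen) { this.maxLen = maxLen; }

    @Override
    public boolean isMember(String w) {
        long count = w.chars().filter(c -> c == 'a').count();
        return count % 2 == 0;
    }

    @Override
    public Optional<String> findCounterexample(Predicate<String> conjecture) {
        return search("", conjecture);
    }

    // Depth-first enumeration of all words up to maxLen, stopping at the first mismatch.
    private Optional<String> search(String w, Predicate<String> conjecture) {
        if (conjecture.test(w) != isMember(w)) return Optional.of(w);
        if (w.length() == maxLen) return Optional.empty();
        for (char c : new char[] {'a', 'b'}) {
            Optional<String> cex = search(w + c, conjecture);
            if (cex.isPresent()) return cex;
        }
        return Optional.empty();
    }
}
```

For instance, a conjecture that accepts every word is refuted by a word with a single `a`, while a conjecture that agrees with `isMember` yields no counterexample within the length bound.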

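16 [Diagram: the L* learner sends a word w to the teacher and receives yes/no; it sends a conjectured DFA A and receives yes, or no plus a counterexample (cex); the output is DFA A]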

17 PSYCO Algorithm

18 Interface Learning with L*
- L* uses a teacher to answer the following queries
  - Membership queries: does a given sequence of method calls lead to an error in the implementation?
  - Equivalence queries: does a conjectured DFA capture all the behaviors of the implementation?

19 Answering Membership Queries
- L* uses a teacher to answer the following queries
  - Membership queries: does a given sequence of method calls lead to an error in the implementation?
  - Equivalence queries: does a conjectured DFA capture all the behaviors of the implementation?

20 Running Example
    class Example {
        private static int x = 0;
        private static int y = 0;
        public static void init(int p, int q) { x = p; y = q; }
        public static void a() { if (x == 0) y = 10; else y = 11; }
        public static void b() { if (y == 10) assert false; }
    }

21 Executing query
    class Example {
        private static int x = 0;
        private static int y = 0;
        public static void init(int p, int q) { x = p; y = q; }
        public static void a() { if (x == 0) y = 10; else y = 11; }
        public static void b() { if (y == 10) assert false; }
    }
Symbolic execution tree for the query:
  p:P, q:Q  PC: true
    x:P, y:Q  PC: true
      OK     PC: Q != 10
      ERROR  PC: Q == 10
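The tree ends in both an OK leaf and an ERROR leaf, so the answer to the query depends on the parameters. The sketch below classifies a query as always OK, always ERROR, or mixed. It substitutes sampling of concrete parameter values for symbolic execution (an assumption made to keep the example solver-free), and the class, enum, and method names are hypothetical.

```java
// Hypothetical query classifier; sampling stands in for symbolic execution.
class QueryOutcome {
    enum Verdict { OK, ERROR, MIXED }

    private static int x = 0;
    private static int y = 0;

    static void init(int p, int q) { x = p; y = q; }

    static void a() { if (x == 0) y = 10; else y = 11; }

    // Returns false where the slide's b() would hit `assert false`.
    static boolean b() { return y != 10; }

    /** Runs init(p, q) followed by the given calls ("a" or "b"); true if no error occurs. */
    static boolean run(int p, int q, String... calls) {
        init(p, q);
        for (String call : calls) {
            if (call.equals("a")) {
                a();
            } else if (call.equals("b") && !b()) {
                return false;
            }
        }
        return true;
    }

    /** Classifies the query over all sampled (p, q) in [-range, range] x [-range, range]. */
    static Verdict classify(int range, String... calls) {
        boolean sawOk = false;
        boolean sawError = false;
        for (int p = -range; p <= range; p++) {
            for (int q = -range; q <= range; q++) {
                if (run(p, q, calls)) sawOk = true; else sawError = true;
            }
        }
        if (sawOk && sawError) return Verdict.MIXED;
        return sawOk ? Verdict.OK : Verdict.ERROR;
    }

    public static void main(String[] args) {
        System.out.println(classify(12, "b"));      // outcome depends on q
        System.out.println(classify(12, "a"));      // never fails
        System.out.println(classify(12, "a", "b")); // outcome depends on p
    }
}
```

A mixed verdict is the case that a yes/no membership answer cannot express, which motivates splitting the method on its parameters.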


24 Refinement: Split init
    public static void init(int p, int q) { x = p; y = q; }
    public static void init_0(int p, int q) { assume q != 10; init(p, q); }
    public static void init_1(int p, int q) { assume q == 10; init(p, q); }
Symbolic execution tree for the query:
  p:P, q:Q  PC: true
    x:P, y:Q  PC: true
      OK     PC: Q != 10
      ERROR  PC: Q == 10
New alphabet symbols:
  init_0 := init[q != 10]
  init_1 := init[q == 10]
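Plain Java has no `assume` statement, so the split can be approximated as follows. This is a sketch under stated assumptions: `assume` is modeled by a hypothetical `AssumptionViolated` exception, the failing assertion in `b` is modeled by a boolean result, and only a sampled range of `q` values is tried.

```java
// Hypothetical encoding of the split of init into init_0 and init_1.
class SplitInit {
    static class AssumptionViolated extends RuntimeException {}

    private static int x = 0;
    private static int y = 0;

    // Models `assume`: inputs that violate the guard are discarded, not reported as errors.
    static void assume(boolean condition) {
        if (!condition) throw new AssumptionViolated();
    }

    static void init(int p, int q) { x = p; y = q; }

    // init_0 := init[q != 10]
    static void init_0(int p, int q) { assume(q != 10); init(p, q); }

    // init_1 := init[q == 10]
    static void init_1(int p, int q) { assume(q == 10); init(p, q); }

    // Returns false where the slide's b() would hit `assert false`.
    static boolean b() { return y != 10; }

    /** Counts sampled q in [-range, range] for which init_k(0, q); b() ends in an error. */
    static int errorsAfter(int k, int range) {
        int errors = 0;
        for (int q = -range; q <= range; q++) {
            try {
                if (k == 0) init_0(0, q); else init_1(0, q);
            } catch (AssumptionViolated e) {
                continue; // input excluded by the guard
            }
            if (!b()) errors++;
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(errorsAfter(0, 20)); // errors seen after init_0
        System.out.println(errorsAfter(1, 20)); // errors seen after init_1
    }
}
```

After the split, each new symbol gives a parameter-independent answer for the query: calling `b` after `init_0` never fails, and calling `b` after `init_1` always fails.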

25 Restart Learning
    public static void init(int p, int q) { x = p; y = q; }
    public static void init_0(int p, int q) { assume q != 10; init(p, q); }
    public static void init_1(int p, int q) { assume q == 10; init(p, q); }
- New learner alphabet: {init_0, init_1, a, b}
- Learning restarts, re-using results from previous iterations
Symbolic execution tree for the query:
  p:P, q:Q  PC: true
    x:P, y:Q  PC: true
      OK     PC: Q != 10
      ERROR  PC: Q == 10

26 Executing query
    class Example {
        private static int x = 0;
        private static int y = 0;
        public static void init_0(int p, int q) { assume q != 10; x = p; y = q; }
        public static void a() { if (x == 0) y = 10; else y = 11; }
        public static void b() { if (y == 10) assert false; }
    }
Symbolic execution tree for the query:
  p:P, q:Q  PC: true
    x:P, y:Q  PC: true
      x:P, y:10  PC: P = 0
        ERROR  PC: P = 0
      x:P, y:11  PC: P != 0
        OK     PC: P != 0


31 Refinement: Split init_0
    public static void init_0(int p, int q) { assume q != 10; x = p; y = q; }
    public static void init_0_0(int p, int q) { assume p == 0 && q != 10; init(p, q); }
    public static void init_0_1(int p, int q) { assume p != 0 && q != 10; init(p, q); }
Symbolic execution tree for the query:
  p:P, q:Q  PC: true
    x:P, y:Q  PC: true
      x:P, y:10  PC: P = 0
        ERROR  PC: P = 0
      x:P, y:11  PC: P != 0
        OK     PC: P != 0
New alphabet symbols:
  init_0_0 := init[q != 10 && p == 0]
  init_0_1 := init[q != 10 && p != 0]

32 Restart Learning
    public static void init_0(int p, int q) { assume q != 10; x = p; y = q; }
    public static void init_0_0(int p, int q) { assume p == 0 && q != 10; init(p, q); }
    public static void init_0_1(int p, int q) { assume p != 0 && q != 10; init(p, q); }
- New learner alphabet: {init_0_0, init_0_1, init_1, a, b}
- Learning restarts
Symbolic execution tree for the query:
  p:P, q:Q  PC: true
    x:P, y:Q  PC: true
      x:P, y:10  PC: P = 0
        ERROR  PC: P = 0
      x:P, y:11  PC: P != 0
        OK     PC: P != 0

33 Answering Equivalence Queries
- L* uses a teacher to answer the following queries
  - Membership queries: does a given sequence of method calls lead to an error in the implementation?
  - Equivalence queries: does a conjectured DFA capture all the behaviors of the implementation?

34 Unbounded Loops in Conjectures
- Components have no loops, but conjectures do!
- We unroll unbounded loops in conjectures a bounded number of times

35 Answering Equivalence Queries
- Walk the conjectured automaton and extract
  - all legal method sequences to a given depth k
  - all illegal method sequences
    - for each illegal sequence of depth n, extract the legal sequence of depth n - 1
- We then use membership queries to check the outcome of each sequence
- If a sequence is misclassified by the learner, we have a counterexample for L*
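The idea of answering an equivalence query with bounded membership checks can be sketched as follows. This is a simplification, not the PSYCO procedure: it compares the conjecture and the component on every call sequence up to depth k instead of extracting legal and illegal sequences separately, and it restricts the component to the parameterless methods `a` and `b` starting from x = y = 0. All class, method, and state names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical bounded equivalence check over the alphabet {a, b}.
class BoundedEquivalence {
    static final List<String> ALPHABET = List.of("a", "b");

    /** Conjecture in which b becomes illegal once a has been called. */
    static Map<String, Map<String, String>> correctConjecture() {
        return Map.of(
            "s0", Map.of("a", "s1", "b", "s0"),
            "s1", Map.of("a", "s1"));
    }

    /** Conjecture that wrongly treats every sequence as legal. */
    static Map<String, Map<String, String>> permissiveConjecture() {
        return Map.of("s0", Map.of("a", "s0", "b", "s0"));
    }

    /** Walks the DFA from s0; a missing transition means the call is illegal. */
    static boolean legalInConjecture(Map<String, Map<String, String>> dfa, List<String> seq) {
        String state = "s0";
        for (String call : seq) {
            Map<String, String> out = dfa.get(state);
            state = (out == null) ? null : out.get(call);
            if (state == null) return false;
        }
        return true;
    }

    /** Membership oracle: concrete execution of a and b from x = 0, y = 0. */
    static boolean legalInComponent(List<String> seq) {
        int x = 0;
        int y = 0;
        for (String call : seq) {
            if (call.equals("a")) {
                y = (x == 0) ? 10 : 11;
            } else if (call.equals("b") && y == 10) {
                return false;
            }
        }
        return true;
    }

    /** Returns a misclassified sequence of length <= k, or empty if none exists. */
    static Optional<List<String>> counterexample(Map<String, Map<String, String>> dfa, int k) {
        return search(new ArrayList<>(), dfa, k);
    }

    private static Optional<List<String>> search(
            List<String> seq, Map<String, Map<String, String>> dfa, int k) {
        if (legalInConjecture(dfa, seq) != legalInComponent(seq)) {
            return Optional.of(List.copyOf(seq));
        }
        if (seq.size() == k) return Optional.empty();
        for (String call : ALPHABET) {
            seq.add(call);
            Optional<List<String>> cex = search(seq, dfa, k);
            seq.remove(seq.size() - 1);
            if (cex.isPresent()) return cex;
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(counterexample(correctConjecture(), 4));
        System.out.println(counterexample(permissiveConjecture(), 2));
    }
}
```

The number of sequences grows exponentially with k, which is consistent with the talk's remark that equivalence queries are a potential bottleneck.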

36 Running Example: Depth is 2
    class Example {
        private static int x = 0;
        private static int y = 0;
        public static void init(int p, int q) { x = p; y = q; }
        public static void a() { if (x == 0) y = 10; else y = 11; }
        public static void b() { if (y == 10) assert false; }
    }

37 Running Example: Depth is 3

38 Implementation and Experiments

39 Architecture of PSYCO

40 Implementation of PSYCO
- Implemented on top of the Java PathFinder (JPF) software model checking infrastructure: http://babelfish.arc.nasa.gov/trac/jpf
- PSYCO-related modules
  - jpf-psyco: interface generation for Java classes, including parameters
    - uses jpf-learn and jpf-jdart
  - jpf-learn: implements L*
  - jpf-jdart: symbolic execution in JPF
    - actually DART/concolic

41 Experiments
Example            Methods  k-max  k-min  Conjectures  Refinements  Alphabet  States
Signature                5      7      2            2            0         5       4
PipedOutputStream        4      7      2            2            1         5       3
IntMath                  8      1      1            1            7        16       3
AltBit                   2     27      4            8            3         5       5
CEV-FlightRule           3      3      3            3            2         5       3
CEV                     18      3      3           10            6        24       9
- k-max is the maximum exploration depth reached in one hour
- k-min is the depth at which we realized the expected interface
- Automata do not change between k-min and k-max, and are k-max-full

42 Summary

43
- Combined automata learning and symbolic techniques for temporal interface generation
- Generating richer interfaces with symbolic method guards
- Implemented a prototype tool in Java PathFinder
- Works well on realistic examples
- Equivalence queries are a potential bottleneck

44 Our Contribution cont.
- We learn 3-valued Deterministic Finite Automata
[Diagram: automaton with states INITIAL, ERROR, and DON’T KNOW; transitions are labeled with guarded method calls: mod(p, q) [q > 0 && p >= 0], mod(p, q) [q <= 0 || p < 0], div(p, q) [q == 0], div(p, q) [q != 0]]

45 Using 3-Valued DFA
[Diagram: automaton with states INITIAL and ERROR; transitions are labeled with guarded method calls: mod(p, q) [q > 0 && p >= 0], mod(p, q) [q <= 0 || p < 0], div(p, q) [q == 0], div(p, q) [q != 0]]
- Underlying solver returns “Don’t Know”

46 Using 3-Valued DFA cont.
- We learn 3-valued Deterministic Finite Automata
[Diagram: automaton with states INITIAL, ERROR, and DON’T KNOW; transitions are labeled with guarded method calls: mod(p, q) [q > 0 && p >= 0], mod(p, q) [q <= 0 || p < 0], div(p, q) [q == 0], div(p, q) [q != 0]]

47 Definition of k-full Interface
- An interface is k-safe if all legal sequences in the automaton to depth k are also legal executions in the component
- An interface is k-permissive if all illegal sequences in the automaton to depth k also lead to errors in the component
- An interface is k-tight if no sequence to depth k leading to the don’t know state in the automaton can be resolved in the component
- An interface that is k-safe, k-permissive, and k-tight is k-full

48 Guarantees of PSYCO Algorithm
- Theorem: If the behavior of a component C can be characterized by an interface DFA, then PSYCO terminates with a k-full interface for C.
  - Proof is in the SAS paper
- No unbounded loops/recursion in components
- No “mixed parameters”

