Course Outline Traditional Static Program Analysis –Classic analyses and applications Software Testing, Refactoring Dynamic Program Analysis.

Course Outline Traditional Static Program Analysis –Classic analyses and applications Software Testing, Refactoring Dynamic Program Analysis

Announcements I am setting up a new page for the class at www.rpi.edu/~milana2/csci6961 Email: milana2@rpi.edu, milanova@cs.rpi.edumilana2@rpi.edu

Outline Data-flow frameworks –The “Maximal Fixed Point” (MFP) solution –The “Meet Over all Paths” (MOP) solution Analysis of object references –Class Hierarchy Analysis (CHA) –Rapid Type Analysis (RTA)

The MOP Solution 1. x:=a*b 2. if y<=a*b 3. a:=a+1 4. x:=x*b 5. goto 2 The MOP at entry of n is V f p (init(ρ)) The MOP over-approximates run-time dataflow facts. The MOP is the best summary of dataflow facts. p in paths from ρ to n MOP at entry of 3: f 2 (f 1 (Ø)) U f 2 (f 5 (f 4 (f 3 (f 2 (f 1 (Ø)))))) U f 2 (f 5 (f 4 (f 3 (f 2 (f 5 (f 4 (f 3 (f 2 (f 1 (Ø)))))))))) U … = {(x,1),(x,4),(a,3)} This MOP over-approximates the reaching definitions at entry of 3: E.g., suppose that at the beginning y=1, a=1 and b=2. The actual reaching definitions at entry of 3: {(x,1)} !!! T F

The MFP Solution 1. x:=a*b 2. if y<=a*b 3. a:=a+1 4. x:=x*b 5. goto 2 The MFP at entry of 3 is the in(3) obtained as a solution of the following equations through fixed-point iteration: in(1) = Ø out(1) = f 1 (in(1)) T F in(2) = out(1) U out(5) out(2) = f 2 (in(2)) in(3) = out(2) out(3) = f 3 (in(3)) in(4) = out(3) out(4) = f 4 (in(4)) in(5) = out(4) out(5) = f 5 (in(5)) = in(5)

The MOP and MFP Solutions {} {(x,1)}{(x,4)}{(a,3)} {(x,1),(x,4)}{(x,4),(a,3)}{(x,1),(a,3)} {(x,1),(x,4),(a,3)} 1. x:=a*b 2. if y<=a*b 3. a:=a+1 4. x:=x*b 5. goto 2 0 1 in(2), in(3), in(4) in(1) in(2), in(3), in(5) in(4) in(5)

MOP vs. MFP For distributive functions the dataflow analysis can merge paths (p1, p2), without loss of precision! –E.g., f p1 (0) need not be calculated explicitly –MFP=MOP Due to Kam and Ullman, 1976,1977: This is not true for monotone functions –MFP≥MOP. In general, MOP is undecidable A solution S, S≥MOP, is an unsafe solution –Other terms: unsafe, incorrect, unsound solution

Many Applications! White-box testing: compute coverage Regression testing Reverse engineering Restructuring: automated refactoring Static debugging –Memory errors –Concurrency bugs

Analysis of object references Analysis of object-oriented programs –Java Class Analysis problem: Given a reference variable x, what are the classes of the objects that x refers to at runtime? Points-to Analysis problem: Given a reference variable x, what are the objects that x refers to at runtime?

Example: BoolExp hierarchy class BoolExp { public: BoolExp(); virtual bool Evaluate(Context&)=0; }; class Constant : public BoolExp { public: Constant(bool); virtual bool Evaluate (Context&); private: bool _constant; }; Constant::Constant(bool c) { _constant = c; } bool Constant::Evaluate(Context& aContext) { return _constant; } class VarExp : public BoolExp { public: VarExp(char *); virtual bool Evaluate (Context&); private: char* _name; }; VarExp::VarExp(char * n) { _name = n; } bool VarExp::Evaluate(Context& aContext) { return aContext.Lookup(_name); }

Example: BoolExp hierarchy class AndExp : public BoolExp { public: AndExp(BoolExp*, BoolExp*); virtual bool Evaluate (Context&);NOTE: NEED DESTRUCTORS!!! private: BoolExp* _operand1; BoolExp* _operand2; }; AndExp::AndExp(BoolExp* op1, BoolExp* op2) { _operand1=op1; _operand2=op2; } bool AndExp::Evaluate(Context& aContext) { return _operand1->Evaluate(aContext) && _operand2->Evaluate(aContext); } class OrExp : public BoolExp { public: OrExp(BoolExp*, BoolExp*); virtual bool Evaluate (Context&); private: BoolExp* _operand1; BoolExp* _operand2; }; OrExp::OrExp(BoolExp* op1, BoolExp* op2) { _operand1=op1; _operand2=op2; } bool OrExp::Evaluate(Context& aContext) { return _operand1->Evaluate(aContext) || _operand2->Evaluate(aContext); }

A client of the BoolExp hierarchy main() { Context theContext; VarExp* x = new VarExp(“X”); VarExp* y = new VarExp(“Y”); BoolExp* exp = new AndExp( new Constant(true), new OrExp(x, y) ); theContext.Assign(x, true); theContext.Assign(y, false); bool result = exp->Evaluate(theContext); }

Java Example: BoolExp hierarchy public abstract class BoolExp { public boolean Evaluate(Context c); }; public class Constant extends BoolExp { private boolean _constant; public boolean Evaluate(Context c) { return _constant; } public class VarExp extends BoolExp { private String _name; public boolean Evaluate(Context c) { return c.Lookup(_name); }

Java Example: BoolExp hierarchy public class AndExp extends BoolExp { private BoolExp _operand1; private BoolExp _operand2; public AndExp(BoolExp op1, BoolExp op2) { _operand=op1; _operand2=op2; } public boolean Evaluate(Context c) { return _operand1.Evaluate(c) && _operand2.Evaluate(c); } public class OrExp extends BoolExp { private BoolExp _operand1; private BoolExp _operand2; public OrExp(BoolExp op1, BoolExp op2) { _operand=op1; _operand2=op2; } public boolean Evaluate(Context c) { return _operand1.Evaluate(c) || _operand2.Evaluate(c); }

A client of the BoolExp hierarchy in Java main() { Context theContext; VarExp x = new VarExp(“X”); VarExp y = new VarExp(“Y”); BoolExp exp = new AndExp( new Constant(true), new OrExp(x, y) ); theContext.Assign(x, true); theContext.Assign(y, false); boolean result = exp.Evaluate(theContext); } exp: {AndExp} That is: At runtime exp may refer to (i.e., may point to) an object of class AndExp, but may not refer to an object of class OrExp!

Java Example: BoolExp hierarchy public class AndExp extends BoolExp { private BoolExp _operand1; private BoolExp _operand2; public AndExp(BoolExp op1, BoolExp op2) { _operand=op1; _operand2=op2; } public boolean Evaluate(Context c) { return _operand1.Evaluate(c) && _operand2.Evaluate(c); } public class OrExp extends BoolExp { private BoolExp _operand1; private BoolExp _operand2; public OrExp(BoolExp op1, BoolExp op2) { _operand=op1; _operand2=op2; } public boolean Evaluate(Context c) { return _operand1.Evaluate(c) || _operand2.Evaluate(c); } _operand1: {Constant}_operand2: {OrExp} _operand1: {VarExp} _operand2: {VarExp}

Class information: applications Compilers: can we devirtualize a virtual function call x.m()/x->m()? Software engineering –The calling relations in the program: call graph –Testing –Most interesting analyses require this information

Some terminology Intraprocedural analysis –So far, we assumed there are no procedure calls! –Analysis that works within a procedure and approximates (or does not need) flow into and from procedures Interprocedural analysis –Takes into account procedure calls and tracks flow into and from procedures –Many issues: Parameter passing mechanisms Context Call graph! Functions as parameters! –We will get back to this in a few classes…

Scalability For most analyses (including class analysis) we need interprocedural analysis on very large programs Can the analysis handle large programs? –100K LOC, up to 45M LOC? Approximations of standard fixed point iteration –Reduce Lattice –Reduce CFG –Make transfer functions converge faster –Other…

Today’s class Some simple interprocedural class analyses Class analysis: Given a reference variable x, what are the classes of the objects that x refers to at runtime? Class Hierarchy Analysis (CHA) Rapid Type Analysis (RTA)

Class Hierarchy Analysis (CHA) The simplest method of inferring information about reference variables –Look at the class hierarchy In Java, if a reference variable r has a type A, the possible classes of run-time objects are included in the subtree of A. Denoted by cone(A). –At virtual call site r.m find the methods that may be called based on the hierarchy information J. Dean, D. Grove, and C. Chambers, Optimization of OO Programs Using Static Class Hierarchy Analysis, ECOOP’95

Example public class A { public static void main() { A a; D d = new D(); E e = new E(); if (…) a = d; else a = e; a.f(); } … } public class B extends A { public void foo() { G g = new G(); … } // there are no other creation sites // or calls in the program f() A BC GDE

Example A BC GDE f() public class A { public static void main() { A a; D d = new D(); E e = new E(); if (…) a = d; else a = e; a.f(); } … } public class B extends A { public void foo() { G g = new G(); … } … } // there are no other creation sites // or calls in the program The solution for reference variables by CHA is: a may refer to objects of classes {A,B,C,D,E,G}, d may refer to objects of class {D}, e may refer to objects of class {E}, and g to {G}. Cone(C) f()

Example public class A { public static void main() { A a; D d = new D(); E e = new E(); if (…) a = d; else a = e; a.f(); } … } public class B extends A { public void foo() { G g = new G(); … } … } // there are no other creation sites // or calls in the program main A.fB.fC.fG.f A BC GDE f() a.f():

Example: Applies-to Sets public class A { public static void main() { A a; D d = new D(); E e = new E(); if (…) a = d; else a = e; a.f(); } … } public class B extends A { public void foo() { G g = new G(); … } … } // there are no other creation sites // or calls in the program main A.fB.fC.fG.f A BC GDE f() a.f(): Applies-to sets: A.f = {A}; B.f = {B}; G.f = {G}; C.f = {C,D,E}

Observations on CHA Do we need to resolve the class of the receiver uniquely in order to devirtualize a call? Applies-to set for each method –At a call site r.f(), take the set of possible classes for the receiver r; intersect this set with each possible method’s applies-to set. –If only one method’s set has a non-empty intersection, then invoke the method directly. –Otherwise, the call cannot be resolved.

Rapid Type Analysis Improves on Class Hierarchy Analysis Interleaves construction of the call graph with the analysis (known as on-the-fly call graph construction) Only expands calls if it has seen an instantiated object of appropriate type Makes assumption that the whole program is available! David Bacon and Peter Sweeney, “Fast Static Analysis of C++ Virtual Function Calls”, OOPSLA ‘96

Example public class A { public static void main() { A a; D d = new D(); E e = new E(); if (…) a = d; else a = e; a.f(); } … } public class B extends A { public void foo() { G g = new G(); … } // there are no other creation // sites or calls in the // program RTA starts in main; Sees D, and E are instantiated; Expands a.f() into C.f() only. Never reaches B.foo() and never sees G instantiated. main A.fB.fC.fG.f

RTA Keeps two sets, I (the set of instantiated classes), and R (the set of reachable methods) Starts from main, I = {}, R = {main} Analyze calls in reachable methods: r.f() –Finds potential targets according to CHA: X.f, Y.f, etc. –If Applies-to(X.f) intersects with I, make X.f a real target, and add X.f to R Analyze instantiation sites in reachable methods: r = new A() –Add A to I –Find all analyzed calls r.f() with potential targets X.f triggered by A (i.e., A in Applies-to(X.f) at r.f()). Make X.f a real target, and add X.f to R.

Example (continued) public class A { public static void main() { A a; D d = new D(); E e = new E(); if (…) a = d; else a = e; a.f(); } … } public class B extends A { public void foo() { G g = new G(); … } // there are no other creation // sites or calls in the // program main A.fB.fC.f G.f {A}{B} {C,D,E}{G}

Comparisons class A { public : virtual int foo() { return 1; }; }; class B: public A { Public : virtual int foo() { return 2; }; virtual int foo(int i) { return i+1; }; }; void main() { B* p = new B; int result1 = p->foo(1); int result2 = p->foo(); A* q = p; int result3 = q->foo(); } Bacon-Sweeny, OOPSLA’96 CHA resolves result2 call uniquely to B.foo(); however, it does not resolve result3. RTA resolves result3 uniquely because only B has been instantiated.

Type Safety Limitations CHA and RTA assume type safety of the code they examine! //#1 void* x = (void *) new B; B* q = (B*) x; //a safe downcast int case1 = q->foo() //#2 void* x = (void *) new A; B* q = (B*) x; //an unsafe downcast int case2 = q->foo()//probably no error //#3 void* x = (void *) new A; B* q = (B *) x; //an unsafe downcast int case3 = q->foo(66);//run-time error A B foo() foo() foo(int)

Course Outline Traditional Static Program Analysis –Classic analyses and applications Software Testing, Refactoring Dynamic Program Analysis.

Similar presentations

Presentation on theme: "Course Outline Traditional Static Program Analysis –Classic analyses and applications Software Testing, Refactoring Dynamic Program Analysis."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Course Outline Traditional Static Program Analysis –Classic analyses and applications Software Testing, Refactoring Dynamic Program Analysis.

Similar presentations

Presentation on theme: "Course Outline Traditional Static Program Analysis –Classic analyses and applications Software Testing, Refactoring Dynamic Program Analysis."— Presentation transcript:

Similar presentations

About project

Feedback