Type-Based Flow Analysis: From Polymorphic Subtyping to CFL-Reachability
Jakob Rehof and Manuel Fähndrich
Microsoft Research


Type-Based Program Analysis
Common vocabulary        Type-based
Data access paths        Type structure (×)
Function summary         Function type (->)
Context-sensitivity      Type instantiation, polymorphism (∀)
Directional flow         Subtyping (≤)

GOAL: Scalable Flow Analysis of H.O. Programs w. Polymorphic Subtyping
Type-based, higher-order, context-sensitive (CS), directional (DI)
Precision and cost:
-CS -DI (=,=)
+CS -DI (∀,=)
-CS +DI (=,≤)
+CS +DI (∀,≤)

Outline
Goals
Problems and Results
Current Flow Analysis w. ∀ + ≤
Our Solution
Summary

Current Method
(∀) Polymorphism by copying types
(≤) Subtyping by constrained types
(∀ + ≤) => constraint copying

Problems w. Current Method
Constraint copying is expensive (memory)
Constraint simplification is hard
Previous algorithm (Mossin): O(n^8)
No on-demand algorithms
(n = size of type-annotated program)

Results
No constraint copying
On-demand queries
All flow in O(n^3) (n = size of type-annotated program)


Current Flow Analysis w. ∀ + ≤ (Mossin)
max(s,t) = if s<=t then t else s
standard type: real * real -> real

Current Flow Analysis w. ∀ + ≤
max(s:a,t:b) = (if s<=t then t else s) :c
analysis type: {a ≤ c, b ≤ c} => real:a * real:b -> real:c
a, b, c are flow labels; {a ≤ c, b ≤ c} are the subtyping constraints

Current Flow Analysis w. ∀ + ≤
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
Each use of max gets its own copy of the labels and of the constraints:
max(x0:a0,y0:b0):c0   {a0 ≤ c0, b0 ≤ c0} => c0
max(x1:a1,y1:b1):c1   {a1 ≤ c1, b1 ≤ c1} => c1
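The copying step on this slide can be sketched in Python. This is only an illustration under simple assumptions, not the analysis of Mossin or of the authors: labels are plain strings, a constraint lo ≤ hi is a pair, and the names `instantiate`, `MAX_LABELS` and `MAX_CONSTRAINTS` are hypothetical.

```python
# Hypothetical sketch: instantiating a constrained type copies every
# label and every subtyping constraint once per use site.

def instantiate(labels, constraints, site):
    """Rename each label for one use site and copy each constraint (lo <= hi)."""
    rename = {label: f"{label}{site}" for label in labels}
    copied = [(rename[lo], rename[hi]) for lo, hi in constraints]
    return rename, copied

# max : {a <= c, b <= c} => real:a * real:b -> real:c
MAX_LABELS = ["a", "b", "c"]
MAX_CONSTRAINTS = [("a", "c"), ("b", "c")]

all_constraints = []
for site in range(2):  # the two uses max(x0,y0) and max(x1,y1)
    _, copied = instantiate(MAX_LABELS, MAX_CONSTRAINTS, site)
    all_constraints.extend(copied)

print(all_constraints)
```

The constraint set grows with the number of use sites, which is the memory cost listed under "Problems w. Current Method".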


Without Subtyping:
norm(x:a',y:a') = let m = max(x,y) in scale(x,y,m) end;
scale(z,w,n) = (z/n,w/n)
max(s:a,t:a) = if s<=t then t else s
max : real:a * real:a -> real:a
With equality instead of subtyping, both arguments and the result of max share the label a, so x and y in norm are forced to share the label a'.
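A minimal sketch of why labels collapse without subtyping, assuming the equality-based analysis is implemented by unification over a union-find structure. The class `UnionFind` and the helper `instance_label` are hypothetical, and the name of the shared representative is arbitrary (here it ends up as c rather than the slide's a).

```python
# Hypothetical sketch: without subtyping, every flow "lo <= hi" becomes an
# equation "lo = hi", solved here by union-find unification.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

uf = UnionFind()
# max(s:a, t:b) = (if s<=t then t else s) : c
uf.union("a", "c")  # s may be returned, so a = c
uf.union("b", "c")  # t may be returned, so b = c

def instance_label(label):
    """Instantiate a label of max at the call in norm (primed copy)."""
    return uf.find(label) + "'"

x_label = instance_label("a")  # label of x in norm
y_label = instance_label("b")  # label of y in norm
print(x_label, y_label, x_label == y_label)
```

Even with a fresh instance at the call site, x and y receive the same label, so the analysis can no longer tell their flows apart.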


Flow Analysis Overview
[Diagram: Source Code, Type Instantiation Graph, Flow Graph, connected by two phases A and B; phase labels: Type Inference, Polymorphic Subtyping, CFL-Reachability]

Eliminating constraint copies
max(s:a,t:b) : {a ≤ c, b ≤ c} => real:a * real:b -> real:c
max(x0:a0,y0:b0):c0   {a0 ≤ c0, b0 ≤ c0} => real:a0 * real:b0 -> real:c0
max(x1:a1,y1:b1):c1   {a1 ≤ c1, b1 ≤ c1} => real:a1 * real:b1 -> real:c1

1. Get a graph
max(s:a,t:b) : real:a * real:b -> real:c
max(x0:a0,y0:b0):c0   real:a0 * real:b0 -> real:c0
max(x1:a1,y1:b1):c1   real:a1 * real:b1 -> real:c1

2. Label instantiation sites
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0   real:a0 * real:b0 -> real:c0
j: max(x1:a1,y1:b1):c1   real:a1 * real:b1 -> real:c1

3. Represent substitutions
max(s:a,t:b) : real:a * real:b -> real:c
i: max(x0:a0,y0:b0):c0   real:a0 * real:b0 -> real:c0   a ↦ a0, b ↦ b0, c ↦ c0
j: max(x1:a1,y1:b1):c1   real:a1 * real:b1 -> real:c1   a ↦ a1, b ↦ b1, c ↦ c1

3.a. … as a graph
Each substitution becomes an edge from a label of max to its instance, labeled with the instantiation site:
i: a to a0, b to b0, c to c0
j: a to a1, b to b1, c to c1

4. Eliminate constraint copies!
The constraints a ≤ c, b ≤ c are kept once, on max. The instances at sites i and j carry only the instantiation edges labeled i and j.

? ? ?
With the copies gone, how does flow from a0, b0 reach c0, and from a1, b1 reach c1?

Type Theory to the Rescue!
Polarity (+,-): the argument position of -> is negative (-), the result position is positive (+).

5. Polarities (+,-)
In real:a * real:b -> real:c the argument labels a and b are negative (-) and the result label c is positive (+).
The instantiation edges out of a and b (to a0, b0, a1, b1) are negative; the edges out of c (to c0, c1) are positive.
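The polarity assignment can be sketched as a walk over the type structure that flips the sign at function arguments. The tuple encoding of types and the function `polarities` are assumptions made for this illustration; the sketch also assumes each label occurs only once in the type.

```python
# Hypothetical sketch: compute the polarity of each flow label in a type.
# Types are encoded as tuples:
#   ("real", label), ("prod", left, right), ("fun", argument, result)

def polarities(ty, sign=+1, out=None):
    """Map each label to "+" or "-"; the sign flips under a function argument."""
    if out is None:
        out = {}
    tag = ty[0]
    if tag == "real":
        out[ty[1]] = "+" if sign > 0 else "-"
    elif tag == "prod":
        polarities(ty[1], sign, out)   # product components keep the polarity
        polarities(ty[2], sign, out)
    elif tag == "fun":
        polarities(ty[1], -sign, out)  # argument position: flip
        polarities(ty[2], sign, out)   # result position: keep
    else:
        raise ValueError(f"unknown type constructor: {tag}")
    return out

# max : real:a * real:b -> real:c
MAX_TYPE = ("fun", ("prod", ("real", "a"), ("real", "b")), ("real", "c"))
print(polarities(MAX_TYPE))  # {'a': '-', 'b': '-', 'c': '+'}
```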

6. Reverse negative edges
Negative edges are reversed: a0 to a and b0 to b (site i), a1 to a and b1 to b (site j).
Positive edges keep their direction: c to c0 (site i), c to c1 (site j).
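Steps 3 to 6 can be combined into one small sketch that turns a substitution at a site into directed, labeled flow edges. The function `instantiation_edges` is hypothetical, and the choice to mark reversed (argument) edges with an opening bracket and positive (result) edges with a closing bracket is an assumption that anticipates the bracket labels of step 9.

```python
# Hypothetical sketch: turn the substitution at one instantiation site into
# directed flow edges, reversing the edges of negative labels.

def instantiation_edges(subst, polarity, site):
    """Return (source, target, label) triples for one instantiation site."""
    edges = []
    for generic, instance in subst.items():
        if polarity[generic] == "+":
            # positive: flow leaves the function, generic -> instance
            edges.append((generic, instance, f"]{site}"))
        else:
            # negative: flow enters the function, instance -> generic
            edges.append((instance, generic, f"[{site}"))
    return edges

POLARITY = {"a": "-", "b": "-", "c": "+"}
edges_i = instantiation_edges({"a": "a0", "b": "b0", "c": "c0"}, POLARITY, "i")
edges_j = instantiation_edges({"a": "a1", "b": "b1", "c": "c1"}, POLARITY, "j")
print(edges_i)
print(edges_j)
```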

7. Recover flow
Flow now follows the edges, using the constraints a ≤ c and b ≤ c of max:
site i: a0 to a to c to c0, and b0 to b to c to c0
site j: a1 to a to c to c1, and b1 to b to c to c1

8. Be careful!
A path such as a0 to a to c to c1 enters max at site i and leaves at site j. Spurious!

9. Do CFL-reachability
Edges of site i are labeled [i and ]i, edges of site j are labeled [j and ]j, and the subtyping edges a ≤ c, b ≤ c are labeled d.
CFG: M → [k M ]k
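A minimal sketch of matched-path reachability on the max example. It is a naive fixpoint iteration written for clarity, not the authors' algorithm and without its complexity bound; it covers only the matched nonterminal M (brackets balanced per site), and the function `matched_reachability` and the edge encoding are assumptions.

```python
# Hypothetical sketch: naive fixpoint for matched (M) paths, where
#   M -> [k M ]k | M M | d | empty
# Edges are (source, target, label) with label "d", "[site" or "]site".

def matched_reachability(nodes, edges):
    opens, closes = {}, {}
    matched = {(n, n) for n in nodes}          # M -> empty
    for src, dst, label in edges:
        if label == "d":
            matched.add((src, dst))            # M -> d
        elif label.startswith("["):
            opens.setdefault(label[1:], []).append((src, dst))
        elif label.startswith("]"):
            closes.setdefault(label[1:], []).append((src, dst))
    changed = True
    while changed:
        new = set()
        for (u, v) in matched:                 # M -> M M
            for (v2, w) in matched:
                if v == v2:
                    new.add((u, w))
        for site, open_edges in opens.items(): # M -> [k M ]k, same site k
            for (u, x) in open_edges:
                for (y, v) in closes.get(site, []):
                    if (x, y) in matched:
                        new.add((u, v))
        changed = not new <= matched
        matched |= new
    return matched

NODES = ["a", "b", "c", "a0", "b0", "c0", "a1", "b1", "c1"]
EDGES = [
    ("a", "c", "d"), ("b", "c", "d"),                        # a <= c, b <= c
    ("a0", "a", "[i"), ("b0", "b", "[i"), ("c", "c0", "]i"),  # site i
    ("a1", "a", "[j"), ("b1", "b", "[j"), ("c", "c1", "]j"),  # site j
]
flow = matched_reachability(NODES, EDGES)
print(("a0", "c0") in flow, ("a0", "c1") in flow)  # True False
```

The path from a0 to c0 opens and closes at site i and is accepted, while the path from a0 to c1 mixes sites i and j and is rejected, which is the spurious flow from step 8.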

Further Issues
Polymorphic type structure
Recursive type structure
– context-sensitive data-dependence analysis is uncomputable [Reps 00]
– our techniques require finite types
– regular unbounded data types handled via finite approximations: recursive type expressions

One-level implementation
GOLF analysis system for C by Manuvir Das (MSR) and Ben Liblit (Berkeley)
Exhaustive points-to sets for MS Word 97, 1.4 Mloc, in 2 minutes


Elimination of constraint copying
Reformulation of polymorphic subtyping with instantiation constraints
Transfer of CFL-reachability techniques to type-based flow analysis

Scalable Program Analysis Project (MSR, spt)
-CS -DI (=,=)
+CS -DI (∀,=)
-CS +DI (=,≤)
+CS +DI (∀,≤)
[ RF, POPL 01 ] [ Das, PLDI 00 ] [ FRD, PLDI 00 ]
research.microsoft.com/spa

Summary
Type-based flow analysis
– all flow in O(n^3), n = typed pgm size
– context-sensitive (polymorphism)
– directional (subtyping)
– demand-driven algorithm
– incorporates label-polymorphic recursion
– works directly on H.O. programs
– structured data of finite type
– unbounded data structures via approx.

CFL Formulation
S → P N
P → M P | [ P | ε
N → M N | ] N | ε
M → [k M ]k | M M | d | ε

Type System
e = let max(s:a,t:b) = … in (max(x0:a0,y0:b0), max(x1:a1,y1:b1)) end
instantiation constraints: a ↦ a0, b ↦ b0, c ↦ c0, a ↦ a1, b ↦ b1, c ↦ c1
subtyping constraints: a ≤ c, b ≤ c
Judgment: (instantiation constraints); (subtyping constraints); (type environment) |- e : c0*c1
