# Chair of Software Engineering The alias calculus Bertrand Meyer ITMO Software Engineering Seminar June 2011.

## Presentation on theme: "Chair of Software Engineering The alias calculus Bertrand Meyer ITMO Software Engineering Seminar June 2011."— Presentation transcript:

Chair of Software Engineering The alias calculus Bertrand Meyer ITMO Software Engineering Seminar June 2011

Claims Theory:  Theory of aliasing  Loss of precision is small  New concepts, in particular inverse variables  Abstract, does not mention stack & heap  Simple, implementable  Insights into the essence of object-oriented programming Practice:  Alias calculus  Almost entirely automatic  Implemented 2

Reference Steps Towards a Theory and Calculus of Aliasing International Journal of Software and Informatics July 2011 http://se.ethz.ch/~meyer/publications/aliasing/alias- revised.pdf 3

The question under study Given expressions e and f (of reference types) and a program location p : At p, can e and f ever be attached to the same object? 4 (If so, we say that e and f are aliased to each other, meaning potentially aliased.) e f

“Given” expressions only Consider from y := x loop y := y a end 5 x a a a a y may become aliased to: x, x a, x a a, x a a a etc. (infinite set of expressions!)

An example of alias analysis y x  Consider two linked list structures known through x and y: right item Computing the alias relation shows that:  If x ≠ y, then no cell reachable from x ( or ) can be reached from y ( or ), and conversely  Without this assumption, such aliasing is possible 6

Why alias analysis is important 1. Without it, cannot apply standard proof techniques to programs involving pointers 2. Concurrent program analysis, in particular deadlock 3. Program optimization 7 -- y. a = b x. set_a (c) ? a x y set_a (c) b c -- x. a = c -- c = c, i.e. True Understand as x. a := c -- y. a = b

Basic notion Definition: Not necessarily transitive: if c then x := y else y := z end 8 A binary relation is an alias relation if it is symmetric and irreflexive Can alias x to y and y to z but not x to z

Formulae of interest The calculus defines, for any instruction p and any alias relation a,the value of a » p denoting: The aliasing relation resulting from executing p from an initial state in which the aliasing relation is a For an entire program: compute  » p 9

The programming language  skip  create x  x := y  forget x  (p ; q)  then p else q end 10 Eiffel: x := Void Java etc.: x = null; p, q, …: instructions x, y, …: variables cut x, y  p n -- for integer n  loop p end  r do p end  call r  z := x y  x r …  Current E0: basic constructs E1: cut E4: O-O E2: loops E3: procedures

Describing an alias relation If r is a relation in E  E, the following is an alias relation: r (r  r -1 ) ― Id [E] Example: {[x, x], [x, y], [y, z]} = Generalized to sets: {x, y, z} = =  11 Set difference Identity on E Set of binary relations on E; formally: P (E x E) {[x, y], [y, x], [y, z], [z, y]}  {[x, y], [y, x], [x, z], [z, x], [y, z], [z, y]} “Complete” alias relation

Canonical form & alias diagrams Canonical form of an alias relation: union of complete alias relations, e.g., meaning None of the sets of expressions is a subset of another 12 x, y, y, z, x, u, v x u, v y x y, z Make it canonical: x,x, x, y y {x, y}  { y, z}  {x, u, v} (not canonical) An alias diagram: yy

Alias calculus for basic operations (E0) a » skip = a a » (then p else q end ) = (a » p)  (a » q) a » (p ; q)= (a » p) » q a » (forget x) = a \- {x} a » (create x) = a \- {x} a » (x := y) = a [x: y] 13 a deprived of all pairs involving x e.g. x, y, y, z, x, u, v \- {x, u} = y, z, u, v Override, see next

The forget rule a » (forget x) = a \- {x} 14 y x, y, z x, u, v x, y a deprived of all pairs involving x

The assignment rule (E0) a » (x := y) = a [x: y] with: a [x: y] =given b = a \- {x} then b  ({x}  (b / y)) end 15  Symmetrize and de-reflect a deprived of all pairs involving x All pairs [x, u] where u is either aliased to y in b or y itself

Operations on alias relations For an alias relation a in E  E, an expression x, and a set of expressions A  E, the following are alias relations: r \– A =r — E x A a / y ={z: E | (z = y)  [y, z]  a} 16  “Quotient”, similar to equivalence class in equivalence relation  “Minus” Set of all expressions

The assignment rule (E0) All u aliased to y in b, plus y itself a [x: y] =given b = a \- {x} then b  ({x}  ( b / y )) end 17  Symmetrize and de-reflect a deprived of all pairs involving x All pairs [x, u] where u is either aliased to y in b or y itself Value of a » (x := y)

z := x Assignment example 1 18 x u, v x,, y Before, z After

Assignment example 2 19 x u, v x,, y x x := u Before After

Assignment example 3 20 x u, v x,, y x x, z x, x := z Before After

The assignment rule (E0) a [x: y] =given b = a \- {x} then b  ({x}  (b / y)) end 21  Value of a » (x := y)

The cut instruction E1 is E0 plus the instruction cut x, y Semantics: remove aliasing, if any, between x and y E0: basic constructs E1: cut 22

Cut example 1 23 x u, v x,, y x cut x, y Before After

Cut example 2 24 x, y u, v x, x, v cut x, u Before After

Cut rule a » cut x, y= a ― x, y 25 Set difference

The role of cut cut x, y informs the alias calculus with non-alias properties coming from other sources Example: if m < n then x := u else x := y end m := m + 1 if m < n then z := x end But here x cannot be aliased to y (only to u). The alias theory does not know this property! To take advantage of it, add the instruction This expression represents check x /= y end (Eiffel) assert x != y ;(JML, Spec#) 26 Alias relation:  x, u, z, x, y, z x, u, x, y cut x, y;

Introducing repetitions E2 is E1 plus:  p n (for integer n): n executions of p (auxiliary notion)  loop p end : any sequence (incl. empty) of executions of p 27 E0: basic constructs E1: cut E2: loops

E2 alias calculus a » p 0 = a a » p n+1 = (a » p n ) » p-- For n  0 -- Also equal to (a » p) » p n a » (loop p end)=  (a » p n ) 28 n  Nn  N

Loop aliasing theorem (1) For any a and p, there exists a constant N  N such that a » (loop p end)=  (a » p n ) Proof : the sequence s n =  (a » p k ) is non-decreasing (with respect to inclusion) on a finite set More generally, for every construct p of E2, the function a | (a » p) is non-decreasing 29 n: 0 N k: 0 n a » (loop p end) =  (a » p n ) n  Nn  N

Loop aliasing theorem (2) a » (loop p end) is also the fixpoint of the sequence t 0 = a t n+1 = t n  (t n » p) Gives a practical way to compute a » (loop p end) Proof: by induction. If s n is original sequence  (a » p n ), prove separately s n  t n and t n  s n 30 k: 0 n

Introducing procedures: E3 A program is now a sequence of procedure definitions (one designated as main): r i (f)do p i end Instructions: as before, plus call r i (  ) -- Procedure call 31 E0: basic constructs E1: cut E2: loops E3: procedures Alias calculus notations:  r denotes body of r (i.e. r i = p i )  r denotes formals of r (here f)

Handling arguments The calculus will treat call r (  ) as r :=  ; call r (With recursion, possible loss of precision) 32 Generalize notation a [x: y] to lists: use a [  :  ] as abbreviation for (…((a [  1 :  1 ])[  2 :  2 ]) …[  n :  n ] For example: a [r :  ] i.e. formal 1 := actual 1 ;… ; formal n := actual n

Call rule Without arguments: a » call r = a » r 33 Formal arguments of r With arguments: a » call r (v) = a [r :  ] » r Body of r

Using the call rule 34 a » call r (  ) = a [r :  ] » r Because of recursion, no longer just definition but equation For entire set of procedures P, this gives a vector equation a » P= AL (a » P) Interpret as fixpoint equation and solve iteratively (Fixpoint exists: increasing sequence on finite set)

Handling an actual O-O language  Replace every routine argument by a variable (ensure unique names)  Replace every function r by a procedure, and the associated Result by a variable Result r Loss of precision in the case of recursion 35

Example translation Eiffel: consider a call with r (t: T; u: U): T do e := t f := u Result := f end 36 E3 Translation:c with r do e := t r f := u r Result r := f end t r := a u r := b call r r (a, b)

Object-oriented mechanisms: E4 “General relativity”:  1. Qualified expressions: x y Can be used as source (not target!) of assignments x := y z  2. Qualified calls: call x r (v)  3. Current 37 E0: basic constructs E1: cut E4: O-O E2: loops E3: procedures

Assignment (original rule) a [x: y] =given b = a \- {x} then b  ({x}  (b / y) ) end 38  a deprived of all pairs involving x This includes [x, y] ! All pairs [x, u] where u is either aliased to y in b or y itself All u aliased to y in b, plus y itself Example: x := y z Value of a » (x := y)

Assigning a qualified expression := x y 39 x y x x z x does not get aliased to x y! (only to any z that was aliased to x y) x := x y

Assignment rule revisited a [x: y] =given b = a \– {x} then b  ({x}  (b / y)) end 40  a deprived of all pairs involving x or an expression starting with x Example: x := y z Value of a » (x := y)

Alias diagrams (E0 to E3) 41 x u, v y, z x,, y Source node Value nodes Single source node (represents stack) Each value node represents a set of possible run-time values Links: only from source to value nodes (will become more interesting with E4!) Edge label: set of variables; indicates they can all be aliased to each other

Alias diagrams (E4) := x y 42 x y x x z Links may now exist between value nodes (now called object nodes) Cycles possible (see next) Source node Value nodes Object nodes

New laws, inverse variables x Current = x Current x = x x’ x = Current x x’ = Current Current’ = Current 43

New form of call: qualified In E4: call x r (a, b, …) 44

Distribution operator: For a list  = : x  = For a relation r in E  E : x r= {[x u, x v] | [u, v]  r} Example: x ( u, v, w, u, y ) = x u, x v, x w, x u, x y 45

Handling arguments (unqualified call) a » call r (  ) = a [r :  ] » r 46

Handling arguments: unqualified call a » call r (  ) = a [ ` r :  ] » call r ) 47 Was written r

Handling arguments: qualified call a » call x r (  ) = a [ x r :  ] » call x r ) Treat call x r (v) as x formals :=  ; call x r 48 x x’x’ Current target

Handling arguments: an example Eiffel: x r (a, b) With, in a class C: r (t: T ; u: U) Handled as: x t := a x u := b call x r x x’x’ Current target

Without arguments: unqualified call rule a » call r = a » r 50 Body of r

Qualified call rule a » call x r = x ((x ’ a) » r) 51 Example: d := c x r (d) with r (u: U) do v := u end Handled as: d := c call with r do v := u end u := x ’ d x r x x’x’ c u, v, d target Current Inverse variable

The rule in action a » call x r = x ((x ’ a) » r) 52 Alias relation: c, d x ’ c, x ’ d Prefix with x ’ : u, x ’ c, x ’ d d := c call with r do v := u end u := x ’ c x r v, u, x ’ c, x ’ d Prefix with x : x v, x u, c, d x x’x’ c u, x ’ c, x ’ d v, x x x x c, d Current c x ’ c, x c, x ’ c, x ’ d x d u, v, x x, d target Current x’x’

Seen from the remote side a » call x r = x ((x ’ a) » r) 53 Alias relation: c, d x ’ c, x ’ d Prefix with x ’ : u, x ’ c, x ’ d E4 version: d := c call with r do v := u end u := x ’ c x r v, u, x ’ c, x ’ d Prefix with x : x v, x u, c, d x x’x’ c u,, x ’ c, x ’ d v, x x x x c, d x ’ c, x c, x ’ c, x ’ d x d u,u, v, x x, d target Current x’x’ x u, x v

About the qualified call rule a » call x r = x ((x ’ a) » r) Thus we are permitted to prove that the unqualified call creates certain aliasings, on the assumption that it starts in its own alias environment but has access to the caller’s environment through the inverted variable, and then to assert categorically that the qualified call has the same aliasings transposed back to the original environment. This change of environment to prove the unqualified property, followed by a change back to the original environment to prove the qualified property, explains well the aura of magic which attends a programmer's first introduction to object-oriented programming. 54

The full qualified call rule As two separate rules:  a » call x r = x ((x ’ a) » r)  a » call x r (  ) = a [x r :  ] » call x r ) As a single rule: a » call x r (  ) = x (x ’ a [ r : x ’  ]) » r) \– x r 55 Hide internals of r

Termination? The original termination argument does not hold any more Consider from y := x loop y := y a end y may become aliased to: x, x a, x a a, x a a a etc. (infinite set of expressions!) 56 x a a a a

Termination: the question under study Given expressions e and f (of reference types) and a program location p : At p, can e and f ever be attached to the same object? 57

The alias calculus a » skip = a a » (then p else q end)= (a » p)  (a » q) a » (p ; q)= (a » p) » q a » (forget x)= a \- {x} a » (create x)= a \- {x} a » (x := y) = a [x: y] a » cut x, y= a – x, y a » p 0 = a a » p n+1 = (a » p n ) » p a » (loop p end)=  (a » p n ) a » call r (  )= (a [ r :  ]) » r a » call x r (  ) = x (x’ (a [x r :  ]) » r) \– x r 58 n  Nn  N Plus: x Current = x Current x = x x’ x = Current x x’ = Current Current’= Current a [x: y] = given b = a\- {x} then b  ({x} x (b/y)) end

There is no backward alias calculus Consider create y create z x := y 59 x, y ? x := z x, z ?

Notation Targets of an instruction p: p  This is the set of variables that p may modify e.g. (x :=y ; y := z)  = {x, y} 60

Semantics of the alias calculus Let Var be the set of variables and a an alias relation, the following assertion expresses that there is no aliasing except as implied by a: Weak soundness of definition of » for an instruction p: Soundness has less demanding precondition but assumes some white-box knowledge: 61 a –  x ≠ y =  [x, y]  (Var  Var) – Id – a {(a ) – } p {(a » p) – } We may ignore variables modified by p \- p 

There is no backward rule! Consider the definition of weak soundness: 62 It is possible to reconstruct a from a –, but not from a – \- p  (Same for strong soundess) {(a) – } p {(a » p) – }

Why alias analysis is important 1. Without it, cannot apply standard proof techniques to programs involving pointers 2. Concurrent program analysis, in particular deadlock 3. Program optimization 63

Coffman deadlock Consider a set of processors and a set of resources. At every execution time t, for every processor p, two disjoint sets of resources are defined:  H t (p)-- Has set: resources that p has acquired  W t (p)-- Wait set: resources that p has requested A deadlock exists if for some set D of processors:  p: D |  p’ : D | W t (p)  H t (p’) ≠  (In such a case p ≠ p’) 65 Not the same thing as reverse of liveness

Absolute deadlock freedom A system, made of a set of program elements E, is absolutely deadlock-free if for all times t  r: E |  r’ : E | W t (r)  H t (r’) =  (Works for r = r’ since W t (r)  H t (r) =  ) Can also be written:  r: E |  r’ : E |   : W t (r) |   H t (r’) 66

Strategy for detecting deadlock For every program element r:  Compute W (r) and H (r)  Determine with which other elements r’ it can run parallel 67  r, r’: E | r // r’  (W (r)  H (r’) =  )

The SCOOP model A primary characteristic of SCOOP is that the model removes the distinction between resources and processors Properties:  Any processor p is such that p  H (p)  Any variable or expression e has an associated processor, its handler  Any execution of a qualified call x  f (a, b, …) (separate or not) satisfies x  H ( )  For any call r of actual arguments args including uncontrolled arguments U, W (r) is { | a  U}  Without lock passing, H (r) for any program element r in a routine or formal separate arguments S is  { | a  S} 68

Processor abstraction In SCOOP:  Any variable or expression e has an associated processor, its handler The computation of H and W sets only involves the handler Strategy:  Processor abstraction: identify every variable or expression e with its processor  Perform alias analysis  Use it to compute H and W sets as unions for all possible aliases  Check W (r)  H (r’) =  for any two calls r, r’, including when they are the same call 69

Dining philosophers class MEAL create make feature p1, p2: separate PHILOSOPHER ; f1, f2: separate FORK make do create f1; create f2 create p1  make (f1, f2); create p2  make (f2, f1) end go_right (a, b: separate PHILOSOPHER) do p1  eat_right; p2  eat_right end go_wrong (a, b: separate PHILOSOPHER) do p1  eat_wrong; p2  eat_wrong end end class PHILOSOPHER create make feature left, right: separate FORK make (u, v: separate FORK) do left:= u ; right := v end eat_right do pick_two (left, right) end pick_two (x, y: separate FORK) do x  use; y  use end eat_wrong do pick_in_turn (left) end pick_in_turn (z: separate FORK) do pick_two (z, right) end end 70 H = {x, y},  H = {f1, f2, …} W =  H = {z}, W = {right},  H = {f1, f2, …},  W = {f1, f2, …}

Claims Theory:  Theory of aliasing  Loss of precision is small  New concepts, in particular inverse variables  Abstract, does not mention stack & heap  Simple, implementable  Insights into the essence of object-oriented programming Practice:  Alias calculus  Almost entirely automatic  Implemented 71

Approaches for comparison Separation logic Shape analysis (with abstract interpretation) Ownership Dynamic frames 72

73 Hoare-style reasoning Assignment rule: {P (e)} x := e {P (x)}

74 Hoare-style reasoning require do := whatever + 10000 y := y + 1 ensure end y < 3 -- y + 1 < 3 x x y + 1 < 3 -- y + 1 < 3 Assignment rule: {P (e)} x := e {P (x)} require ensure y < 3

75 The effect of pointers (references) -- y. a = b x. set_a (c) ? a x y set_a (c) b c -- x. a = c -- True Understand as x. a := c -- y. a = b

The question under study Given expressions e and f (of reference types) and a program location p : At p, can e and f ever be attached to the same object? 76

77 The effect of pointers : with alias analysis x. set_a (c) -- y. a = b a x y set_a (c) b c -- x, y -- x. a = b Understand as x. a := c --. c = b -- x, y x may be aliased to y

Alias relations Relation of interest: “In the computation, e might become aliased to f” Definition: Not necessarily transitive: if c then x := y else y := z end 78 A binary relation is an alias relation if it is symmetric and irreflexive Can alias x to y and y to z but not x to z

Alias diagrams (E0 to E3) 79 x u, v y, z x,, y Source node Value nodes Single source node (represents stack) Each value node represents a set of possible run-time values Links: only from source to value nodes (will become more interesting with E4!) Edge label: set of expressions; indicates they can all be aliased to each other In canonical form: no label is subset of another; each label has at least 2 expressions

Approaches for comparison Separation logic Shape analysis Ownership Dynamic frames 80

Achievements Theory of aliasing Simple (about a dozen rules) New concepts: inverse variables, modeling Current Graphical formalism (alias diagrams), canonical form Implemented Almost entirely automatic (except for occasional cut) Small loss of precision, i.e. not too conservative Abstract: does not mention stack and heap Covers object-oriented programming Faithful to O-O spirit; see qualified call rule Can cover full modern O-O language Potential solution to “frame problem” 81 a » call x f = x ((x ’ a) » call f)

Download ppt "Chair of Software Engineering The alias calculus Bertrand Meyer ITMO Software Engineering Seminar June 2011."

Similar presentations