Presentation is loading. Please wait.

Presentation is loading. Please wait.

Refactoring Functional Programs Simon Thompson with Huiqing Li Claus Reinke www.cs.kent.ac.uk/projects/refactor-fp.

Similar presentations


Presentation on theme: "Refactoring Functional Programs Simon Thompson with Huiqing Li Claus Reinke www.cs.kent.ac.uk/projects/refactor-fp."— Presentation transcript:

1 Refactoring Functional Programs Simon Thompson with Huiqing Li Claus Reinke

2 AFP042 Session 2

3 AFP043 Overview Review mini-project. Implementation of HaRe. Larger-scale examples. Case study.

4 AFP044 Mini-project feedback Refactorings performed. Refactorings and language features? Machine support feasible? Useful? ‘Not-quite’ refactorings? Support possible here?

5 AFP045 Examples Argument permutations (NB partial application). (Un)group arguments. Slice function for a component of its result. Error handling / exception handling.

6 AFP046 More examples Introduce type synonym, selectively. Introduce ‘branded’ type. Modify the return type of a function from T to Maybe T, Either T S, [T]. Ditto for input types … and modify variable names correspondingly.

7 AFP047 Implementing HaRe

8 AFP048 Proof of concept … To show proof of concept it is enough to: build a stand-alone tool, work with a subset of the language, pretty print the results of refactorings.

9 AFP049 … or a useful tool? Integrate with existing program development tools: stand-alone program links to editors emacs and vim, any other IDEs also possible. Work with the complete language: Haskell 98? Preserve the formatting and comments in the refactored source code. Allow users to extend and script the system.

10 AFP0410 The refactorings in HaRe Rename Delete Lift / Demote Introduce definition Remove definition Unfold Generalise Add / remove params All these refactorings are module aware. Move def between modules Delete /add to exports Clean imports Make imports explicit Data type to ADT

11 AFP0411 The Implementation of HaRe Information gathering Pre-condition checking Program transformation Program rendering

12 AFP0412 Information needed Syntax: replace the function called sq, not the variable sq …… parse tree. Static semantics: replace this function sq, not all the sq functions …… scope information. Module information: what is the traffic between this module and its clients …… call graph. Type information: replace this identifier when it is used at this type …… type annotations.

13 AFP0413 Infrastructure: decisions Build a tool that can interoperate with emacs, vim, … yet act separately. Leverage existing libraries for processing Haskell 98, for tree transformation … as few modifications as possible. Be as portable as possible, in the Haskell space. Abstract interface to compiler internals?

14 AFP0414 Haskell landscape (end 2002) Parser:many Type checker:few Tree transformations:few Difficulties Haskell 98 vs. Haskell extensions. Libraries: proof of concept vs. distributable. Source code regeneration. Real project

15 AFP0415 Programatica Project at OGI to build a Haskell system … … with integral support for verification at various levels: assertion, testing, proof etc. The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis … … freely available under BSD licence.

16 AFP0416 The Implementation of HaRe Information gathering Pre-condition checking Program transformation Program rendering

17 AFP0417 First steps … lifting and friends Use the Haddock parser … full Haskell given in 500 lines of data type definitions. Work by hand over the Haskell syntax: 27 cases for expressions … Code for finding free variables, for instance …

18 AFP0418 Finding free variables ‘by hand’ instance FreeVbls HsExp where freeVbls (HsVar v) = [v] freeVbls (HsApp f e) = freeVbls f ++ freeVbls e freeVbls (HsLambda ps e) = freeVbls e \\ concatMap paramNames ps freeVbls (HsCase exp cases) = freeVbls exp ++ concatMap freeVbls cases freeVbls (HsTuple _ es) = concatMap freeVbls es … etc.

19 AFP0419 This approach Boilerplate code … 1000 lines for 100 lines of significant code. Error prone: significant code lost in the noise. Want to generate the boiler plate and the tree traversals … … DriFT: Winstanley, Wallace … Strafunski: Lämmel and Visser

20 AFP0420 Strafunski Strafunski allows a user to write general (read generic), type safe, tree traversing programs, with ad hoc behaviour at particular points. Top-down / bottom up, type preserving / unifying, fullstopone

21 AFP0421 Strafunski in use Traverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, … Strafunski allows us to work within Haskell … Other options? Generic Haskell, Template Haskell, AG, …

22 AFP0422 Rename an identifier rename:: (Term t)=>PName->HsName->t->Maybe t rename oldName newName = applyTP worker where worker = full_tdTP (idTP ‘adhocTP‘ idSite) idSite :: PName -> Maybe PName idSite name orig) | v == oldName = return (PN newName orig) idSite pn = return pn

23 AFP0423 The coding effort Transformations: straightforward in Strafunski … … the chore is implementing conditions that the transformation preserves meaning. This is where much of our code lies.

24 AFP0424 Move f from module A to B Is f defined at the top-level of B ? Are the free variables in f accessible within module B ? Will the move require recursive modules? Remove the definition of f from module A. Add the definition to module B. Modify the import/export lists in module A, B and the client modules of A and B if necessary. Change uses of A.f to B.f or f in all affected modules. Resolve ambiguity.

25 AFP0425 The Implementation of HaRe Information gathering Pre-condition checking Program transformation Program rendering

26 AFP0426 Program rendering example -- This is an example module Main where sumSquares x y = sq x + sq y where sq :: Int->Int sq x = x ^ pow pow = 2 :: Int main = sumSquares Promote the definition of sq to top level

27 AFP0427 Program rendering example module Main where sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int sq :: Int->Int->Int sq pow x = x ^ pow main = sumSquares Using a pretty printer: comments lost and layout quite different.

28 AFP0428 Program rendering example -- This is an example module Main where sumSquares x y = sq x + sq y where sq :: Int->Int sq x = x ^ pow pow = 2 :: Int main = sumSquares Promote the definition of sq to top level

29 AFP0429 Program rendering example -- This is an example module Main where sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int sq :: Int->Int->Int sq pow x = x ^ pow main = sumSquares Layout and comments preserved.

30 AFP0430 Token stream and AST White space and comments in the token stream. Modification of the AST guides the modification of the token stream. After a refactoring, the program source is extracted from the token stream not the AST. Heuristics associate comments with program entities.

31 AFP0431 Production tool Programatica parser and type checker Refactor using a Strafunski engine Render code from the token stream and syntax tree.

32 AFP0432 Production tool (optimised) Programatica parser and type checker Refactor using a Strafunski engine Render code from the token stream and syntax tree. Pass lexical information to update the syntax tree and so avoid reparsing

33 AFP0433 What have we learned? Emerging Haskell libraries make it practical(?) Efficiency and robustness type checking large systems, linking, editor script languages (vim, emacs). Limitations of editor interactions. Reflections on Haskell itself.

34 AFP0434 Refactoring Refactoring comes in many forms micro refactoring as a part of program development, major refactoring as a preliminary to revision, dealing with legacy code, as a part of debugging, understanding, …

35 AFP0435 Reflections on Haskell Cannot hide items in an export list (cf import). Field names for prelude types? Scoped class instances not supported. ‘Ambiguity’ vs. name clash. ‘Tab’ is a nightmare! Correspondence principle fails …

36 AFP0436 Correspondence Operations on definitions and operations on expressions can be placed in one to one correspondence (R.D.Tennent, 1980)

37 AFP0437 Correspondence Definitions where f x y = e f x | g1 = e1 | g2 = e2 Expressions let \x y -> e f x = if g1 then e1 else if g2 … …

38 AFP0438 Function clauses f x | g1 = e1 f x | g2 = e2 Can ‘fall through’ a function clause … no direct correspondence in the expression language. f x = if g1 then e1 else if g2 … No clauses for anonymous functions … no reason to omit them.

39 AFP0439 Work in progress ‘Fold’ against definitions … find duplicate code. All, some or one? Effect on the interface … f x = … e … e … Traditional program transformations Short-cut fusion Warm fusion

40 AFP0440 Where next? Opening up to users: API or little language? Link with other IDEs (and front ends?). Detecting ‘bad smells’. More useful refactorings supported by us. Working without source code.

41 AFP0441 API Refactorings Refactoring utilities Strafunski Haskell

42 AFP0442 DSL Refactorings Refactoring utilities Strafunski Haskell Combining forms

43 AFP0443 Larger-scale examples More complex examples in the functional domain; often link with data types. Dawning realisation that can some refactorings are pretty powerful. Bidirectional … no right answer.

44 AFP0444 Algebraic or abstract type? data Tr a = Leaf a | Node a (Tr a) (Tr a) Tr Leaf Node flatten :: Tr a -> [a] flatten (Leaf x) = [x] flatten (Node s t) = flatten s ++ flatten t

45 AFP0445 Algebraic or abstract type? data Tr a = Leaf a | Node a (Tr a) (Tr a) isLeaf = … isNode = … … Tr isLeaf isNode leaf left right mkLeaf mkNode flatten :: Tr a -> [a] flatten t | isleaf t = [leaf t] | isNode t = flatten (left t) ++ flatten (right t)

46 AFP0446 Algebraic or abstract type?  Pattern matching syntax is more direct … … but can achieve a considerable amount with field names. Other reasons? Simplicity (due to other refactoring steps?).  Allows changes in the implementation type without affecting the client: e.g. might memoise Problematic with a primitive type as carrier. Allows an invariant to be preserved.

47 AFP0447 Outside or inside? data Tr a = Leaf a | Node a (Tr a) (Tr a) isLeaf = … isNode = … … Tr isLeaf isNode leaf left right mkLeaf mkNode flatten :: Tr a -> [a] flatten t | isleaf t = [leaf t] | isNode t = flatten (left t) ++ flatten (right t)

48 AFP0448 Outside or inside? data Tr a = Leaf a | Node a (Tr a) (Tr a) isLeaf = … isNode = … flatten t = … Tr isLeaf isNode leaf left right mkLeaf mkNode flatten

49 AFP0449 Outside or inside?  If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten. The more outside the better, therefore.  If inside can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type. Layered types possible: put the utilities in a privileged zone.

50 AFP0450 Memoise flatten :: Tr a->[a] data Tree a = Leaf { val::a } | Node { val::a, left,right::(Tree a) } leaf = Leaf node = Node flatten (Leaf x) = [x] flatten (Node x l r) = (x : (flatten l ++ flatten r)) data Tree a = Leaf { val::a, flatten:: [a] } | Node { val::a, left,right::(Tree a), flatten::[a] } leaf x = Leaf x [x] node x l r = Node x l r (x : (flatten l ++ flatten r))

51 AFP0451 Memoise flatten Invisible outside the implementation module, if tree type is already an ADT. Field names in Haskell make it particularly straightforward.

52 AFP0452 Data type or existential type? data Shape = Circle Float | = forall a. Sh a => Shape a Rect Float Float class Sh a where area :: Shape -> Float area :: a -> Float area (Circle f) = pi*r^2 perim :: a -> Float area (Rect h w) = h*w data Circle = Circle Float perim :: Shape -> Float perim (Circle f) = 2*pi*r instance Sh Circle perim (Rect h w) = 2*(h+w) area (Circle f) = pi*r^2 perim (Circle f) = 2*pi*r data Rect = Rect Float instance Sh Rect area (Rect h w) = h*w perim (Rect h w) = 2*(h+w)

53 AFP0453 Constructor or constructor? data Expr = Epsilon |.... | = Epsilon |.... | Then Expr Expr | Then Expr Expr | Star Expr Star Expr | Plus Expr plus e = Then e (Star e)

54 AFP0454 Monadification: expressions data Expr = Lit Integer | -- Literal integer value Vbl Var | -- Assignable variables Add Expr Expr | -- Expression addition: e1+e2 Assign Var Expr -- Assignment: x:=e type Var = String type Store = [ (Var, Integer) ] lookup :: Store -> Var -> Integer lookup st x = head [ i | (y,i) <- st, y==x ] update :: Store -> Var -> Integer -> Store update st x n = (x,n):st

55 AFP0455 Monadification: evaulation eval :: Expr -> evalST :: Expr -> Store -> (Integer, Store) State Store Integer eval (Lit n) st evalST (Lit n) = (n,st) = do return n eval (Vbl x) st evalST (Vbl x) = (lookup st x,st) = do st <- get return (lookup st x)

56 AFP0456 Monadification: evaulation 2 eval :: Expr -> evalST :: Expr -> Store -> (Integer, Store) State Store Integer eval (Add e1 e2) st evalST (Add e1 e2) = (v1+v2, st2) = do where v1 <- evalST e1 (v1,st1) = eval e1 st v2 <- evalST e2 (v2,st2) = eval e2 st1 return (v1+v2) eval (Assign x e) st evalST (Assign x e) = (v, update st' x v) = do where v <- evalST e (v,st') = eval e st st <- get put (update st x v) return v

57 AFP0457 Classes and instances Type Store = [Int] empty :: Store empty = [] get :: Var -> Store -> Int get v st = head [ i | (var,i) <- st, var==v] set :: Var -> Int -> Store -> Store set v i = ((v,i):)

58 AFP0458 Classes and instances Type Store = [Int] empty :: Store get :: Var -> Store -> Int set :: Var -> Int -> Store -> Store empty = [] get v st = head [ i | (var,i) <- st, var==v] set v i = ((v,i):)

59 AFP0459 Classes and instances class Store a where empty :: a get :: Var -> a -> Int set :: Var -> Int -> a -> a instance Store [Int] where empty = [] get v st = head [ i | (var,i) <- st, var==v] set v i = ((v,i):) Need newtype wrapper in Haskell 98 … end

60 AFP0460 Not just programming Paper or presentation moving sections about; amalgamate sections; move inline code to a figure; animation; … Proof introduce lemma; remove, amalgamate hypotheses, … Program the topic of the lecture

61 AFP0461 Evolving the evidence Dependable System Evolution is the software engineering grand challenge. Systems built with evidence of their dependability. But how to evolve the evidence with the system? Refactoring proofs, test coverage data etc.

62 AFP0462 Understanding a program Take a working semantic tableau system written by an anonymous 2nd year student … … refactor to understand its behaviour. Nine stages of unequal size. Reflections afterwards.

63 AFP0463 An example tableau  ((A  C)  ((A  B)  C))  ((A  B)  C) (A  C)  C AA  (A  B)  C   AB Make B True Make A and C False

64 AFP0464 v1: Name types Built-in types [Prop] [[Prop]] used for branches and tableaux respectively. Modify by adding type Branch = [Prop] type Tableau = [Branch] Change required throughout the program. Simple edit: but be aware of the order of substitutions: avoid type Branch = Branch

65 AFP0465 v2: Rename functions Existing names tableaux removeBranch remove become tableauMain removeDuplicateBranches removeBranchDuplicates and add comments clarifying the (intended) behaviour. Add test datum. Discovered some edits undone in stage 1. Use of the type checker to catch errors. test will be useful later?

66 AFP0466 v3: Literate  normal script Change from literate form: Comment … > tableauMain tab > =... to -- Comment … tableauMain tab =... Editing easier: implicit assumption was that it was a normal script. Could make the switch completely automatic?

67 AFP0467 v4: Modify function definitions From explicit recursion: displayBranch :: [Prop] -> String displayBranch [] = [] displayBranch (x:xs) = (show x) ++ "\n" ++ displayBranch xs to displayBranch :: Branch -> String displayBranch = concat. map (++"\n"). map show Abstraction: move from explicit list representation to operations such as map and concat which could be over any collection type. First time round added incorrect (but type correct) redefinition … only spotted at next stage. Version control: un/redo etc.

68 AFP0468 v5: Algorithms and types (1) removeBranchDup :: Branch -> Branch removeBranchDup [] = [] removeBranchDup (x:xs) | x == findProp x xs = [] ++ removeBranchDup xs | otherwise = [x] ++ removeBranchDup xs findProp :: Prop -> Branch -> Prop findProp z [] = FALSE findProp z (x:xs) | z == x = x | otherwise = findProp z xs

69 AFP0469 v5: Algorithms and types (2) removeBranchDup :: Branch -> Branch removeBranchDup [] = [] removeBranchDup (x:xs) | findProp x xs = [] ++ removeBranchDup xs | otherwise = [x] ++ removeBranchDup xs findProp :: Prop -> Branch -> Bool findProp z [] = False findProp z (x:xs) | z == x = True | otherwise = findProp z xs

70 AFP0470 v5: Algorithms and types (3) removeBranchDup :: Branch -> Branch removeBranchDup = nub findProp :: Prop -> Branch -> Bool findProp = elem

71 AFP0471 v5: Algorithms and types (4) removeBranchDup :: Branch -> Branch removeBranchDup = nub Fails the test ! Two duplicate branches output, with different ordering of elements. The algorithm used is the 'other' nub algorithm, nubVar : nub [1,2,0,2,1] = [1,2,0] nubVar [1,2,0,2,1] = [0,2,1] Code using lists in a particular order to represent sets.

72 AFP0472 v6: Library function to module Add the definition: nubVar = … to the module ListAux.hs and replace the definition by import ListAux Editing easier: implicit assumption was that it was a normal script. Could make the switch completely automatic?

73 AFP0473 v7: Housekeeping Remanings: including foo and bar and contra (becomes notContra ). An instance of filter, looseEmptyLists is defined using filter, and subsequently inlined. Put auxiliary function into a where clause. Generally cleans up the script for the next onslaught.

74 AFP0474 v8: Algorithm (1) splitNotNot :: Branch -> Tableau splitNotNot ps = combine (removeNotNot ps) (solveNotNot ps) removeNotNot :: Branch -> Branch removeNotNot [] = [] removeNotNot ((NOT (NOT _)):ps) = ps removeNotNot (p:ps) = p : removeNotNot ps solveNotNot :: Branch -> Tableau solveNotNot [] = [[]] solveNotNot ((NOT (NOT p)):_) = [[p]] solveNotNot (_:ps) = solveNotNot ps

75 AFP0475 v8: Algorithm (2) splitXXX removeXXX solveXXX for each of nine rules. The algorithm applies rules in a prescribed order, using an integer value to pass information between functions. Aim: generic versions of split remove solve Change order of rule application … effect on duplicates. Add map sort to top level pipeline before duplicate removal.

76 AFP0476 v9: Replace lists by sets. Wholesale replacement of lists by a Set library. mapmapSet foldrfoldSet (careful!) filterfilterSet The library exposes the representation: pick, flatten. Use with discretion … further refactoring possible. Library needed to be augmented with primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b

77 AFP0477 v9: Replace lists by sets (2) Drastic simplification: no explicit worries about … ordering (and equality), (removal of) duplicates. Hard to test intermediate stages: type change is all or nothing … … work with dummy definitions and the type checker. Further opportunities: why choose one rule from a set when could apply to all elements at once? Gets away from picking on one value (and breaking the set interface).

78 AFP0478 Conclusions of the case study Heterogeneous process: some small, some large. Are all these stages strictly refactorings: some semantic changes always necessary too? Importance of type checking for hand refactoring … … and testing when any semantic changes. Undo, redo, reordering the refactorings … CVS. In this case, directional … not always the case.

79 AFP0479 Teaching and learning design Exciting prospect of using a refactoring tool as an integral part of an elementary programming course. Learning a language: learn how you could modify the programs that you have written … … appreciate the design space, and … the features of the language.

80 AFP0480 Conclusions Refactoring + functional programming: good fit. Real benefit from using available libraries … with work. Want to use the tool in building itself. Much more to do than we have time for.


Download ppt "Refactoring Functional Programs Simon Thompson with Huiqing Li Claus Reinke www.cs.kent.ac.uk/projects/refactor-fp."

Similar presentations


Ads by Google