The Formalisation of Haskell Refactorings
Huiqing Li and Simon Thompson
Computing Lab, University of Kent
18/06/2015TFP
Outline
  Refactoring
  HaRe: The Haskell Refactorer
  Formalisation of Haskell Refactorings
  Formalisation of Generalise a Definition
  Conclusion and Future Work
Refactoring
  What? Changing the structure of existing code without changing its meaning.
  Where and why?
    Development, maintenance, …
    To make the code easier to understand and modify.
    To improve code reuse, quality and productivity.
  Essential part of the programming process.
HaRe – The Haskell Refactorer
  A tool for refactoring Haskell 98 programs.
  Full Haskell 98 coverage.
  Driving concerns: usability and extensibility.
  Implemented in Haskell, using Programatica's frontends and Strafunski's generic traversals.
  Integrated with two program editors: (X)Emacs and Vim.
  Preserves both the comments and the layout style of the source.
Refactorings Implemented in HaRe
  Structural Refactorings
  Module Refactorings
  Data-Oriented Refactorings
Refactorings Implemented in HaRe
  Structural Refactorings
    Generalise a definition

    Before:
      module Main (main) where
      f y = y : f (y + 1)
      main = print $ f 10

    After:
      module Main (main) where
      f z y = y : f z (y + z)
      main = print $ f 1 10
Refactorings Implemented in HaRe
  Structural Refactorings (cont.)
    Rename an identifier
    Promote/demote a definition to widen/narrow its scope
    Delete an unused function
    Duplicate a definition
    Unfold a definition
    Introduce a definition to name an identified expression
    Add an argument to a function
    Remove an unused argument from a function
Refactorings Implemented in HaRe
  Module Refactorings
    Move a definition from one module to another module

    Before:
      module Test (f) where
      f y = y : f (y + 1)

      module Main where
      import Test
      main = print $ f 10

    After:
      module Test ( ) where

      module Main where
      import Test
      f y = y : f (y + 1)
      main = print $ f 10
Refactorings Implemented in HaRe
  Module Refactorings (cont.)
    Clean the imports
    Make the used entities explicitly imported
    Add an item to the export list
    Remove an item from the export list
Refactorings Implemented in HaRe
  Data-Oriented Refactorings
    From concrete to abstract data type (ADT), a composite refactoring built from this sequence of primitive refactorings:
      Add field labels
      Add discriminators
      Add constructors
      Remove (nested) patterns
      Create ADT interface
Formalisation of Refactorings
  Advantages:
    Clarify the definition of refactorings in terms of side-conditions and transformations.
    Improve our confidence in the behaviour-preservation of refactorings.
    Guide the implementation of refactorings.
    Reduce the need for testing.
  Challenges:
    Haskell is a non-trivial language.
    Haskell does not have an officially defined semantics.
Formalisation of Refactorings
  Our Strategy:
    Start from a simple language ($\lambda_{letrec}$).
    Extend the language gradually to formalise more complex refactorings.
Formalisation of Refactorings
  The specification of a refactoring contains four parts:
    The representation of the program before the refactoring, say $P_1$.
    The side-conditions for the refactoring.
    The representation of the program after the refactoring, say $P_2$.
    A proof showing that $P_1$ and $P_2$ have the same functionality under the side-conditions.
Formalisation of Refactorings
  The $\lambda$-calculus with letrec ($\lambda_{letrec}$)
    Syntax of $\lambda_{letrec}$ terms:
      $E ::= x \mid \lambda x.E \mid E_1\, E_2 \mid \mathbf{letrec}\ D\ \mathbf{in}\ E$
      $D ::= \epsilon \mid x_i = E_i \mid D, D$
    Uses the call-by-name semantics developed by Zena M. Ariola and Stefan Blom in the paper "Lambda Calculi plus Letrec".
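The grammar above maps naturally onto an algebraic data type. The following is a minimal sketch, not HaRe's actual representation: the constructor names, the `pretty` function, and the encoding of `:`, `+`, `1` and `10` as the free variables `cons`, `plus`, `one` and `ten` are all assumptions made for illustration. An empty definition list plays the role of the empty alternative of D.

```haskell
module Main where

import Data.List (intercalate)

-- Abstract syntax mirroring E ::= x | \x.E | E1 E2 | letrec D in E.
-- Constructor names are illustrative and are not taken from HaRe.
data Expr
  = Var String                    -- x
  | Lam String Expr               -- \x. E
  | App Expr Expr                 -- E1 E2
  | Letrec [(String, Expr)] Expr  -- letrec x1 = E1, ..., xn = En in E
  deriving (Eq, Show)

-- Render a term in a fully parenthesised concrete syntax.
pretty :: Expr -> String
pretty (Var x)       = x
pretty (Lam x e)     = "(\\" ++ x ++ ". " ++ pretty e ++ ")"
pretty (App e1 e2)   = "(" ++ pretty e1 ++ " " ++ pretty e2 ++ ")"
pretty (Letrec ds e) =
  "letrec " ++ intercalate ", " [x ++ " = " ++ pretty d | (x, d) <- ds]
    ++ " in " ++ pretty e

-- The running example f y = y : f (y + 1) with body f 10.
-- cons, plus, one and ten are free variables (an encoding assumption).
example :: Expr
example =
  Letrec
    [ ( "f"
      , Lam "y"
          (App (App (Var "cons") (Var "y"))
               (App (Var "f") (App (App (Var "plus") (Var "y")) (Var "one"))))
      )
    ]
    (App (Var "f") (Var "ten"))

main :: IO ()
main = putStrLn (pretty example)
```

Treating constants as free variables keeps the sketch inside the pure calculus, at the cost of not modelling evaluation.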
Formalisation of Generalisation
  Recall the example

    Before:
      module Main (main) where
      f y = y : f (y + 1)
      main = print $ f 10

    After:
      module Main (main) where
      f z y = y : f z (y + z)
      main = print $ f 1 10
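As a quick sanity check of the example (a test on finite prefixes, not a proof), both versions can be placed side by side. The names `fBefore` and `fAfter` are introduced here only so that the two definitions can coexist in one module; the type signatures are also an assumption.

```haskell
module Main where

-- Before generalisation: the original definition from the slide.
fBefore :: Integer -> [Integer]
fBefore y = y : fBefore (y + 1)

-- After generalising over the sub-expression 1: the new parameter z
-- replaces it, and the recursive call passes z along unchanged.
fAfter :: Integer -> Integer -> [Integer]
fAfter z y = y : fAfter z (y + z)

main :: IO ()
main = do
  -- Both lists are infinite, so only finite prefixes are printed and compared.
  print (take 5 (fBefore 10))
  print (take 5 (fAfter 1 10))
  print (take 1000 (fBefore 10) == take 1000 (fAfter 1 10))
```

The original `main` prints an infinite list, which is why the sketch uses `take` instead.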
Formalisation of Generalisation
  Formal definition of Generalisation using $\lambda_{letrec}$
    Given the expression:
      $\mathbf{letrec}\ x_1 = E_1, \ldots, x_i = E_i, \ldots, x_n = E_n\ \mathbf{in}\ E_0$
    Assume $E$ is a sub-expression of $E_i$, and $E_i = C[E]$.
Formalisation of Generalisation
  Formal definition of Generalisation using $\lambda_{letrec}$
    The condition for generalising the definition $x_i = E_i$ on $E$ is:
      $x_i \notin FV(E) \land \forall x, e: (x \in FV(E) \land e \in sub(E_i, C) \Rightarrow x \in FV(e))$

    Example:
      module Main (main) where
      f y = y : f (y + 1)
      main = print $ f 10
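The side-condition can be sketched as an executable check. This sketch assumes that sub(E_i, C) denotes the sub-expressions of E_i that enclose the hole of C, which is a reading of the slide rather than something it states. The `Expr` type, the `Dir` path encoding of the context, and the names `freeVars`, `enclosing` and `canGeneralise` are all hypothetical, and constants are again encoded as free variables.

```haskell
module Main where

import qualified Data.Set as Set

data Expr
  = Var String
  | Lam String Expr
  | App Expr Expr
  | Letrec [(String, Expr)] Expr
  deriving (Eq, Show)

-- Free variables of a term.
freeVars :: Expr -> Set.Set String
freeVars (Var x)       = Set.singleton x
freeVars (Lam x e)     = Set.delete x (freeVars e)
freeVars (App e1 e2)   = freeVars e1 `Set.union` freeVars e2
freeVars (Letrec ds e) =
  Set.unions (freeVars e : map (freeVars . snd) ds)
    `Set.difference` Set.fromList (map fst ds)

-- One step down into a term; a list of steps locates the hole of C.
data Dir = InLam | InFun | InArg | InDef Int | InBody
  deriving (Eq, Show)

descend :: Dir -> Expr -> Maybe Expr
descend InLam  (Lam _ b)    = Just b
descend InFun  (App f _)    = Just f
descend InArg  (App _ a)    = Just a
descend InBody (Letrec _ b) = Just b
descend (InDef n) (Letrec ds _)
  | n >= 0 && n < length ds = Just (snd (ds !! n))
descend _ _ = Nothing

-- Sub-expressions enclosing the selected position, outermost first,
-- ending with the selected sub-expression E itself.
enclosing :: [Dir] -> Expr -> Maybe [Expr]
enclosing [] e = Just [e]
enclosing (d : ds) e = do
  child <- descend d e
  rest  <- enclosing ds child
  Just (e : rest)

-- x_i not in FV(E), and every free variable of E is free in every
-- enclosing sub-expression. Nothing means the path is invalid.
canGeneralise :: String -> Expr -> [Dir] -> Maybe Bool
canGeneralise xi ei path = do
  es <- enclosing path ei
  let fvE = freeVars (last es)
  Just (xi `Set.notMember` fvE && all (\e -> fvE `Set.isSubsetOf` freeVars e) es)

-- E_i for f y = y : f (y + 1).
fBody :: Expr
fBody =
  Lam "y"
    (App (App (Var "cons") (Var "y"))
         (App (Var "f") (App (App (Var "plus") (Var "y")) (Var "one"))))

main :: IO ()
main = do
  print (canGeneralise "f" fBody [InLam, InArg, InArg, InArg])  -- selects one
  print (canGeneralise "f" fBody [InLam, InArg, InArg])         -- selects plus y one
  print (canGeneralise "f" fBody [InLam, InArg])                -- selects f (plus y one)
```

Under this reading, selecting `one` is allowed, selecting `plus y one` is rejected because `y` is bound by the enclosing lambda, and selecting `f (plus y one)` is rejected because it mentions `f` itself.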
Formalisation of Generalisation
  Formal definition of Generalisation using $\lambda_{letrec}$
    After generalisation, the original expression becomes:
      $\mathbf{letrec}\ x_1 = E_1[x_i := x_i\, E], \ldots, x_i = \lambda z.C[z][x_i := x_i\, z], \ldots, x_n = E_n[x_i := x_i\, E]\ \mathbf{in}\ E_0[x_i := x_i\, E]$
    where $z$ is a fresh variable.

    Before:
      module Main (main) where
      f y = y : f (y + 1)
      main = print $ f 10

    After:
      module Main (main) where
      f z y = y : f z (y + z)
      main = print $ f 1 10
Formalisation of Generalisation
  Formal definition of Generalisation using $\lambda_{letrec}$
    Proof. Decompose the transformation into a number of sub-steps. If each sub-step is behaviour-preserving, then the whole transformation is behaviour-preserving.
Formalisation of Generalisation
  Step 1: Add the definition $x = \lambda z.C[z]$, where $x$ and $z$ are fresh variables, and $C[E] = E_i$.

      module Main (main) where
      f y = y : f (y + 1)
      x z y = y : f (y + z)
      main = print $ f 10

      $\mathbf{letrec}\ x_1 = E_1, \ldots, x_i = E_i, x = \lambda z.C[z], \ldots, x_n = E_n\ \mathbf{in}\ E_0$
Formalisation of Generalisation
  Step 2: Replace $E_i$ with $x\, E$. (Note: $E_i = x\, E$)

      module Main (main) where
      f y = x 1 y
      x z y = y : f (y + z)
      main = print $ f 10

      $\mathbf{letrec}\ x_1 = E_1, \ldots, x_i = x\, E, x = \lambda z.C[z], \ldots, x_n = E_n\ \mathbf{in}\ E_0$
Formalisation of Generalisation
  Step 3: Unfold $x_i$ in the right-hand side of $x$.

      module Main (main) where
      f y = x 1 y
      x z y = y : x 1 (y + z)
      main = print $ f 10

      $\mathbf{letrec}\ x_1 = E_1, \ldots, x_i = x\, E, x = \lambda z.C[z][x_i := x\, E], \ldots, x_n = E_n\ \mathbf{in}\ E_0$
Formalisation of Generalisation
  Step 4: In the definition of $x$, replace $E$ with $z$, and prove that this does not change the semantics of $x\, E$.

      module Main (main) where
      f y = x 1 y
      x z y = y : x z (y + z)
      main = print $ f 10

      $\mathbf{letrec}\ x_1 = E_1, \ldots, x_i = x\, E, x = \lambda z.C[z][x_i := x\, z], \ldots, x_n = E_n\ \mathbf{in}\ E_0$
Formalisation of Generalisation
  Step 5: Unfold the occurrences of $x_i$.

      module Main (main) where
      f y = x 1 y
      x z y = y : x z (y + z)
      main = print $ x 1 10

      $\mathbf{letrec}\ x_1 = E_1[x_i := x\, E], \ldots, x_i = x\, E, x = \lambda z.C[z][x_i := x\, z], \ldots, x_n = E_n[x_i := x\, E]\ \mathbf{in}\ E_0[x_i := x\, E]$
Formalisation of Generalisation
  Step 6: Remove the definition of $x_i$.

      module Main (main) where
      x z y = y : x z (y + z)
      main = print $ x 1 10

      $\mathbf{letrec}\ x_1 = E_1[x_i := x\, E], \ldots, x = \lambda z.C[z][x_i := x\, z], \ldots, x_n = E_n[x_i := x\, E]\ \mathbf{in}\ E_0[x_i := x\, E]$

  Step 7: Rename $x$ to $x_i$ and simplify the substitution.
Formalisation of Generalisation

      module Main (main) where
      f z y = y : f z (y + z)
      main = print $ f 1 10

      $\mathbf{letrec}\ x_1 = E_1[x_i := x\, E][x := x_i], \ldots, x = \lambda z.C[z][x_i := x\, z][x := x_i], \ldots, x_n = E_n[x_i := x\, E][x := x_i]\ \mathbf{in}\ E_0[x_i := x\, E][x := x_i]$

    which simplifies to:

      $\mathbf{letrec}\ x_1 = E_1[x_i := x_i\, E], \ldots, x_i = \lambda z.C[z][x_i := x_i\, z], \ldots, x_n = E_n[x_i := x_i\, E]\ \mathbf{in}\ E_0[x_i := x_i\, E]$
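The intermediate programs of the proof steps can be checked against each other on finite prefixes. This is a testing sketch under stated assumptions, not a substitute for the proof: the names `step0` to `step7`, `allSteps` and `agree` are introduced here for illustration, and each step's definitions are written as local `where` bindings so that all eight programs fit in one module.

```haskell
module Main where

step0, step1, step2, step3, step4, step5, step6, step7 :: [Integer]

-- Original program.
step0 = f 10
  where
    f y = y : f (y + 1)

-- Step 1: add x = \z. C[z]; x is not used yet.
step1 = f 10
  where
    f y = y : f (y + 1)
    x z y = y : f (y + z)

-- Step 2: replace the body of f with x E, where E is 1.
step2 = f 10
  where
    f y = x 1 y
    x z y = y : f (y + z)

-- Step 3: unfold f in the right-hand side of x.
step3 = f 10
  where
    f y = x 1 y
    x z y = y : x 1 (y + z)

-- Step 4: in the definition of x, replace E with z.
step4 = f 10
  where
    f y = x 1 y
    x z y = y : x z (y + z)

-- Step 5: unfold the remaining occurrences of f.
step5 = x 1 10
  where
    f y = x 1 y
    x z y = y : x z (y + z)

-- Step 6: remove the now unused definition of f.
step6 = x 1 10
  where
    x z y = y : x z (y + z)

-- Step 7: rename x to f.
step7 = f 1 10
  where
    f z y = y : f z (y + z)

allSteps :: [[Integer]]
allSteps = [step0, step1, step2, step3, step4, step5, step6, step7]

-- Do all steps agree with the original program on the first n elements?
agree :: Int -> Bool
agree n = all (\s -> take n s == take n step0) allSteps

main :: IO ()
main = do
  mapM_ (print . take 5) allSteps
  print (agree 100)
```

The unused bindings in `step1` and `step5` are deliberate, since those steps introduce or leave behind a definition that nothing refers to.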
Formalisation of Refactorings
  $\lambda_{letrec}$ has been extended to model the Haskell module system ($\lambda_M$).
  The "move a definition from one module to another" refactoring has also been formalised using $\lambda_M$.
Conclusion and Future Work
  Formalisation helps to clarify the side-conditions and transformation rules.
  It improves our confidence in the behaviour-preservation of refactorings.
  Future work:
    Extend the calculus to formalise more complex refactorings.
    Formalise the composition of refactorings.
Thank You