Refactoring Functional Programs
Claus Reinke, Simon Thompson
TCS Seminar, 1 October 2001


Overview
Refactoring: what does it mean?
The background in OO … and the functional context.
Simple examples.
Discussion.
Taxonomy.
More examples.
Case study: semantic tableau.

What is refactoring?
Improving the design of existing code … without changing its functionality.

When to refactor?
Prior to changing the functionality of a system.
refactor → modify

When to refactor?
After a first attempt: improving 'rubble code'.
build → refactor

What is refactoring?
There is no one correct design.
Development-time re-design.
Understanding the design of someone else's code.
Refactoring happens all the time … how best to support it?

FP + SE
Functional programming provides a different view of the programming process … however, it's not that different.
Software engineering ideas + functional programming: design; testing; metrics; refactoring.

Refactoring functional programs
Particularly suited: build a prototype, then revise, redesign, extend.
Semantic basis supports verified transformations.
Strong type system useful in error detection.
Testing should not be forgotten.

Genesis
Refactoring comes from the OO community … associated with Martin Fowler, Kent Beck (of extreme programming), Bill Opdyke, Don Roberts.
Loosely linked with the idea of design patterns.

Genesis (2)
'Refactoring comes from the OO community.'
In fact the SE community got there first … 'Program restructuring to aid software maintenance', PhD thesis, William Griswold.
The OO community added the name and support for OO features, and put it all into practice.

Design pattern
A stereotypical piece of design / implementation.
Often not embodied in a programming construct … might be in a library, or … more diffuse than that.

What refactoring is not
Changing functionality.
Transformational programming … in the sense of the Squiggol school, say.

Generalities
Changes are not limited to a single point or indeed a single module: diffuse and bureaucratic.
Many changes are bi-directional.
Tool support is very valuable.

Simple examples
Simple examples from the Haskell domain follow.
Idea: refactoring in practice would consist of a sequence of small, almost trivial, changes … come back to this.

Renaming
    f x y = …
A more specific name may be too specific if the function is a candidate for reuse.
    findMaxVolume x y = …
Makes the specific purpose of the function clearer.

Lifting / demoting
    f x y = … h …
      where
        h = …
Hides a function which is clearly subsidiary to f; clears up the namespace.
    f x y = … (h x y) …
    h x y = …
Makes h accessible to the other functions in the module (and beyond?).
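
As a rough illustration of lifting, the sketch below fills in the elided bodies with made-up arithmetic; the names fLocal, fLifted and hLifted are hypothetical and not from the slides. The point it shows is that variables captured by a local definition have to become explicit parameters once the definition moves to the top level.

```haskell
-- Before lifting: h is local to the function and captures x and y
-- from the enclosing scope.
fLocal :: Int -> Int -> Int
fLocal x y = h + 1
  where
    h = x * y

-- After lifting: the helper is a top-level function, so the captured
-- variables become explicit parameters and the call site passes them.
fLifted :: Int -> Int -> Int
fLifted x y = hLifted x y + 1

hLifted :: Int -> Int -> Int
hLifted x y = x * y
```

Demoting is the same step read right to left: once hLifted has no other callers, its parameters can be dropped and it can move back into a where clause.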

Naming a type
    f :: Int -> Char
    g :: Int -> Int
    …
Reuse supported (a synonym is transparent, but can be misleading).
    type Length = Int
    f :: Length -> Char
    g :: Int -> Length
Clearer specification of the purpose of f, g. (Morally) can only apply to lengths.

Opaque type naming
    f :: Int -> Char
    g :: Int -> Int
    …
Reuse supported.
    newtype Length = Length {length :: Int}
    f' :: Length -> Char
    g' :: Int -> Length
    f' = f . length
    g' = Length . g
Clearer specification of the purpose of f, g. Can only apply to lengths.
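
A minimal compilable sketch of the opaque version follows. The bodies of f and g are placeholders invented for the example, and the field is called getLength here (an assumption, not the slide's name) only to avoid a clash with the Prelude's length.

```haskell
-- An opaque wrapper: a Length cannot be confused with a plain Int.
newtype Length = Length { getLength :: Int }
  deriving (Show, Eq)

-- Placeholder definitions standing in for the original f and g.
f :: Int -> Char
f n = toEnum (fromEnum 'a' + n `mod` 26)

g :: Int -> Int
g = abs

-- The wrapped versions are thin adapters around the originals.
f' :: Length -> Char
f' = f . getLength

g' :: Int -> Length
g' = Length . g
```

With these definitions f' (g' 2) type checks, while f' 2 is rejected by the type checker, which is the extra safety the opaque type buys over a synonym.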

The scope of changes
    f :: Int -> Char    becomes    f :: Length -> Char
    g :: Int -> Int     becomes    g :: Int -> Length
Need to modify … the calls to f … the callers of g.

Bureaucracy
How to support these changes?
Editor plus type checker.
The rôle of testing in the OO context.
In the functional context it is much easier to argue that verification can be used.

Machine support
Different levels possible: show all call sites, all points at which a particular type is used … change at all these sites.
Integration with existing tools (vi, etc.).

More examples
More complex examples in the functional domain; often linked with data types.
Three case studies:
shapes: from algebraic to existential types;
a collection of modules for regular expressions, NFAs and DFAs: heavy use of sets;
a collection of student implementations of propositional semantic tableaux.

Algebraic or abstract type?
Algebraic:
    data Tr a = Leaf a | Node (Tr a) (Tr a)
    flatten :: Tr a -> [a]
    flatten (Leaf x)   = [x]
    flatten (Node s t) = flatten s ++ flatten t
Abstract, with interface isLeaf, isNode, leaf, left, right, mkLeaf, mkNode:
    flatten :: Tr a -> [a]
    flatten t
      | isLeaf t = [leaf t]
      | isNode t = flatten (left t) ++ flatten (right t)

Algebraic or abstract type?
For the algebraic type: pattern matching syntax is more direct … but can achieve a considerable amount with field names. Other reasons?
For the abstract type: allows changes in the implementation type without affecting the client, e.g. might memoise values of a function within the representation type (itself another refactoring…). Allows an invariant to be preserved.
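
The abstract interface can be sketched as below, assuming that Node carries only its two subtrees (which is what the selectors left and right suggest) and that the selectors fail with error on the wrong shape; both are assumptions, since the slides list only the names.

```haskell
-- Representation type; a real module would export Tr without its
-- constructors, e.g. module Tree (Tr, isLeaf, isNode, leaf, left,
-- right, mkLeaf, mkNode) where
data Tr a = Leaf a | Node (Tr a) (Tr a)

isLeaf, isNode :: Tr a -> Bool
isLeaf (Leaf _) = True
isLeaf _        = False
isNode = not . isLeaf

leaf :: Tr a -> a
leaf (Leaf x) = x
leaf _        = error "leaf: not a leaf"

left, right :: Tr a -> Tr a
left (Node l _)  = l
left _           = error "left: not a node"
right (Node _ r) = r
right _          = error "right: not a node"

mkLeaf :: a -> Tr a
mkLeaf = Leaf

mkNode :: Tr a -> Tr a -> Tr a
mkNode = Node

-- Client code written only against the interface.
flatten :: Tr a -> [a]
flatten t
  | isLeaf t  = [leaf t]
  | otherwise = flatten (left t) ++ flatten (right t)
```

Because flatten never mentions Leaf or Node, the representation can change without touching it.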

Migrate functionality
Interface before: isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
    depth :: Tr a -> Int
    depth t
      | isLeaf t = 1
      | isNode t = 1 + max (depth (left t)) (depth (right t))
Interface after: isLeaf, isNode, leaf, left, right, mkLeaf, mkNode, depth

Migrate functionality
For keeping depth outside: if the type is reimplemented, need to reimplement everything in the signature, including depth. The smaller the signature the better, therefore.
For moving depth inside: can modify the implementation to memoise values of depth, or to give a more efficient implementation using the concrete type.
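
One way the memoising implementation might look once depth is part of the signature is sketched here. Caching the depth in an extra Int field of Node is an assumption made for the example; the slides do not show the representation.

```haskell
-- Representation with a cached depth in each Node.
data Tr a = Leaf a | Node Int (Tr a) (Tr a)

mkLeaf :: a -> Tr a
mkLeaf = Leaf

-- The smart constructor maintains the invariant that the Int field
-- always equals the depth of the node.
mkNode :: Tr a -> Tr a -> Tr a
mkNode l r = Node (1 + max (depth l) (depth r)) l r

-- depth is now a constant-time lookup instead of a traversal.
depth :: Tr a -> Int
depth (Leaf _)     = 1
depth (Node d _ _) = d
```

Clients that build trees only through mkLeaf and mkNode cannot break the invariant, which is the kind of guarantee an abstract type allows.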

Algebraic or existential type?
Algebraic:
    data Shape = Circle Float | Rect Float Float …
    area :: Shape -> Float
    area (Circle r) = pi*r^2
    area (Rect h w) = h*w
    perim :: Shape -> Float
    …
Existential:
    data Shape = forall a. Sh a => Shape a
    class Sh a where
      area  :: a -> Float
      perim :: a -> Float
    data Circle = Circle Float
    instance Sh Circle where
      area  (Circle r) = pi*r^2
      perim (Circle r) = 2*pi*r

Algebraic or existential?
For the algebraic type: pattern matching is available. Possible to deal with binary methods: how to deal with == on Shape as an existential type?
For the existential type: can add new sorts of Shape, e.g. Triangle, without modifying existing working code. Functions are distributed across the different Sh types.
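
The existential version can be made runnable with GHC's ExistentialQuantification extension, as in the sketch below. The Rect and Triangle instances and the helpers shapeArea and totalArea are additions for illustration, not part of the slides.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Sh a where
  area  :: a -> Float
  perim :: a -> Float

data Circle = Circle Float
data Rect   = Rect Float Float

instance Sh Circle where
  area  (Circle r) = pi * r ^ 2
  perim (Circle r) = 2 * pi * r

instance Sh Rect where
  area  (Rect h w) = h * w
  perim (Rect h w) = 2 * (h + w)

-- Any type with an Sh instance can be wrapped as a Shape.
data Shape = forall a. Sh a => Shape a

-- A new sort of shape: no existing definition has to change.
data Triangle = Triangle Float Float Float

instance Sh Triangle where
  area (Triangle a b c) = sqrt (s * (s - a) * (s - b) * (s - c))
    where s = (a + b + c) / 2          -- Heron's formula
  perim (Triangle a b c) = a + b + c

-- Functions over Shape can only use the class methods.
shapeArea :: Shape -> Float
shapeArea (Shape s) = area s

totalArea :: [Shape] -> Float
totalArea = sum . map shapeArea
```

The cost noted on the slide shows up here too: there is no way to pattern match on a Shape to find out which sort it is, so an equality test on Shape needs extra machinery.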

Replace function by constructor
    data Expr = Star Expr | Then Expr Expr | …
    plus e = Then e (Star e)
plus is just syntactic sugar; reduces the number of cases in definitions. [Character range is another, more pertinent, example.]
    data Expr = Star Expr | Plus Expr | Then Expr Expr | …
Can treat Plus differently, e.g.
    literals (Plus e) = literals e
but requires each function over Expr to have a Plus clause.
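
Both designs can sit side by side in a small sketch. The Lit constructor and the exact definition of literals are assumptions, since the slide elides the remaining constructors and shows only the Plus clause.

```haskell
import Data.List (nub)

-- Regular expressions; Lit is a hypothetical constructor standing in
-- for the cases elided on the slide.
data Expr
  = Lit Char
  | Star Expr
  | Plus Expr
  | Then Expr Expr
  deriving (Show, Eq)

-- The function version: plus is sugar for "e followed by e*".
plus :: Expr -> Expr
plus e = Then e (Star e)

-- With a Plus constructor every function over Expr needs a Plus
-- clause, but that clause can be more direct than the desugared form.
literals :: Expr -> [Char]
literals (Lit c)      = [c]
literals (Star e)     = literals e
literals (Plus e)     = literals e
literals (Then e1 e2) = nub (literals e1 ++ literals e2)
```

Here literals (Plus e) inspects e once, whereas literals (plus e) traverses e twice and then removes the duplicates.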

Set comprehensions (Haskell-specific)
    makeSet [ f x y | x <- flatten xs, y <- flatten ys, p x y ]
The notation looks more like {f x y | x<-xs, y<-ys, p x y}; the monadic notation has a different connotation.
    do x <- xs
       y <- ys
       guard (p x y)
       return (f x y)
Doesn't require the abstraction to be broken by flatten :: Set a -> [a]. It might not be possible to define a flatten function.
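
The correspondence between the two notations can be checked on ordinary lists, which are used here as a stand-in because the slides' Set library is not shown. The functions pairsComp and pairsDo, and the choice of x + y and x < y for f and p, are made up for the example.

```haskell
import Control.Monad (guard)

-- Comprehension form: generators, then a filter, then the result.
pairsComp :: [Int] -> [Int] -> [Int]
pairsComp xs ys = [ x + y | x <- xs, y <- ys, x < y ]

-- Monadic form: the same generators and filter, written with do.
pairsDo :: [Int] -> [Int] -> [Int]
pairsDo xs ys = do
  x <- xs
  y <- ys
  guard (x < y)
  return (x + y)
```

For a set type, the do form would need the set to support the monadic operations itself, instead of converting to a list with flatten and back with makeSet.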

Other examples...
Modify the return type of a function from T to Maybe T, Either T T' or [T].
Would be nice to have field names in Prelude types.
Add an argument; (un)group arguments; reorder arguments.
Move to monadic presentation.
Flat or layered datatypes (Expr: add BinOp type).
Various possibilities for error handling/exceptions.
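
As one instance of the first item, the sketch below changes a return type from T to Maybe T. The function firstEven and its variants are hypothetical; the slides name the refactoring but give no code for it.

```haskell
import Data.Maybe (fromMaybe)

-- Before: a partial function that fails with a run-time error.
firstEven :: [Int] -> Int
firstEven xs = case filter even xs of
  (y : _) -> y
  []      -> error "firstEven: no even element"

-- After: the failure case is visible in the return type.
firstEvenM :: [Int] -> Maybe Int
firstEvenM xs = case filter even xs of
  (y : _) -> Just y
  []      -> Nothing

-- Each caller must now decide what to do on failure, for example by
-- supplying a default.
firstEvenOr :: Int -> [Int] -> Int
firstEvenOr def = fromMaybe def . firstEvenM
```

As with the Length example, the change is not local: every call site of the original function has to be revisited.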

What now?
Grant application: catalogue, tools, case studies.
Online catalogue started.
Develop taxonomy.
A 'live' example?

A classification scheme
name (a phrase)
label (a word)
left-hand code
right-hand code
comments
    l to r
    r to l
    general
primitive / composed
cross-references
    internal
    external (Fowler)
category (just one) or … classifiers (keywords)
language
    specific (Haskell, ML etc.)
    feature (lazy etc.)
conditions
    left / right
    analysis required (e.g. names, types, semantic info.)
    which equivalence?
version info
    date added
    revision number

Case study: semantic tableaux
Take a working semantic tableau system written by an anonymous 2nd year student … refactor as a way of understanding its behaviour.
Nine stages of unequal size.
Reflections afterwards.

An example tableau
[Figure: a semantic tableau over the propositions A, B and C; the logical connectives were lost in extraction. Annotations on the tableau: "Make B True" and "Make A and C False".]

v1: Name types
Built-in types [Prop] and [[Prop]] are used for branches and tableaux respectively.
Modify by adding
    type Branch  = [Prop]
    type Tableau = [Branch]
Change required throughout the program.
Simple edit, but be aware of the order of substitutions: avoid
    type Branch = Branch

v2: Rename functions
Existing names
    tableaux, removeBranch, remove
become
    tableauMain, removeDuplicateBranches, removeBranchDuplicates
and add comments clarifying the (intended) behaviour.
Add test datum.
Discovered some edits undone in stage 1.
Use of the type checker to catch errors.
test will be useful later?

v3: Literate → normal script
Change from literate form:
    Comment …
    > tableauMain tab
    >   = ...
to
    -- Comment …
    tableauMain tab
      = ...
Editing easier: implicit assumption was that it was a normal script.
Could make the switch completely automatic?

v4: Modify function definitions
From explicit recursion:
    displayBranch :: [Prop] -> String
    displayBranch []     = []
    displayBranch (x:xs) = (show x) ++ "\n" ++ displayBranch xs
to
    displayBranch :: Branch -> String
    displayBranch = concat . map (++ "\n") . map show
More abstract … move somewhat away from the list representation to operations such as map and concat which could appear in the interface to any collection type.
First time round added an incorrect (but type correct) redefinition … only spotted at the next stage.
Version control: undo, redo, merge, … ?
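
The two definitions can be compared directly once a Prop type is available. The Prop below is a hypothetical two-constructor stand-in, since the student's actual type is not shown; the function names with suffixes are also invented so that both versions can coexist.

```haskell
-- A stand-in proposition type, assumed only for this example.
data Prop = Var String | NOT Prop
  deriving (Show, Eq)

type Branch = [Prop]

-- Explicit recursion, as in the original script.
displayBranchRec :: Branch -> String
displayBranchRec []       = []
displayBranchRec (x : xs) = show x ++ "\n" ++ displayBranchRec xs

-- Pipeline of library functions, as in the refactored script.
displayBranchPipe :: Branch -> String
displayBranchPipe = concat . map (++ "\n") . map show

-- The Prelude's unlines appends a newline to each string, so it
-- expresses the same pipeline more briefly.
displayBranchUnlines :: Branch -> String
displayBranchUnlines = unlines . map show
```

A check such as comparing the three functions on a sample branch is the sort of test that would have caught the incorrect but type-correct redefinition mentioned on the slide.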

v5: Algorithms and types (1)
    removeBranchDup :: Branch -> Branch
    removeBranchDup [] = []
    removeBranchDup (x:xs)
      | x == findProp x xs = [] ++ removeBranchDup xs
      | otherwise          = [x] ++ removeBranchDup xs

    findProp :: Prop -> Branch -> Prop
    findProp z [] = FALSE
    findProp z (x:xs)
      | z == x    = x
      | otherwise = findProp z xs

v5: Algorithms and types (2)
    removeBranchDup :: Branch -> Branch
    removeBranchDup [] = []
    removeBranchDup (x:xs)
      | findProp x xs = [] ++ removeBranchDup xs
      | otherwise     = [x] ++ removeBranchDup xs

    findProp :: Prop -> Branch -> Bool
    findProp z [] = False
    findProp z (x:xs)
      | z == x    = True
      | otherwise = findProp z xs

v5: Algorithms and types (3)
    removeBranchDup :: Branch -> Branch
    removeBranchDup = nub

    findProp :: Prop -> Branch -> Bool
    findProp = elem

v5: Algorithms and types (4)
    removeBranchDup :: Branch -> Branch
    removeBranchDup = nub
Fails the test! Two duplicate branches output, with different ordering of elements.
The algorithm used is the 'other' nub algorithm, nubVar:
    nub    [1,2,0,2,1] = [1,2,0]
    nubVar [1,2,0,2,1] = [0,2,1]
The code is dependent on using lists in a particular order to represent sets.
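
A definition of nubVar consistent with the example above keeps the last occurrence of each element, where nub keeps the first. The code below is a reconstruction from the v5 (2) version of removeBranchDup, not the definition from the slides, which is elided.

```haskell
import Data.List (nub)

-- Drop an element whenever it occurs again later in the list, so the
-- last occurrence of each element survives.
nubVar :: Eq a => [a] -> [a]
nubVar [] = []
nubVar (x : xs)
  | x `elem` xs = nubVar xs
  | otherwise   = x : nubVar xs

-- The two functions on the slide's example list.
firstKept, lastKept :: [Int]
firstKept = nub    [1, 2, 0, 2, 1]   -- [1,2,0]
lastKept  = nubVar [1, 2, 0, 2, 1]   -- [0,2,1]
```

Both results contain the same elements, so they are equal as sets but not as lists, which is why replacing one by the other changed the output of code that compares branches as lists.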

v6: Library function to module
Add the definition
    nubVar = …
to the module ListAux.hs and replace the definition by
    import ListAux
Editing easier: implicit assumption was that it was a normal script.
Could make the switch completely automatic?

v7: Housekeeping
Renamings: including foo and bar and contra (becomes notContra).
An instance of filter, looseEmptyLists, is defined using filter, and subsequently inlined.
Put auxiliary function into a where clause.
Generally cleans up the script for the next onslaught.

v8: Algorithm (1)
    splitNotNot :: Branch -> Tableau
    splitNotNot ps = combine (removeNotNot ps) (solveNotNot ps)

    removeNotNot :: Branch -> Branch
    removeNotNot []                   = []
    removeNotNot ((NOT (NOT _)) : ps) = ps
    removeNotNot (p : ps)             = p : removeNotNot ps

    solveNotNot :: Branch -> Tableau
    solveNotNot []                  = [[]]
    solveNotNot ((NOT (NOT p)) : _) = [[p]]
    solveNotNot (_ : ps)            = solveNotNot ps

v8: Algorithm (2)
splitXXX, removeXXX, solveXXX are present for each of nine rules.
The algorithm applies rules in a prescribed order, using an integer value to pass information between functions.
Aim: generic versions of split, remove, solve.
Have to change the order of rule application … which has a further effect on duplicates.
Add map sort to the top-level pipeline prior to duplicate removal.

v9: Replace lists by sets
Wholesale replacement of lists by a Set library:
    map    → mapSet
    foldr  → foldSet (careful!)
    filter → filterSet
The library exposes the representation: pick, flatten. Use with discretion … further refactoring possible.
Library needed to be augmented with
    primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
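
The slides give only the type of primRecSet. One plausible reading, sketched here over a hypothetical list-backed Set, is primitive recursion: the step function receives an element, the rest of the set, and the result for the rest. The representation, makeSet, sizeSet and pairs are all assumptions made for this example.

```haskell
import Data.List (nub, sort)

-- A list-backed set; invariant: sorted, with no duplicates.
newtype Set a = Set [a]
  deriving (Show, Eq)

makeSet :: Ord a => [a] -> Set a
makeSet = Set . nub . sort

-- Primitive recursion over a set: unlike a fold, the step function
-- also sees the remainder of the set.
primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
primRecSet _ z (Set [])       = z
primRecSet f z (Set (x : xs)) = f x (Set xs) (primRecSet f z (Set xs))

-- A use that ignores the remainder behaves like a fold.
sizeSet :: Set a -> Int
sizeSet = primRecSet (\_ _ n -> n + 1) 0

-- A use that needs the remainder: each element paired with every
-- element that comes after it in the representation.
pairs :: Set a -> [(a, a)]
pairs = primRecSet (\x (Set rest) acc -> [ (x, y) | y <- rest ] ++ acc) []
```

A function like pairs cannot be written with a plain fold over the elements without rebuilding the remainder by hand, which may be why the library had to be extended.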

v9: Replace lists by sets (2)
Drastic simplification: no need for explicit worries about … ordering and its effect on equality, … (removal of) duplicates.
Difficult to test whilst in intermediate stages: the change in a type is all or nothing … work with dummy definitions and the type checker.
Further opportunities: … why choose one rule from a set when could apply to all elements at once? Gets away from picking on one value (and breaking the set interface).

Conclusions of the case study
Heterogeneous process: some stages small, some large.
Are all these stages strictly refactorings? Are some semantic changes always necessary too?
Importance of type checking for hand refactoring … and of testing when there are any semantic changes.
Undo, redo, reordering the refactorings … CVS.
In this case, directional … not always the case.

What next?
Put the catalogue into the full taxonomic form.
Continue taxonomy: look at larger case studies etc.
Towards a tool design.