Introduction to Compilation of Functional Languages
Wanhe Zhang, Computing and Software Department, McMaster University, March 16, 2004
Functional Programs
Functional programming is based on the idea that a program is a function with one input parameter, its input, and one result, its output. The differences between functional and imperative languages lie mainly in efficiency considerations and readability. A factorial function shows the contrast.
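The slide's own factorial listing is not part of the transcript. As a rough sketch of the contrast, a factorial in Haskell might be written once as a direct recursive definition and once in a loop-like style with an explicit accumulator, which is closer to what an imperative program does. The names fac and facLoop are chosen here for illustration only.

```haskell
-- Functional style: the definition reads like the mathematical one.
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)

-- Loop-like style: an explicit counter and accumulator, mimicking
-- the variables an imperative version would update in place.
facLoop :: Integer -> Integer
facLoop n = go n 1
  where
    go 0 acc = acc
    go k acc = go (k - 1) (k * acc)
```

Both definitions assume a non-negative argument.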
Compilation of Functional Languages
A short tour of Haskell; compiling functional languages; polymorphic type checking; desugaring.
Tour of Haskell
Function application syntax: f 11 13
1) No brackets around the arguments, which allows currying to be expressed naturally.
2) Function application binds more strongly than any operator: g n+1 is (g n) + 1 rather than g (n+1).
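A small sketch of both points. The bodies of f and g are arbitrary choices made for this example; only the application syntax matters.

```haskell
f :: Int -> Int -> Int
f x y = x * y

g :: Int -> Int
g n = 2 * n

-- Application binds more strongly than +, so this is (g 3) + 1.
applied :: Int
applied = g 3 + 1

-- Parentheses are needed to pass 3 + 1 as the argument.
grouped :: Int
grouped = g (3 + 1)

-- Currying: f applied to one argument is itself a function.
f11 :: Int -> Int
f11 = f 11
```

Here f11 13 means the same as f 11 13.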
Tour of Haskell
Offside rule; lists; list comprehension; pattern matching; polymorphic typing; referential transparency; higher-order functions; lazy evaluation.
Offside rule
    divide x 0 = inf
    divide x y = x/y
An equation consists of a left-hand side, followed by the = token, followed by the right-hand side. There is no explicit token to denote the end of each equation, and treating every line break as a terminator would be inconvenient. Instead, the offside rule controls the bounding box of an expression.
Offside rule
Everything below and to the right of the = token is defined to be part of the expression making up the right-hand side. The right-hand side terminates before the first token that is 'offside', that is, to the left of the = position.
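A minimal sketch of a right-hand side that spans two lines. It assumes Double arguments and uses 1 / 0 (IEEE infinity) in place of the slide's inf. GHC states its layout rule in terms of block indentation rather than the = position, but this example is valid under both readings.

```haskell
divide :: Double -> Double -> Double
divide _ 0 = 1 / 0
divide x y = x
             / y   -- to the right of '=', so still part of this equation

-- A token back in the first column is offside and starts a new definition.
half :: Double -> Double
half x = divide x 2
```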
Lists
The polymorphic typing of Haskell does not allow a list to contain elements of different types.
    [1,2,3,4] = (1:(2:(3:(4:[]))))
    ["red", "yellow"]
    [1..10]
List Comprehension
A syntax that closely matches set notation:
    s = [n^2 | n <- [1..100], odd n]
A list comprehension generates a list rather than a set: ordering is important and elements may occur multiple times. It is convenient for generating new lists from old ones.
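A short sketch of the difference from sets. The second definition, lastDigits, is a made-up example added here to show that generator order and duplicates are preserved.

```haskell
-- Squares of the odd numbers from 1 to 100, as on the slide.
s :: [Integer]
s = [n ^ 2 | n <- [1 .. 100], odd n]

-- Unlike a set, the result keeps the generator's order and any duplicates.
lastDigits :: [Integer]
lastDigits = [n `mod` 10 | n <- [9, 19, 4, 9]]
```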
List Comprehension
    qsort []     = []
    qsort (x:xs) = qsort [y | y <- xs, y < x] ++ [x] ++ qsort [y | y <- xs, y >= x]
Pattern Matching
    fac 0 = 1
    fac n = n * fac (n-1)
Function equations are matched from top to bottom; the patterns in them are matched from left to right. Pattern matching can be translated easily into equivalent definitions based on if-then-else constructs:
    fac n = if (n == 0) then 1 else n * fac (n-1)
Polymorphic Typing
An expression is said to be polymorphic if it 'has many types'. The empty list [] has many types: list of characters, list of numbers, and an infinite number of others. The main advantage of polymorphic typing is that functions and data structures can be reused for any desired type instance. Type checking is discussed later.
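As an illustration of reuse across type instances, here is a sketch of one polymorphic function used at two element types. The name len is chosen to avoid clashing with the Prelude's length.

```haskell
-- One definition, usable at every element type a.
len :: [a] -> Int
len []       = 0
len (_ : xs) = 1 + len xs

-- Used at the instance [Char] -> Int.
lenChars :: Int
lenChars = len "abc"

-- Used at the instance [Int] -> Int.
lenNums :: Int
lenNums = len [1, 2, 3, 4 :: Int]
```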
Referential Transparency
There is a fixed relation between inputs and output: f arg will produce the same output no matter what the overall state of the computation is. In imperative languages, assignments to global variables and through pointers may cause two calls of f arg to yield different results. The advantage is that referential transparency simplifies program analysis and transformation. The disadvantage is that it prevents the programmer from writing space-efficient programs that use in-place updates:
    add_one []     = []
    add_one (x:xs) = x+1 : add_one xs
In an imperative language, we can update the input list in place.
Higher-order Functions
A higher-order function is a function that takes a function as an argument or delivers one as a result. Imperative languages barely support higher-order functions: functions may perhaps be passed as parameters, but new functions cannot be created. There are two ways to create new functions:
1) Define the new function locally and return it:
    diff f = f_
      where
        f_ x = (f (x + h) - f x) / h
        h    = 0.0001
diff returns as its result a 'new' function that is composed out of already existing functions.
2) Currying: apply an existing function to a number of arguments that is less than the arity of the function:
    diff f x = (f (x + h) - f x) / h
      where
        h = 0.0001
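A sketch of both variants in use. The names diff1, diff2 and square are introduced here so that the two versions can coexist in one file; the bodies follow the slide.

```haskell
-- Variant 1: diff explicitly returns the nested function f_.
diff1 :: (Double -> Double) -> (Double -> Double)
diff1 f = f_
  where
    f_ x = (f (x + h) - f x) / h
    h    = 0.0001

-- Variant 2: diff takes two arguments; supplying only the first
-- one (currying) yields the derivative function.
diff2 :: (Double -> Double) -> Double -> Double
diff2 f x = (f (x + h) - f x) / h
  where
    h = 0.0001

square :: Double -> Double
square x = x * x

-- Both are numerical approximations of the derivative of square.
dSquare1, dSquare2 :: Double -> Double
dSquare1 = diff1 square
dSquare2 = diff2 square
```

At x = 3 both approximations should be close to 6, up to the error introduced by the finite step h.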
Lazy Evaluation
Lazy evaluation relaxes the usual constraints on evaluation order by specifying that a subexpression will only be evaluated when its value is needed for the progress of the computation.
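Two small sketches of what "only when needed" means in practice. The names first, safe and firstSquares are illustrative.

```haskell
-- The second argument is never needed, so it is never evaluated.
first :: a -> b -> a
first x _ = x

safe :: Int
safe = first 42 (error "never evaluated")

-- Only the first five elements of the infinite list are ever computed.
firstSquares :: [Integer]
firstSquares = take 5 [n * n | n <- [1 ..]]
```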
Compiling Functional Languages
[Figure: which compiler phase handles which aspect of Haskell]
The Functional Core
The functional core must be high-level enough to serve as an easy target for the front end, which compiles the syntactic sugar away. It must also be low-level enough to allow concise descriptions of optimizations, which are often expressed as a case analysis of all core constructs.
Functional Core of Haskell
Basic data types, including int, char, and bool; (user-defined) structured data types; typed non-nesting functions; local bindings as part of let-expressions; expressions consisting of identifiers, arithmetic operators, if-then-else compounds, and function applications; higher-order functions (cannot be mapped onto C); lazy evaluation semantics (cannot be mapped onto C).
Polymorphic Type checking
We illustrate this by an example:
    map f []       = []
    map f (x : xs) = f x : map f xs
First equation: map :: a -> [b] -> [c]
Second equation: map :: (b -> c) -> [b] -> [c]
In the second equation, x is an element of the list of type [b] and f x is part of map's result list of type [c], so the type of f is b -> c.
Polymorphic Function Application
    map :: (a -> b) -> [a] -> [b]
    length :: [c] -> Int
For the application map length, the type checker must unify the type of length, which is [c] -> Int, with the type of map's first argument, a -> b. This gives a = [c] and b = Int, so:
    map :: ([c] -> Int) -> [[c]] -> [Int]
    map length :: [[c]] -> [Int]
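The unification step can be sketched with a very small unifier. This is an illustration under simplifying assumptions, not how any particular Haskell compiler implements type checking. The Type representation, the association-list substitution, and all names (Type, Subst, fn, list, int, apply, unify, mapLength) are hypothetical.

```haskell
-- Types are either variables or constructor applications.
data Type = TVar String | TCon String [Type]
  deriving (Eq, Show)

-- A substitution maps type variables to types.
type Subst = [(String, Type)]

fn :: Type -> Type -> Type
fn a b = TCon "->" [a, b]

list :: Type -> Type
list a = TCon "[]" [a]

int :: Type
int = TCon "Int" []

-- Replace every bound variable in a type by its binding.
apply :: Subst -> Type -> Type
apply s t@(TVar v)  = maybe t (apply s) (lookup v s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

occurs :: String -> Type -> Bool
occurs v (TVar w)    = v == w
occurs v (TCon _ ts) = any (occurs v) ts

-- Extend substitution s so that both types become equal, if possible.
unify :: Type -> Type -> Subst -> Maybe Subst
unify t1 t2 s = go (apply s t1) (apply s t2)
  where
    go (TVar a) (TVar b) | a == b = Just s
    go (TVar a) t = bind a t
    go t (TVar a) = bind a t
    go (TCon c ts) (TCon d us)
      | c == d && length ts == length us = unifyAll (zip ts us) s
      | otherwise                        = Nothing
    bind a t
      | occurs a t = Nothing            -- would build an infinite type
      | otherwise  = Just ((a, t) : s)

unifyAll :: [(Type, Type)] -> Subst -> Maybe Subst
unifyAll []              s = Just s
unifyAll ((t, u) : rest) s = unify t u s >>= unifyAll rest

-- Unify map's parameter type (a -> b) with length's type ([c] -> Int),
-- then instantiate the rest of map's type, [a] -> [b].
mapLength :: Maybe Type
mapLength = do
  s <- unify (fn (TVar "a") (TVar "b")) (fn (list (TVar "c")) int) []
  return (apply s (fn (list (TVar "a")) (list (TVar "b"))))
```

Under these definitions mapLength evaluates to the encoding of [[c]] -> [Int], matching the type derived on the slide.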
Desugaring
Desugaring transforms a Haskell program into its functional-core equivalent. We will focus on translating lists, pattern matching, list comprehension, and nested functions to core constructs.
The Translation of Lists
Lists involve three forms of syntactic sugar: the colon (:), the comma (,) and the range (..) notations. The operator : constructs a node with three fields: a type tag Cons, an element, and a list.
    x : xs  is transformed to  (Cons x xs)
    [1,2]   is transformed to  (Cons 1 (Cons 2 []))
Ranges such as [1..] are usually translated to calls of library functions that express these lists in terms of : and [].
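A sketch of the desugared representation using a user-defined type. Nil plays the role of [] here, and enumFromTo' is a hypothetical stand-in for the library function that a range would be translated to. Neither name comes from the slides.

```haskell
-- A list node: a constructor tag (Nil or Cons) plus, for Cons,
-- an element and the rest of the list.
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- x : xs   becomes  Cons x xs
-- [1, 2]   becomes  Cons 1 (Cons 2 Nil)
oneTwo :: List Int
oneTwo = Cons 1 (Cons 2 Nil)

-- [lo .. hi] becomes a call to a library function built from Cons and Nil.
enumFromTo' :: Int -> Int -> List Int
enumFromTo' lo hi
  | lo > hi   = Nil
  | otherwise = Cons lo (enumFromTo' (lo + 1) hi)

-- Convert back to a built-in list, only to inspect results.
toBuiltin :: List a -> [a]
toBuiltin Nil         = []
toBuiltin (Cons x xs) = x : toBuiltin xs
```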
The Translation of Pattern Matching
A constant yields an equality test. A variable imposes no constraint at all. Constructor patterns require additional support to provide type information at run time: we must be able to verify that an argument matches the constructor specified in the pattern.
Constructor Patterns
The run-time system provides the _type_constr function, which returns the constructor tag of an arbitrary structured-type element. We must also be able to reference the fields of the constructor type; for this the run-time system provides the generic _type_field n function, which returns the nth field of any structured type. The following example illustrates both:
Constructor Patterns
    take 0 xs     = []
    take n []     = []
    take n (x:xs) = x : take (n-1) xs
Optimization
The code has already been type-checked at compile time, so any second argument in a call of take is guaranteed to be a list. The last equation therefore need not verify that the argument matches the constructor pattern, and the error guard can be omitted too.
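The translated code itself is not in the transcript. The sketch below shows what the translation of take and its optimized form might look like. typeConstr, typeField1 and typeField2 are hypothetical Haskell stand-ins for the run-time helpers _type_constr and _type_field n, and takeCore and takeCoreOpt are names invented here to avoid clashing with the Prelude's take.

```haskell
-- Stand-ins for the run-time helpers _type_constr and _type_field.
data Tag = NilTag | ConsTag
  deriving (Eq, Show)

typeConstr :: [a] -> Tag
typeConstr []      = NilTag
typeConstr (_ : _) = ConsTag

-- _type_field 1 and _type_field 2 of a Cons node; split into two
-- functions here because the fields have different types.
typeField1 :: [a] -> a
typeField1 (x : _) = x
typeField1 []      = error "typeField1: not a Cons node"

typeField2 :: [a] -> [a]
typeField2 (_ : xs) = xs
typeField2 []       = error "typeField2: not a Cons node"

-- Direct translation: one test per equation, tried from top to bottom,
-- followed by an error guard for the case that no equation matches.
takeCore :: Int -> [a] -> [a]
takeCore a1 a2 =
  if a1 == 0 then []
  else if typeConstr a2 == NilTag then []
  else if typeConstr a2 == ConsTag then
    let n  = a1
        x  = typeField1 a2
        xs = typeField2 a2
    in  x : takeCore (n - 1) xs
  else error "take: no matching equation"

-- Optimized: a well-typed list that is not Nil must be a Cons node,
-- so the last constructor test and the error guard are dropped.
takeCoreOpt :: Int -> [a] -> [a]
takeCoreOpt a1 a2 =
  if a1 == 0 then []
  else if typeConstr a2 == NilTag then []
  else typeField1 a2 : takeCoreOpt (a1 - 1) (typeField2 a2)
```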
The Translation of List Comprehension
    [expression | qualifier, ..., qualifier]
A qualifier is either a generator or a filter. A generator is of the form var <- list expression; it introduces a variable that iterates over a list. A filter is a Boolean expression, which constrains the variables generated by earlier qualifiers.
The Translation of List Comprehension
The transformation works by processing the qualifiers from left to right, one at a time. This approach naturally leads to a recursive scheme of three transformation rules.
The Translation of List Comprehension
Transformation rule (1) covers the base case, where no more qualifiers are present in the list comprehension. The filter qualifier is handled in transformation rule (2), where F stands for the filter and Q stands for the remaining sequence of qualifiers. The generator qualifier e <- L is covered in rule (3). The generator produces zero or more elements e drawn from a list L. We must generate code to iterate over all elements e, compute the remainder Q of the list comprehension for each value of e, and concatenate the (possibly empty) result lists into a single list. The key idea for rule (3) is a nested function that takes an element e and produces the list of values that Q can assume for e. This function is then called for all the elements in L.
The Translation of List Comprehension
We need to call a function for all elements in a list and concatenate the elements of the resulting lists into one list. The map function does not work:
    map f []       = []
    map f (x : xs) = f x : map f xs
It simply collects the results of the function applications in a list, and would yield a list of lists in this case. A modified map does the job:
    mappend :: (a -> [b]) -> [a] -> [b]
    mappend f []     = []
    mappend f (x:xs) = f x ++ mappend f xs
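A sketch of the scheme applied by hand to a one-generator, one-filter comprehension. The rule numbers in the comments follow the description of rules (1) to (3). The helper is named mappendList because current versions of the Prelude already export a different function called mappend; apart from the name it is the slide's mappend, and it behaves like the Prelude's concatMap.

```haskell
mappendList :: (a -> [b]) -> [a] -> [b]
mappendList _ []       = []
mappendList f (x : xs) = f x ++ mappendList f xs

-- Source form.
oddSquares :: [Integer]
oddSquares = [n ^ 2 | n <- [1 .. 10], odd n]

-- Generator n <- [1 .. 10]: iterate with a nested function (rule 3).
-- Filter odd n: an if-then-else that yields [] on failure (rule 2).
-- No qualifiers left: the singleton list [n ^ 2] (rule 1).
oddSquaresCore :: [Integer]
oddSquaresCore = mappendList f [1 .. 10]
  where
    f n = if odd n then [n ^ 2] else []
```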
The Translation of List Comprehension
The following example illustrates the scheme:
    pyth n = [(a, b, c) | a <- [1 .. n], b <- [a .. n], c <- [b .. n], a^2 + b^2 == c^2]
Transformation of pyth:
    pyth n = mappend f_bc2 [1 .. n]
      where
        f_bc2 a = mappend f_c2 [a .. n]
          where
            f_c2 b = mappend f_2 [b .. n]
              where
                f_2 c = if (a^2 + b^2 == c^2) then [(a, b, c)] else []
The Translation of Nested Functions
Most target languages of functional compilers don't support nested routines, so the functional core excludes nested functions. Using lexical pointers to activation records in combination with higher-order functions and lazy evaluation causes dynamic scope violations, since a call to a nested function may escape its lexical scope at run time, rendering its lexical pointer invalid. For example, a nested function can be returned as the result of a higher-order function; lazy evaluation can delay the execution of a call to the nested function until after the caller has returned its value, which contains a reference to the suspended call.
The Translation of Nested Functions
Example: sv_mul defines the multiplication of a scalar and a vector.
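The slide's listing is not part of the transcript. A definition consistent with the names used in the rest of the deck (sv_mul, s_mul, scal, vec) would be the following sketch. The Int element type is an assumption made to keep the example concrete.

```haskell
-- Multiply every element of the vector vec by the scalar scal.
sv_mul :: Int -> [Int] -> [Int]
sv_mul scal vec = map s_mul vec
  where
    -- Nested function: refers to scal from the enclosing scope.
    s_mul x = scal * x
```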
Analysis of the Example
sv_mul calls map to apply the nested function s_mul to each element in the vector list. At run time, the interpreted code for sv_mul returns a graph holding the unevaluated expression map s_mul vec. If we return the routine value map s_mul vec, the activation record of sv_mul will be removed before the nested function s_mul is ever applied.
The Translation of Nested Functions
The functional core supports currying (partial parameterization). Translating a nested routine f to a global routine is just a matter of extending it with additional parameters p1, p2, ..., pa that capture the out-of-scope pointers; each use of the nested function f must be replaced with a curried call: f p1 ... pa
The Translation of Nested Functions
Lift the nested s_mul into a global function sv_mul_dot_s_mul. Extend the function heading with an additional parameter named scal, capturing the pointer to the scal parameter of the outer sv_mul function. All calls of s_mul are replaced by the expression sv_mul_dot_s_mul scal.
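A sketch of the lifted result, assuming the nested s_mul simply multiplies its argument by scal and that the elements are Int values.

```haskell
-- Lifted to the top level: scal is now an explicit first parameter.
sv_mul_dot_s_mul :: Int -> Int -> Int
sv_mul_dot_s_mul scal x = scal * x

-- The nested s_mul is replaced by the curried call sv_mul_dot_s_mul scal.
sv_mul :: Int -> [Int] -> [Int]
sv_mul scal vec = map (sv_mul_dot_s_mul scal) vec
```

The partial application sv_mul_dot_s_mul scal carries the value of scal with it, so it no longer depends on the activation record of sv_mul.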
Conclusion
Short tour of Haskell; general concept of a compiler for functional programs; type checking; desugaring (the most important part).
Questions? ☺