# Lazy Evaluation aka deferred execution aka call-by-need.



## What should / does this do?

```haskell
f n = [ 2*n, g n ]  where g x =
----------------------------------
head (f 10)
```

## Lazy evaluation

- Allows (conceptually) infinite structures.
- Simplifies our concerns and thinking.
- Can sidestep landmines in the code.

```haskell
twos = [ 2, 2 .. ]   -- infinite list of 2s
ones = 1 : ones      -- circular defn

sum (take 10 twos ++ take 10 ones)
zip [1..] names      -- more useful!
```
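A minimal runnable version of the snippets above. The slide does not define `names`, so a small illustrative list is supplied here:

```haskell
twos :: [Int]
twos = [2, 2 ..]       -- infinite: the step is 0, so the list never ends

ones :: [Int]
ones = 1 : ones        -- circular definition, also infinite

names :: [String]
names = ["Ann", "Bob", "Cy"]   -- illustrative data, not from the slide

main :: IO ()
main = do
  print (sum (take 10 twos ++ take 10 ones))  -- 30
  print (zip [1 ..] names)                    -- [(1,"Ann"),(2,"Bob"),(3,"Cy")]
```

Taking finite prefixes with `take` is what makes it safe to touch these infinite values at all.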

## Create an infinite list of small rands

`rands` is recursive, with no base case! Because of laziness, going down the recursion is deferred. This random generator will not satisfy the purists!

```haskell
rands seed = let r = (17*seed + 1) `mod` 1024
             in  r : rands r     -- use r as next seed

take 5 (rands 2)            -- [35,596,917,230,839]
sort (take 100 (rands 42))  -- [21,56,73,83,93,110,130,...
```

## The Sieve of Eratosthenes: prime finder

```haskell
sieve :: [Int] -> [Int]
sieve (p:ps) = p : sieve [n | n <- ps, n `mod` p /= 0]

primes = sieve [2..]

hundredthPrime = head (drop 99 primes)
```

## Searching state spaces...

- Many problems start in an initial state.
- We terminate successfully if the current state is the goal state; otherwise we generate successor states and queue them for later inspection.
- We may need to avoid going round in circles.
- The (possibly infinite) set of reachable states is called the state space.

## Laziness is a decoupling mechanism!

Interleaving the logic for generating states with the logic for searching them is much messier than independently generating all possible states and separately consuming them until we've found as many solutions as we want. So a lazy stream between the producer and the consumer of your states is an excellent way to architect your software.

Standard producer/consumer solutions recommend separate threads, and a decoupling queue or "pipe" which requires locks and synchronization, etc. to achieve this "separation of concerns". Laziness does it for us, almost for free! (Performance may suffer, though.)

```haskell
solutions = [ s | s <- allStates, isGoal s ]
```

## Generating all states, even for an infinite state space...

By putting new children at the back of the pending queue (`ss`), we generate the possible states in breadth-first order. The first solution will always be one of the shortest. Depth-first state generation may need additional logic for cycle avoidance (i.e. in a maze with cycles, or on a Rubik's cube, you're going to need to keep track of states already visited).

```haskell
allStates = gen [initState]     -- set up queue
  where gen []     = []
        gen (h:ss) = h : gen (ss ++ children h)
```
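To see the skeleton in action, here is a tiny runnable instance. The `children` function below is a made-up example, not from the slides: each state is a binary string, and its children append one more bit, so `allStates` enumerates all binary strings shortest-first.

```haskell
-- Hypothetical toy instance of the lazy BFS generator.
children :: String -> [String]
children s = [s ++ "0", s ++ "1"]

allStates :: [String]
allStates = gen [""]            -- the initial state is the empty string
  where gen []     = []
        gen (h:ss) = h : gen (ss ++ children h)

main :: IO ()
main = print (take 7 allStates)   -- ["","0","1","00","01","10","11"]
```

Although `allStates` is infinite, laziness means only as much of it is built as the consumer demands.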

## Fun(ctional) feature: C#, Python, etc. coroutines

```csharp
IEnumerable<int> intsFrom(int n)
{   // Generate infinite lazy stream starting at n.
    while (true)
    {
        yield return n;   // Build closure, return n.
        n++;              // Reactivation resumes here.
    }
}

int search()
{   // Search for suitable product.
    foreach (int i in intsFrom(1))
    {
        if (isSolution(i)) return i;
    }
    // Unreachable for an infinite stream, but the compiler requires it.
    throw new InvalidOperationException("No solution found.");
}
```

Look Ma! It's a separate producer and consumer, with no buffers, no synchronization locks, no threads! Wow! So easy!

## Laziness can be counter-intuitive

The sort is O(n²), but...

```haskell
ins a []     = [a]
ins a (b:bs) | a <= b = a : b : bs
             | True   = b : ins a bs

isort []     = []
isort (e:es) = ins e (isort es)

min1 xs = head (isort xs)
```

`min1` runs in O(n) -- linear -- time. Explain why.

```haskell
ins a []     = [a]
ins a (b:bs) | a <= b = a : b : bs
             | True   = b : ins a bs

isort []     = []
isort (e:es) = ins e (isort es)
```

How can that be? The expensive part of the algorithm gets deferred, and we never have to do the work if we only ever ask for the head of the result! So the performance of `ins` and `isort` now depends on the context in which they are used! Ouch!

## Getting caught again...

The "root" of the expression tree (which drives the reductions / computation) is different in these two cases. `foldl` recurses all the way to the empty list before it can return a value, i.e. `foldl` cannot process infinite lists, and even if `f` can short-circuit (as `||` and `&&` can), it gets no chance to do its sidestepping magic to terminate the list traversal. In the `foldr` case, however, `f` is at the root of the tree, and it controls what computation happens. So if `f` does not need its right argument, the computation terminates immediately.

```haskell
foldl f z []     = z
foldl f z (x:xs) = foldl f (f z x) xs

foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)
```

## Oops!

```haskell
anyL pred xs = foldl (||) False (map pred xs)
anyR pred xs = foldr (||) False (map pred xs)
```

```
anyR even [1..10]    =>  True  (108 reductions)
anyR even [1..1000]  =>  True  (108 reductions)
anyR even [1..]      =>  True  (82 reductions)

anyL even [1..10]    =>  True  (260 reductions)
anyL even [1..1000]  =>  True  (17090 reductions)
anyL even [1..]      =>  ERROR - C stack overflow
```
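The `foldr` version really does terminate on an infinite list, because `(||)` never needs its right argument once the left one is `True`. A minimal runnable check (the predicate argument is renamed to `p` to avoid shadowing the Prelude's `pred`):

```haskell
anyR :: (a -> Bool) -> [a] -> Bool
anyR p xs = foldr (||) False (map p xs)

main :: IO ()
main = print (anyR even [1..])   -- terminates and prints True
```

The search stops at the first even number (2); the rest of `[1..]` is never built.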

## Laziness is often inefficient!

We have to create closures to "remember where and how to resume the computation at some later time". What complicates this is the state of the environment, especially the bindings of non-local identifiers: these non-local values (and how they are bound to the identifiers) must be available when the computation is reactivated. Suspending a computation, then having to reactivate it, is costly in space and time.

## Laziness is a frequent source of space leaks

http://encyclopedia2.thefreedictionary.com/space+leak

> space leak - A data structure which grows bigger, or lives longer, than might be expected. Such unexpected memory use can cause a program to require more garbage collections or to run out of heap. Space leaks in functional programs usually result from excessive laziness.
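The classic example is a lazy left fold: the accumulator becomes a long chain of unevaluated thunks before a single addition happens. A small sketch (the names `leaky` and `fine` are my own, for illustration):

```haskell
import Data.List (foldl')

-- Lazy foldl builds 100000 pending (+) thunks before evaluating any
-- of them -- a textbook space leak.
leaky :: Int
leaky = foldl (+) 0 [1 .. 100000]

-- foldl' forces the accumulator at each step, so it runs in
-- constant space.
fine :: Int
fine = foldl' (+) 0 [1 .. 100000]

main :: IO ()
main = print (leaky, fine)   -- both are 5000050000
```

Both produce the same answer; only the memory behaviour differs, which is exactly what makes such leaks hard to spot.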

## Strictness analysis

A function `f` is strict in an argument `x` if calling `f` will always require `x` to be evaluated. Then it pays to have the caller eagerly evaluate the expression bound to `x` before calling `f`, rather than lazily defer it as a closure. There has been lots of research into discovering strictness automatically, to improve performance. But it is a tough analysis: with recursive, higher-order functions it is theoretically undecidable to always tell whether an argument `x` will definitely be used by `f`.
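When the compiler cannot discover strictness, the programmer can force evaluation by hand with `seq`. A small sketch (the function names are illustrative, not from the slides):

```haskell
-- The lazy version defers every addition: the accumulator grows
-- into a chain of thunks.
sumToLazy :: Int -> Int -> Int
sumToLazy acc 0 = acc
sumToLazy acc n = sumToLazy (acc + n) (n - 1)

-- The strict version forces the accumulator at every step with seq,
-- so each addition happens immediately.
sumToStrict :: Int -> Int -> Int
sumToStrict acc 0 = acc
sumToStrict acc n = let acc' = acc + n
                    in acc' `seq` sumToStrict acc' (n - 1)

main :: IO ()
main = print (sumToStrict 0 100)   -- 5050
```

`seq` (and its applicative cousin `$!`) are the manual escape hatch; the annotations on the next slide move the same idea into data declarations.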

## Strictness annotations...

The programmer provides annotations to indicate where the compiler should produce eager rather than lazy code. In Haskell's GHC compiler it can make a huge difference to mark fields in data declarations or function arguments as strict.

```haskell
-- The ! marks tell GHC not to defer Edge evaluation
-- when constructing a new State value.
data State = W !Edge !Edge !Edge !Edge !Edge !Edge !Edge
             deriving (Show, Eq)
```

## Why is laziness used?

Theoretically, it is the best strategy: if an expression can be reduced to a result, a lazy evaluator is guaranteed to find that result, while eager evaluators might crash doing unnecessary work on some landmine! The mental abstraction is great: we generate infinite lists of all primes, we decouple our producers and consumers, and the code is cleaner!

## My own opinions

- Laziness tends to lead one into trouble: debugging and tracing, fighting space leaks, understanding performance, and struggling to annotate the code so that it works. It took more than 10 hours to get my dictionary generator for Rubik's world working in Haskell, vs. 1-2 hours in C#.
- F# is inherently eager: you have to ask for laziness. Compare Haskell, which is lazy, with annotations to ask for eagerness. I'm liking F#'s choice more and more.
- Laziness is powerful as a mental tool: you need to be able to think and design with that extra mental abstraction and the decoupling it brings. In practice, it is less often useful.

## A state search problem...

Four students have just one torch with only 17 minutes of battery life remaining. They need to cross a rickety bridge which can only hold two people at a time. It is dark, so crossing the bridge requires the torch. Each student needs a different amount of time to cross the bridge: 1, 2, 5 and 10 minutes respectively. Plan a sequence that will get them all across the bridge safely. (This does have a real solution; it is not a trick, like waiting for daybreak!)
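One way to attack the puzzle is with the lazy BFS skeleton and `solutions` stream from the earlier slides. The state representation and helper names below are my own choices, not from the slides: a state records who is on each bank, where the torch is, and the minutes used so far.

```haskell
import Data.List (sort, tails, (\\))

type State = ([Int], [Int], Bool, Int)   -- (left, right, torchOnLeft, time)

-- Groups of one or two people who could carry the torch across.
groups :: [Int] -> [[Int]]
groups xs = [[a] | a <- xs] ++ [[a, b] | (a:rest) <- tails xs, b <- rest]

-- A crossing takes as long as the slowest member of the group.
children :: State -> [State]
children (l, r, True,  t) =
  [ (sort (l \\ g), sort (r ++ g), False, t + maximum g) | g <- groups l ]
children (l, r, False, t) =
  [ (sort (l ++ g), sort (r \\ g), True,  t + maximum g) | g <- groups r ]

isGoal :: State -> Bool
isGoal (l, _, _, t) = null l && t <= 17

-- The lazy breadth-first generator from the earlier slide.
allStates :: [State]
allStates = gen [([1, 2, 5, 10], [], True, 0)]
  where gen []     = []
        gen (h:ss) = h : gen (ss ++ children h)

solutions :: [State]
solutions = [ s | s <- allStates, isGoal s ]

main :: IO ()
main = print (head solutions)   -- an everyone-across state in 17 minutes
```

Because `allStates` is produced breadth-first, the first solution found uses the fewest crossings; the producer and consumer stay completely decoupled, just as the slides advertise.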

## Summary

- Laziness allows us to work with conceptually infinite structures and computations.
- By postponing dangerous computations it finds results whenever they are available, so lazy evaluation is theoretically preferable.
- It is a powerful mental tool.
- But laziness can be counter-intuitive! Space leaks, performance hits, thinkos of note! Countered by strictness analysis and annotations.
- We're seeing coroutine and closure mechanisms in most mainstream languages now.