Testing and Debugging (Debugging in Haskell). Complements the previous sections. Original authorized by: John Hughes, http://www.cs.chalmers.se/~rjmh/. Adapted by: Claudio Cesar de Sá

What's the Difference?
Testing means trying out a program which is believed to work, in an effort to gain confidence that it is correct. If no errors are revealed by thorough testing, then, probably, relatively few errors remain.
Debugging means observing a program which is known not to work, in an effort to localise the error. When a bug is found and fixed by debugging, testing can be resumed to see if the program now works.
This lecture describes recently developed tools to help with each activity.

Debugging
Here's a program with a bug:

    median xs = isort xs !! (length xs `div` 2)

    isort = foldr insert []

    insert x [] = [x]
    insert x (y:ys) | x<y  = x:y:ys
                    | x>=y = y:x:ys

    Median> median [8,4,6,10,2,7,3,5,9,1]
    2
    Median> isort [8,4,6,10,2,7,3,5,9,1]
    [1,8,4,6,10,2,7,3,5,9]

A test reveals that median doesn't work. We start by trying the functions median calls: isort doesn't work either.

Debugging Tools
The Manual Approach: we choose cases to try, and manually explore the behaviour of the program, by calling functions with various (hopefully revealing) arguments, and inspecting their outputs.
The Automated Approach: we connect a debugger to the program, which lets us observe internal values, giving us more information to help us diagnose the bug.

The Haskell Object Observation Debugger
Provides a function

    observe :: String -> a -> a

which collects observations of its second argument, tagged with the String, and returns the argument unchanged. Think of it as connecting an oscilloscope to the program: the program's behaviour is unchanged, but we see more. You need to import the library which defines observe in order to use it. Add

    import Observe

at the start of your program.
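For readers without the Observe library at hand, the idea of a tagged probe that returns its argument unchanged can be approximated with Debug.Trace from base. This is only a rough stand-in, not the observe function itself: the name probe is hypothetical, it needs a Show instance, and because show forces the whole value it can change how much of the program is evaluated, which observe is designed to avoid.

```haskell
import Debug.Trace (trace)

-- Hypothetical stand-in for observe (not the library function):
-- prints "tag: value" to stderr when the result is first demanded,
-- then returns the value unchanged.
probe :: Show a => String -> a -> a
probe tag x = trace (tag ++ ": " ++ show x) x

main :: IO ()
main = print (sum [probe "n*n" (n * n) | n <- [1 .. 4 :: Int]])
```

Because probe shows the complete value, applying it to an infinite list would never finish, whereas observe only records the parts that the program actually uses.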

    Listas> :l listas_haskell.hs
    Reading file "listas_haskell.hs":
    Reading file "/usr/share/hugs/lib/exts/Observe.lhs":
    Reading file "listas_haskell.hs":

    Hugs session for:
    /usr/share/hugs/lib/Prelude.hs
    /usr/share/hugs/lib/exts/Observe.lhs
    listas_haskell.hs
    Listas>

Making sure that Observe.lhs was loaded...

What Do Observations Look Like?
We add a "probe" to the program:

    Median> sum [observe "n*n" (n*n) | n <- [1..4]]
    30

    >>>>>>> Observations <<<<<<
    n*n
      1
      4
      9
      16

The values observed are displayed, titled with the name of the observation.

Observing a List

    Median> sum (observe "squares" [n*n | n <- [1..4]])
    30

    >>>>>>> Observations <<<<<<
    squares
      (1 : 4 : 9 : 16 : [])

Now there is just one observation, the list itself. Observing the entire list also lets us see the order of the values. Lists are always observed in "cons" form.

Observing a Pipeline

    Median> (sum . observe "squares" . map (\x->x*x)) [1..4]
    30

    >>>>>>> Observations <<<<<<
    squares
      (1 : 4 : 9 : 16 : [])

We can add observers to "pipelines" (long compositions of functions) to see the values flowing between them.

Observing Counting Occurrences

    countOccurrences = map (\ws -> (head ws, length ws))
                     . observe "after groupby"
                     . groupBy (==)
                     . observe "after sort"
                     . sort
                     . observe "after words"
                     . words

Add observations after each stage.

Observing Counting Occurrences

    Main> countOccurrences "hello clouds hello sky"
    [("clouds",1),("hello",2),("sky",1)]

    >>>>>>> Observations <<<<<<
    after groupby
      (("clouds" : []) : ("hello" : "hello" : []) : ("sky" : []) : [])
    after sort
      ("clouds" : "hello" : "hello" : "sky" : [])
    after words
      ("hello" : "clouds" : "hello" : "sky" : [])
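For reference, the pipeline itself can be run without any observers. The sketch below assumes only base; it uses group from Data.List, which is the same as groupBy (==), and adds a type signature that the slide leaves out.

```haskell
import Data.List (group, sort)

-- Count how often each word occurs, in alphabetical order of the words.
countOccurrences :: String -> [(String, Int)]
countOccurrences =
  map (\ws -> (head ws, length ws)) -- one (word, count) pair per group
    . group                         -- collect adjacent equal words
    . sort                          -- bring equal words together
    . words                         -- split the input on whitespace

main :: IO ()
main = print (countOccurrences "hello clouds hello sky")
```

The use of head is safe here because group never produces an empty group.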

Observing Consumers
An observation tells us not only what value flowed past the observer, it also tells us how that value was used!

    Main> take 3 (observe "xs" [1..10])
    [1,2,3]

    >>>>>>> Observations <<<<<<
    xs
      (1 : 2 : 3 : _)

The _ is a "don't care" value: certainly some list appeared here, but it was never used!

Observing Length

    Main> length (observe "xs" (words "hello clouds"))
    2

    >>>>>>> Observations <<<<<<
    xs
      (_ : _ : [])

The length function did not need to inspect the values of the elements, so they were not observed!

Observing Functions
We can even observe functions themselves!

    Main> observe "sum" sum [1..5]
    15

    >>>>>>> Observations <<<<<<
    sum
      { \ (1 : 2 : 3 : 4 : 5 : []) -> 15
      }

Here observe "sum" sum is a function, which is applied to [1..5]. We see arguments and results for the calls which were actually made!

Observing foldr
Recall that:

    foldr (+) 0 [1..4] = 1 + (2 + (3 + (4 + 0)))

Let's check this by observing the addition function.

    Main> foldr (observe "+" (+)) 0 [1..4]
    10

    >>>>>>> Observations <<<<<<
    +
      { \ 4 0 -> 4
      , \ 3 4 -> 7
      , \ 2 7 -> 9
      , \ 1 9 -> 10
      }

Observing foldl
We can do the same thing to observe foldl, which behaves as

    foldl (+) 0 [1..4] = (((0 + 1) + 2) + 3) + 4

    Main> foldl (observe "+" (+)) 0 [1..4]
    10

    >>>>>>> Observations <<<<<<
    +
      { \ 0 1 -> 1
      , \ 1 2 -> 3
      , \ 3 3 -> 6
      , \ 6 4 -> 10
      }
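The two call sequences can also be reproduced in plain Haskell by making the combining function record its own calls. This is a sketch of the same idea without the observation library; the names Call, foldrCalls and foldlCalls are made up for the illustration.

```haskell
-- One entry per call to (+): (left argument, right argument, result).
type Call = (Int, Int, Int)

-- foldr combines from the right, so the first call made is 4 + 0.
foldrCalls :: [Int] -> (Int, [Call])
foldrCalls = foldr step (0, [])
  where
    step x (acc, calls) = (x + acc, calls ++ [(x, acc, x + acc)])

-- foldl combines from the left, so the first call made is 0 + 1.
foldlCalls :: [Int] -> (Int, [Call])
foldlCalls = foldl step (0, [])
  where
    step (acc, calls) x = (acc + x, calls ++ [(acc, x, acc + x)])

main :: IO ()
main = do
  print (foldrCalls [1 .. 4])
  print (foldlCalls [1 .. 4])
```

Unlike observe, this approach changes the type of the fold's result, which is exactly the kind of intrusive edit the observation library lets us avoid.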

How Many Elements Does takeWhile Check?

    takeWhile isAlpha "hello clouds hello sky" == "hello"

takeWhile isAlpha selects the alphabetic characters from the front of the list. How many times does takeWhile call isAlpha?

How Many Elements Does takeWhile Check?

    Main> takeWhile (observe "isAlpha" isAlpha) "hello clouds hello sky"
    "hello"

    >>>>>>> Observations <<<<<<
    isAlpha
      { \ ' ' -> False
      , \ 'o' -> True
      , \ 'l' -> True
      , \ 'e' -> True
      , \ 'h' -> True
      }

takeWhile calls isAlpha six times: once for each letter of "hello" (the two calls on 'l' appear as a single entry in the listing) and once for the space. The last call tells us it's time to stop.
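The count of six can be double-checked with a hand-written variant of takeWhile that also returns the number of predicate calls. The function takeWhileCount is hypothetical and written only for this check; it is not part of any library.

```haskell
import Data.Char (isAlpha)

-- Like takeWhile, but also returns how many times the predicate was called.
takeWhileCount :: (a -> Bool) -> [a] -> ([a], Int)
takeWhileCount p = go
  where
    go [] = ([], 0)
    go (x : xs)
      | p x = let (ys, n) = go xs in (x : ys, n + 1) -- keep x, count the call
      | otherwise = ([], 1)                           -- failing call, then stop

main :: IO ()
main = print (takeWhileCount isAlpha "hello clouds hello sky")
```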

Observing Recursion

    fac 0 = 1
    fac n | n>0 = n * fac (n-1)

    Main> observe "fac" fac 6
    720

    >>>>>>> Observations <<<<<<
    fac
      { \ 6 -> 720
      }

We did not observe the recursive calls! We observe only this use of the function.

Observing Recursion

    fac = observe "fac" fac'

    fac' 0 = 1
    fac' n | n>0 = n * fac (n-1)

    Main> fac 6
    720

    >>>>>>> Observations <<<<<<
    fac
      { \ 6 -> 720
      , \ 5 -> 120
      , \ 4 -> 24
      , \ 3 -> 6
      , \ 2 -> 2
      , \ 1 -> 1
      , \ 0 -> 1
      }

Because fac' makes its recursive call through the observed fac, we observe all calls of the fac function.
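The same list of calls can be produced without the observation library by letting the function return its own call log. This is only a sketch to show what "all calls" means here; facLog is a made-up name, and the explicit error case for negative arguments is an addition that the slide's fac does not have.

```haskell
-- Factorial that also returns every (argument, result) pair it computed,
-- outermost call first.
facLog :: Integer -> (Integer, [(Integer, Integer)])
facLog 0 = (1, [(0, 1)])
facLog n
  | n > 0 =
      let (r, calls) = facLog (n - 1) -- the recursive call and its log
          result = n * r
       in (result, (n, result) : calls)
  | otherwise = error "facLog: negative argument"

main :: IO ()
main = print (facLog 6)
```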

Debugging median

    median xs = observe "isort xs" (isort xs) !! (length xs `div` 2)

    Main> median [4,2,3,5,1]
    2

    >>>>>>> Observations <<<<<<
    isort xs
      (1 : 4 : 2 : 3 : 5 : [])

Wrong answer: the median is 3. The observation shows a wrong (unsorted) result from isort.

Debugging isort

    isort :: Ord a => [a] -> [a]
    isort = foldr (observe "insert" insert) []

    Main> median [4,2,3,5,1]
    2

    >>>>>>> Observations <<<<<<
    insert
      { \ 1 [] -> 1 : []
      , \ 5 (1 : []) -> 1 : 5 : []
      , \ 3 (1 : 5 : []) -> 1 : 3 : 5 : []
      , \ 2 (1 : 3 : 5 : []) -> 1 : 2 : 3 : 5 : []
      , \ 4 (1 : 2 : 3 : 5 : []) -> 1 : 4 : 2 : 3 : 5 : []
      }

All well, except for the last case: inserting 4 produces an unsorted list.

Debugging insert
Observe the results from each case:

    insert x [] = [x]
    insert x (y:ys) | x<y  = observe "x<y" (x:y:ys)
                    | x>=y = observe "x>=y" (y:x:ys)

    Main> median [4,2,3,5,1]
    2

    >>>>>>> Observations <<<<<<
    x>=y
      (1 : 5 : [])
      (1 : 3 : 5 : [])
      (1 : 2 : 3 : 5 : [])
      (1 : 4 : 2 : 3 : 5 : [])

Only the second case was used!

The Bug!
I forgot the recursive call…

    insert x [] = [x]
    insert x (y:ys) | x<y  = x:y:ys
                    | x>=y = y:insert x ys

    Main> median [4,2,3,5,1]
    3

Bug fixed! The right answer.
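Putting the corrected insert back together with isort and median gives a small program that can be run on both examples from this lecture. The type signatures and the use of otherwise in place of x>=y are additions for the sketch; the logic is the one from the slide.

```haskell
median :: Ord a => [a] -> a
median xs = isort xs !! (length xs `div` 2)

-- Insertion sort: insert each element into an already sorted list.
isort :: Ord a => [a] -> [a]
isort = foldr insert []

insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x (y : ys)
  | x < y = x : y : ys
  | otherwise = y : insert x ys -- the recursive call that was missing

main :: IO ()
main = do
  print (isort [8, 4, 6, 10, 2, 7, 3, 5, 9, 1 :: Int])
  print (median [4, 2, 3, 5, 1 :: Int])
```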

Summary
➢ The observe function provides us with a wealth of information about how programs are evaluated, with only small changes to the programs themselves.
➢ That information can help us understand how programs work (foldr, foldl, takeWhile etc.).
➢ It can also help us see where bugs are.

Testing
Testing means trying out a program which is believed to work, in an effort to gain confidence that it is correct. Testing accounts for more than half the development effort on a large project (I've heard figures ranging from 50% to 80%). Fixing a bug in one place often causes a failure somewhere else, so the entire system must be retested after each change. At Ericsson, this can take three months!

"Hacking" vs Systematic Testing
"Hacking":
➢ Try some examples until the software seems to work.
Systematic testing:
➢ Record test cases, so that tests can be repeated after a modification (regression testing).
➢ Document what has been tested.
➢ Establish criteria for when a test is successful; this requires a specification.
➢ Automate testing as far as possible, so you can test extensively and often.

QuickCheck: A Tool for Testing Haskell Programs
Based on formulating properties, which
➢ can be tested repeatedly and automatically,
➢ document what has been tested,
➢ define what is a successful outcome,
➢ are a good starting point for proofs of correctness.
Properties are tested by selecting test cases at random!

Random Testing?
Is random testing sensible? Surely carefully chosen test cases are more effective?
"By taking 20% more points in a random test, any advantage a partition test might have had is wiped out." (D. Hamlet)
✔ QuickCheck can generate 100 random test cases in less time than it takes you to think of one!
✔ Random testing finds common (i.e. important!) errors effectively.

A Simple QuickCheck Property

    prop_Sort :: [Int] -> Bool
    prop_Sort xs = ordered (sort xs)

    Main> quickCheck prop_Sort
    OK, passed 100 tests.

The property checks that the result of sort is ordered. Random values for xs are generated. The tests were passed.

Some QuickCheck Details

    import QuickCheck

    prop_Sort :: [Int] -> Bool
    prop_Sort xs = ordered (sort xs)

    Main> quickCheck prop_Sort
    OK, passed 100 tests.

We must import the QuickCheck library. The type of a property must not be polymorphic. quickCheck is an (overloaded) higher order function! We give properties names beginning with "prop_" so we can easily find and test all the properties in a module.
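The slides use ordered without showing its definition. One plausible definition, given here as an assumption, compares each element with its successor. The sketch below needs only base: it checks prop_Sort on a few hand-picked lists, while random testing would instead pass prop_Sort to quickCheck (exported by the module Test.QuickCheck in current releases of the library, rather than the QuickCheck module named on the slide).

```haskell
import Data.List (sort)

-- Assumed definition: every element is <= the one that follows it.
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

-- The property from the slide; the type is fixed to [Int], not polymorphic.
prop_Sort :: [Int] -> Bool
prop_Sort xs = ordered (sort xs)

main :: IO ()
main = print (all prop_Sort [[], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]])
```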

A Property of insert

    prop_Insert :: Int -> [Int] -> Bool
    prop_Insert x xs = ordered (insert x xs)

    Main> quickCheck prop_Insert
    Falsifiable, after 4 tests:
    -2
    [5,-2,-5]

Whoops! This list isn't ordered!

A Corrected Property of insert

    prop_Insert :: Int -> [Int] -> Property
    prop_Insert x xs = ordered xs ==> ordered (insert x xs)

    Main> quickCheck prop_Insert
    OK, passed 100 tests.

The ==> operator discards test cases which are not ordered. Read it as "implies": if xs is ordered, then so is (insert x xs). The result is no longer a simple Bool.
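A complete module for this property might look as follows. It is a sketch under a few assumptions: it imports Test.QuickCheck, the module name used by current releases of the library; ordered is defined as a pairwise comparison because the slides do not show it; and insert is the corrected version from the debugging part of the lecture.

```haskell
import Test.QuickCheck

-- Assumed definition of ordered (not shown in the slides).
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

-- The corrected insert, with the recursive call in place.
insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x (y : ys)
  | x < y = x : y : ys
  | otherwise = y : insert x ys

-- Only ordered lists count as test cases; the others are discarded.
prop_Insert :: Int -> [Int] -> Property
prop_Insert x xs = ordered xs ==> ordered (insert x xs)

main :: IO ()
main = quickCheck prop_Insert
```

The exact wording of the success message depends on the QuickCheck version, so it may differ from the output shown on the slide.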

Using QuickCheck to Develop Fast Queue Operations
What we're going to do:
➢ Explain what a queue is, and give slow implementations of the queue operations, to act as a specification.
➢ Explain the idea behind the fast implementation.
➢ Formulate properties that say the fast implementation is "correct".
➢ Test them with QuickCheck.

What is a Queue?
Leave from the front. Join at the back.
Examples: files to print, processes to run, tasks to perform.

What is a Queue?
A queue contains a sequence of values. We can add elements at the back, and remove elements from the front. We'll implement the following operations:

    empty   :: Queue a                 -- an empty queue
    isEmpty :: Queue a -> Bool         -- tests if a queue is empty
    add     :: a -> Queue a -> Queue a -- adds an element at the back
    front   :: Queue a -> a            -- the element at the front
    remove  :: Queue a -> Queue a      -- removes an element from the front

The Specification: Slow but Simple

    type Queue a = [a]

    empty        = []
    isEmpty q    = q==empty
    add x q      = q++[x]
    front (x:q)  = x
    remove (x:q) = q

Addition takes time depending on the number of items in the queue!

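The Idea: Store the Front and Back Separately
Old: one list, b c d e f g h i, with a leaving at the front and j joining at the back. Fast to remove, slow to add.
New: two lists, the front b c d e and the back stored in reverse, i h g f, with a leaving and j joining. Fast to add, fast to remove.
Periodically move the back to the front.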

The Fast Implementation

    type Queue a = ([a],[a])

    flipQ ([],b)  = (reverse b,[])
    flipQ (x:f,b) = (x:f,b)

    emptyQ          = ([],[])
    isEmptyQ q      = q==emptyQ
    addQ x (f,b)    = (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)
    frontQ (x:f,b)  = x

flipQ makes sure the front is never empty when the back is not.

Relating the Two Implementations
What list does a "double-ended" queue represent?

    retrieve :: Queue a -> [a]
    retrieve (f, b) = f ++ reverse b

What does it mean to be correct?
✔ retrieve emptyQ == empty
✔ isEmptyQ q == isEmpty (retrieve q)
✔ retrieve (addQ x q) == add x (retrieve q)
✔ retrieve (removeQ q) == remove (retrieve q)
and so on.

Using Retrieve Guarantees Consistent Results
Example:

    frontQ (removeQ (addQ 1 (addQ 2 emptyQ)))
      == front (retrieve (removeQ (addQ 1 (addQ 2 emptyQ))))
      == front (remove (retrieve (addQ 1 (addQ 2 emptyQ))))
      == front (remove (add 1 (retrieve (addQ 2 emptyQ))))
      == front (remove (add 1 (add 2 (retrieve emptyQ))))
      == front (remove (add 1 (add 2 empty)))

QuickChecking Properties

    prop_Remove :: Queue Int -> Bool
    prop_Remove q = retrieve (removeQ q) == remove (retrieve q)

    Main> quickCheck prop_Remove
    4 Program error: {removeQ ([],sized_v1740 (instArbitrary_v1…

Removing from an empty queue!

Correcting the Property

    prop_Remove :: Queue Int -> Property
    prop_Remove q = not (isEmptyQ q) ==>
                      retrieve (removeQ q) == remove (retrieve q)

    Main> quickCheck prop_Remove
    0 Program error: {removeQ ([],[Arbitrary_arbitrary instArbitrary…

How can this be?

Making Assumptions Explicit
We assumed that the front of a queue will never be empty if the back contains elements! Let's make that explicit:

    goodQ :: Queue a -> Bool
    goodQ ([],[])  = True
    goodQ (x:f,b)  = True
    goodQ ([],x:b) = False

    prop_Remove q = not (isEmptyQ q) && goodQ q ==>
                      retrieve (removeQ q) == remove (retrieve q)

Now it works!

How Do We Know Only Good Queues Arise?
Queues are built by add and remove:

    addQ x (f,b)    = (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)

New properties:

    prop_AddGood x q  = goodQ q ==> goodQ (addQ x q)
    prop_RemoveGood q = not (isEmptyQ q) && goodQ q ==> goodQ (removeQ q)

Whoops!

    Main> quickCheck prop_AddGood
    Falsifiable, after 0 tests:
    2
    ([],[])

    addQ x (f,b)    = (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)

See the bug?

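Whoops!

    Main> quickCheck prop_AddGood
    Falsifiable, after 0 tests:
    2
    ([],[])

    addQ x (f,b)    = flipQ (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)

The fix: addQ must call flipQ too. Adding 2 to the empty queue ([],[]) used to give ([],[2]), which has an empty front and a non-empty back.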
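Collecting the corrected operations into one module gives the sketch below. It differs from the slides in a few labelled ways: type signatures are added, isEmptyQ uses null instead of comparing with emptyQ (so that no Eq instance is needed), and removeQ and frontQ report an explicit error when the front is empty.

```haskell
type Queue a = ([a], [a]) -- (front, back stored in reverse)

-- Restore the invariant: the front is empty only if the back is too.
flipQ :: Queue a -> Queue a
flipQ ([], b) = (reverse b, [])
flipQ q = q

emptyQ :: Queue a
emptyQ = ([], [])

isEmptyQ :: Queue a -> Bool
isEmptyQ (f, b) = null f && null b

addQ :: a -> Queue a -> Queue a
addQ x (f, b) = flipQ (f, x : b) -- corrected: flipQ after adding

removeQ :: Queue a -> Queue a
removeQ (_ : f, b) = flipQ (f, b)
removeQ ([], _) = error "removeQ: empty front"

frontQ :: Queue a -> a
frontQ (x : _, _) = x
frontQ ([], _) = error "frontQ: empty front"

-- The list that a queue represents.
retrieve :: Queue a -> [a]
retrieve (f, b) = f ++ reverse b

-- The invariant made explicit.
goodQ :: Queue a -> Bool
goodQ ([], _ : _) = False
goodQ _ = True

main :: IO ()
main = do
  let q = addQ 3 (addQ 2 (addQ 1 emptyQ)) :: Queue Int
  print (retrieve q)
  print (frontQ (removeQ q))
  print (goodQ q)
```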

Looking Back Formulating properties let us define precisely how the fast queue operations should behave. Using QuickCheck found a bug, and revealed hidden assumptions which are now explicitly stated. The property definitions remain in the program, documenting exactly what testing found to hold, and providing a ready made test-bed for any future versions of the Queue library. We were forced to reason much more carefully about the program’s correctness, and can have much greater confidence that it really works.

Summary Testing is a major part of any serious software development. Testing should be systematic, documented, and repeatable. Automated tools can help a lot. QuickCheck is a state-of-the-art testing tool for Haskell.

The remaining slides discuss an important subtlety when using QuickCheck

Testing the Buggy insert

    prop_Insert :: Int -> [Int] -> Property
    prop_Insert x xs = ordered xs ==> ordered (insert x xs)

    Main> quickCheck prop_Insert
    Falsifiable, after 51 tests:
    5
    [-3,4]

The buggy insert yields [-3,5,4]. Why so many tests?

Observing Test Data

    prop_Insert :: Int -> [Int] -> Property
    prop_Insert x xs = ordered xs ==>
                         collect (length xs) (ordered (insert x xs))

    Main> quickCheck prop_Insert
    OK, passed 100 tests.
    43% 0.
    37% 1.
    11% 2.
    8% 3.
    1% 4.

collect gathers values during testing; here it reports the distribution of length xs. Random lists which happen to be ordered are likely to be short!

A Better Property

    prop_Insert2 :: Int -> Property
    prop_Insert2 x = forAll orderedList
                       (\xs -> collect (length xs) (ordered (insert x xs)))

    Main> quickCheck prop_Insert2
    OK, passed 100 tests.
    22% 2.
    15% 0.
    14% 1.
    8% 6.
    8% 5.
    8% 4.
    8% 3.
    4% 8.
    3% 9.
    3% 11.
    2% 12.
    2% 10.
    1% 7.
    1% 30.
    1% 13.

Read this as: ∀ xs ∈ orderedList. …

What is forAll?
A higher order function!

    forAll :: (Show a, Testable b) => Gen a -> (a -> b) -> Property

The first argument, of type Gen a, is a generator for test data of type a. The second argument is a function which, given a generated a, produces a testable result.

    forAll orderedList (\xs -> collect (length xs) (ordered (insert x xs)))

What is orderedList?
A test data generator:

    orderedList :: Gen [Int]

Gen a, a "generator for a", behaves like IO a, a "command producing a".
Some primitive generators:

    arbitrary :: Arbitrary a => Gen a
    oneof     :: [Gen a] -> Gen a
    frequency :: [(Int,Gen a)] -> Gen a

Defining orderedList

    orderedList :: Gen [Int]
    orderedList = do n <- arbitrary
                     listFrom n
      where listFrom n = frequency
              [(1,return []),
               (4,do m <- arbitrary
                     ns <- listFrom (n+abs m)
                     return (n:ns))]

We can use the do syntax to write generators, like IO, but we cannot mix Gen and IO! orderedList chooses an n, and makes a list of elements >= n. In listFrom, 20% of the time we just stop. 80% of the time we choose a number >= n, generate another list ns of elements >= n, and return n:ns.
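As a sanity check, the generator itself can be tested: every list it produces should be ordered. The sketch below assumes a current QuickCheck release, where the module is Test.QuickCheck and already exports a generator called orderedList, so the library's version is hidden to keep the slide's name. The property prop_Generated and the definition of ordered are additions for the illustration.

```haskell
import Test.QuickCheck hiding (orderedList)

-- Assumed definition of ordered (not shown in the slides).
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

-- The generator from the slide: start at a random n, then keep
-- extending with non-decreasing elements, stopping 1 time in 5.
orderedList :: Gen [Int]
orderedList = do
  n <- arbitrary
  listFrom n
  where
    listFrom n =
      frequency
        [ (1, return [])
        , (4, do
              m <- arbitrary
              ns <- listFrom (n + abs m)
              return (n : ns))
        ]

-- Hypothetical property: generated lists are always ordered.
prop_Generated :: Property
prop_Generated = forAll orderedList ordered

main :: IO ()
main = quickCheck prop_Generated
```

Testing the generator before relying on it guards against a property that passes only because its test data is not what we think it is.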
