Semi-Explicit Parallel Programming in Haskell
Satnam Singh, Microsoft Research Cambridge
Leeds 2009


public class ArraySummer
{
    private double[] a;   // Encapsulated array
    private double sum;   // Variable used to compute sum

    // Constructor requiring an initial value for array
    public ArraySummer(double[] values)
    {
        a = values;
    }

    // Method to compute the sum of a segment of the array
    // (fromIndex is inclusive, toIndex is exclusive)
    public void SumArray(int fromIndex, int toIndex, out double arraySum)
    {
        sum = 0;
        for (int i = fromIndex; i < toIndex; i++)
            sum = sum + a[i];
        arraySum = sum;
    }
}

[Diagram: timeline of thread 1 and thread 2 with the events ThreadCreate, thread.Start and thread.Join]

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        const int testSize = 100000000;
        double[] testValues = new double[testSize];
        // Cast to double: i / testSize on two ints is integer division, which is always 0 here
        for (int i = 0; i < testSize; i++)
            testValues[i] = (double) i / testSize;
        ArraySummer summer = new ArraySummer(testValues);
        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();
        double testSum;
        summer.SumArray(0, testSize, out testSum);
        TimeSpan ts = stopWatch.Elapsed;
        Console.WriteLine("Sum duration (milliseconds) = " + stopWatch.ElapsedMilliseconds);
        Console.WriteLine("Sum value = " + testSum);
        Console.ReadKey();
    }
}

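using System;
using System.Diagnostics;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        const int testSize = 100000000;
        double[] testValues = new double[testSize];
        // Cast to double: i / testSize on two ints is integer division, which is always 0 here
        for (int i = 0; i < testSize; i++)
            testValues[i] = (double) i / testSize;
        ArraySummer summer = new ArraySummer(testValues);
        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();
        double testSumA = 0;
        double testSumB;
        // First half of the array on a second thread
        Thread sumThread = new Thread(delegate() { summer.SumArray(0, testSize / 2, out testSumA); });
        sumThread.Start();
        // Second half on the main thread; toIndex is exclusive, so start at testSize / 2
        // Caution: both calls update the same field `sum` inside ArraySummer
        summer.SumArray(testSize / 2, testSize, out testSumB);
        sumThread.Join();
        TimeSpan ts = stopWatch.Elapsed;
        Console.WriteLine("Sum duration (milliseconds) = " + stopWatch.ElapsedMilliseconds);
        Console.WriteLine("Sum value = " + (testSumA + testSumB));
        Console.ReadKey();
    }
}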

The Accidental Semi-colon ;

A ; B
createThread (A) ; B
[Diagram: A followed by B in sequence, compared with A and B running side by side]

Execution Model

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

[Diagram: graph of thunks for the calls to fib, starting at 10]
“Thunk” for “fib 10”:
– Pointer to the implementation
– Storage slot for the result
– Values for free variables
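The thunk model can be observed directly. The following is a minimal sketch, not taken from the talk: it assumes `trace` from `Debug.Trace` as a way to print a message at the moment a thunk is forced, and the name `x` is purely illustrative. The comments describe the behaviour of an unoptimised build.

```haskell
import Debug.Trace (trace)

fib :: Int -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

main :: IO ()
main = do
  -- Binding x only allocates a thunk; fib 20 has not been computed yet.
  let x = trace "evaluating fib 20" (fib 20)
  putStrLn "thunk created"
  -- The first use forces the thunk and stores the result in it.
  print x
  -- The second use reads the stored result, so the trace message is not repeated.
  print (x + 1)
```

The trace message should appear once, after "thunk created", which matches the "storage slot for the result" part of the picture.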

wombat and numbat

import Data.Char (ord)

-- Pure function
wombat :: Int -> Int
wombat n = 42*n

-- Side-effecting function: a computation inside a ‘monad’
numbat :: Int -> IO Int
numbat n = do c <- getChar
              return (n + ord c)

IO (), pronounced “IO unit”

import Data.Char (chr, ord)

numbat :: IO ()
numbat = do c <- getChar
            putChar (chr (1 + ord c))

Expressions: f (g + h), z!!2, mapM f [a, b, ..., g]
Infer type:
– [Int] -> Bool: pure function, deterministic
– IO String: stateful operation, may be non-deterministic

Functional Programming to the Rescue?
Why not evaluate every sub-expression of our pure functional programs in parallel?
– execute each sub-expression in its own thread?
The 80s dream does not work:
– granularity
– data-dependency

Infix Operators
mod a b
mod 7 3 = 1
Infix with backquotes:
a `mod` b
7 `mod` 3 = 1

x `par` y
– x is sparked for speculative evaluation
– a spark can potentially be instantiated on a thread running in parallel with the parent thread
– x `par` y = y
– typically x is used inside y
blurRows `par` (mix blurCols blurRows)

x `par` (y + x)
– x is sparked
– y is evaluated first
– x is evaluated second
– x fizzles

x `par` (y + x)
– x is sparked on P1
– y is evaluated on P1
– x is taken up for evaluation on P2

par is Not Enough
pseq :: a -> b -> b
pseq is strict in its first argument but not in its second argument
Related function:
– seq :: a -> b -> b
– Strict in both arguments
– Compiler may transform seq x y to seq y x
– No good for controlling the order of evaluation in parallel programs
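The usual way to combine the two operators is to spark one operand with `par` and use `pseq` to make the current thread evaluate the other operand before the result is combined. The sketch below is illustrative only: `sumTo` and `parAdd` are hypothetical names, and `par` and `pseq` are imported from `GHC.Conc` in base so that the example compiles without the parallel package.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical stand-in for an expensive pure computation.
sumTo :: Int -> Int
sumTo n = sum [1 .. n]

-- Spark a, evaluate b on the current thread, then combine the two.
parAdd :: Int -> Int -> Int
parAdd m n = a `par` (b `pseq` (a + b))
  where
    a = sumTo m
    b = sumTo n

main :: IO ()
main = print (parAdd 1000000 2000000)
```

To give the spark a chance to run on another core, the program would need to be built with `-threaded -rtsopts` and run with `+RTS -N2`.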

Don Stewart: parallel fib with threshold

import Control.Monad (forM_)
import Control.Parallel (par, pseq)
import Text.Printf (printf)

cutoff :: Int
cutoff = 35 -- Threshold for parallel evaluation

-- Sequential fib
fib' :: Int -> Integer
fib' 0 = 0
fib' 1 = 1
fib' n = fib' (n-1) + fib' (n-2)

-- Parallel fib with thresholding
fib :: Int -> Integer
fib n | n < cutoff = fib' n
      | otherwise  = r `par` (l `pseq` l + r)
  where l = fib (n-1)
        r = fib (n-2)

-- Main program
main :: IO ()
main = forM_ [0..45] $ \i -> printf "n=%d => %d\n" i (fib i)

Parallel fib performance

Parallel quicksort (wrong)

quicksortN :: (Ord a) => [a] -> [a]
quicksortN [] = []
quicksortN [x] = [x]
quicksortN (x:xs) = losort `par` hisort `par` losort ++ (x:hisort)
  where losort = quicksortN [y | y <- xs, y < x]
        hisort = quicksortN [y | y <- xs, y >= x]

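What went wrong?
`par` only evaluates its first argument to weak head normal form, so the spark for losort stops at the first cons cell and leaves the rest of the list unevaluated.
[Diagram: losort pointing to a cons cell and an unevaluated thunk]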

forceList

forceList :: [a] -> ()
forceList [] = ()
forceList (x:xs) = x `seq` forceList xs

Parallel quicksort (right)

quicksortF [] = []
quicksortF [x] = [x]
quicksortF (x:xs) = (forceList losort) `par` (forceList hisort) `par` losort ++ (x:hisort)
  where losort = quicksortF [y | y <- xs, y < x]
        hisort = quicksortF [y | y <- xs, y >= x]
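For reference, the pieces of the corrected quicksort can be assembled into one program. This is a sketch under assumptions: the test input and the comparison against `Data.List.sort` are illustrative additions, and `par` is imported from `GHC.Conc` in base so the program compiles without the parallel package.

```haskell
import Data.List (sort)
import GHC.Conc (par)

-- Walk the whole list, forcing each element to weak head normal form.
forceList :: [a] -> ()
forceList []     = ()
forceList (x:xs) = x `seq` forceList xs

quicksortF :: Ord a => [a] -> [a]
quicksortF []     = []
quicksortF [x]    = [x]
quicksortF (x:xs) =
  forceList losort `par` (forceList hisort `par` (losort ++ (x:hisort)))
  where
    losort = quicksortF [y | y <- xs, y <  x]
    hisort = quicksortF [y | y <- xs, y >= x]

main :: IO ()
main = do
  -- Hypothetical test input: a fixed, scrambled sequence of Ints.
  let input = [ (i * 7919) `mod` 10007 | i <- [1 .. 20000 :: Int] ]
  -- Check the result against the library sort.
  print (quicksortF input == sort input)
```

Sparking `forceList losort` rather than `losort` is what makes the spark traverse the entire sublist instead of stopping at its first cons cell.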

parSumArray :: Array Int Double -> Double
parSumArray matrix = lhs `par` (rhs `pseq` lhs + rhs)
  where lhs = seqSum 0 (nrValues `div` 2) matrix
        rhs = seqSum (nrValues `div` 2 + 1) (nrValues-1) matrix
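The slide leaves `seqSum` and `nrValues` undefined. One way to fill them in is sketched below, assuming that `seqSum lo hi` sums the elements at indices `lo` to `hi` inclusive (which is what the index arithmetic on the slide suggests) and that the array is indexed from 0. Both definitions are guesses made for illustration, and `par` and `pseq` come from `GHC.Conc` in base.

```haskell
import Data.Array (Array, bounds, listArray, (!))
import GHC.Conc (par, pseq)

-- Assumed helper: sum of the elements at indices lo..hi, both inclusive.
seqSum :: Int -> Int -> Array Int Double -> Double
seqSum lo hi arr = go lo 0
  where
    go i acc
      | i > hi    = acc
      | otherwise = let acc' = acc + arr ! i
                    in acc' `seq` go (i + 1) acc'

parSumArray :: Array Int Double -> Double
parSumArray matrix = lhs `par` (rhs `pseq` (lhs + rhs))
  where
    (_, top) = bounds matrix          -- assumes the lower bound is 0
    nrValues = top + 1
    lhs = seqSum 0 (nrValues `div` 2) matrix
    rhs = seqSum (nrValues `div` 2 + 1) (nrValues - 1) matrix

main :: IO ()
main = do
  let n      = 1000000
      values = listArray (0, n - 1) (map fromIntegral [1 .. n]) :: Array Int Double
  print (parSumArray values)
```

Because each half is a strict left-to-right loop returning a `Double`, evaluating `lhs` to weak head normal form already performs the whole partial sum, so no equivalent of `forceList` is needed here.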

Strategies
Haskell provides a collection of evaluation strategies for controlling the evaluation order of various data-types.
Users have to indicate how their own types are evaluated to a normal form.
Algorithm + Strategy = Parallelism, P. W. Trinder, K. Hammond, H.-W. Loidl and S. L. Peyton Jones.
http://www.macs.hw.ac.uk/~dsg/gph/papers/html/Strategies/strategies.html
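The core idea of the strategies paper fits in a few lines. The sketch below follows the paper's original formulation, in which a strategy is a function returning `()`; it is not the API of the current parallel library, which uses a different Strategy type. `parSquares` is a hypothetical example, and `par` is imported from `GHC.Conc` in base.

```haskell
import GHC.Conc (par)

-- Assumption: the strategy type of the original paper.
type Strategy a = a -> ()

-- Evaluate a value to weak head normal form.
rwhnf :: Strategy a
rwhnf x = x `seq` ()

-- Apply a strategy to every element of a list, sparking each application.
parList :: Strategy a -> Strategy [a]
parList _     []     = ()
parList strat (x:xs) = strat x `par` parList strat xs

-- Attach a strategy to a value: run the strategy, then return the value.
using :: a -> Strategy a -> a
using x strat = strat x `seq` x

-- The algorithm (map) is written separately from the strategy (parList rwhnf).
parSquares :: [Integer] -> [Integer]
parSquares xs = map (^ (2 :: Int)) xs `using` parList rwhnf

main :: IO ()
main = print (sum (parSquares [1 .. 1000]))
```

The point of the design is that changing how the list is evaluated means changing only the expression after `using`, not the `map` itself.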

Explicitly Creating Threads
forkIO :: IO () -> IO ThreadId
Creates a lightweight Haskell thread, not an operating system thread.

Inter-thread Communication
putMVar :: MVar a -> a -> IO ()
takeMVar :: MVar a -> IO a

MVars
... putMVar mv 52 ...
... v <- takeMVar mv ...
[Diagram: mv holds 52 after the putMVar and is empty again after the takeMVar]

Rendezvous

import Control.Concurrent

threadA :: MVar Int -> MVar Float -> IO ()
threadA valueToSendMVar valueReceivedMVar
  = do -- some work
       -- now perform rendezvous by sending 72
       putMVar valueToSendMVar 72 -- send value
       v <- takeMVar valueReceivedMVar
       putStrLn (show v)

Rendezvous

threadB :: MVar Int -> MVar Float -> IO ()
threadB valueToReceiveMVar valueToSendMVar
  = do -- some work
       -- now perform rendezvous by waiting on value
       z <- takeMVar valueToReceiveMVar
       -- z is an Int, so convert it before multiplying by a Float
       putMVar valueToSendMVar (1.2 * fromIntegral z)
       -- continue with other work

Rendezvous

main :: IO ()
main = do aMVar <- newEmptyMVar
          bMVar <- newEmptyMVar
          forkIO (threadA aMVar bMVar)
          forkIO (threadB aMVar bMVar)
          threadDelay 1000 -- BAD!
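The `threadDelay` is marked BAD because sleeping gives no guarantee that the forked threads have finished before `main` returns. One common alternative is to have a worker signal completion through another MVar. The sketch below is an assumption about how that might look: the `done` MVar and the shortened argument names are not from the talk.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

threadA :: MVar Int -> MVar Float -> MVar () -> IO ()
threadA toB fromB done = do
  putMVar toB 72                 -- send a value to thread B
  v <- takeMVar fromB            -- block until B replies
  putStrLn ("A received " ++ show v)
  putMVar done ()                -- tell main that A has finished

threadB :: MVar Int -> MVar Float -> IO ()
threadB fromA toA = do
  z <- takeMVar fromA            -- block until A sends a value
  putMVar toA (1.2 * fromIntegral z)

main :: IO ()
main = do
  aMVar <- newEmptyMVar
  bMVar <- newEmptyMVar
  done  <- newEmptyMVar          -- hypothetical completion signal
  _ <- forkIO (threadA aMVar bMVar done)
  _ <- forkIO (threadB aMVar bMVar)
  takeMVar done                  -- wait for A instead of sleeping
```

Since A only signals after it has received B's reply, waiting on `done` covers both halves of the rendezvous.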

fib again

fib :: Int -> Int
-- As before

fibThread :: Int -> MVar Int -> IO ()
fibThread n resultMVar = putMVar resultMVar (fib n)

sumEuler :: Int -> Int
-- As before

fib fixed

fibThread :: Int -> MVar Int -> IO ()
fibThread n resultMVar
  = do pseq f (return ())
       putMVar resultMVar f
  where f = fib n
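Without the `pseq`, the forked thread puts an unevaluated thunk into the MVar and the thread that takes it ends up doing the work. A complete program around the fixed `fibThread` might look like the sketch below. The `main` function and the choice of inputs are illustrative assumptions, and `pseq` is imported from `GHC.Conc` in base.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import GHC.Conc (pseq)

fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

fibThread :: Int -> MVar Int -> IO ()
fibThread n resultMVar = do
    f `pseq` return ()           -- force fib n here, in the forked thread
    putMVar resultMVar f
  where
    f = fib n

main :: IO ()
main = do
  r1 <- newEmptyMVar
  r2 <- newEmptyMVar
  _ <- forkIO (fibThread 25 r1)
  _ <- forkIO (fibThread 26 r2)
  a <- takeMVar r1               -- blocks until the first worker has finished
  b <- takeMVar r2
  print (a + b)
```

Taking from both MVars also replaces any need for a delay in `main`, since each `takeMVar` blocks until its worker has produced a value.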

$ time fibForkIO +RTS -N1
real 0m40.473s
user 0m0.000s
sys 0m0.031s

$ time fibForkIO +RTS -N2
real 0m38.580s
user 0m0.000s
sys 0m0.015s

STM in Haskell

data STM a
instance Monad STM
-- Monads support "do" notation and sequencing

-- Exceptions
throw :: Exception -> STM a
catch :: STM a -> (Exception -> STM a) -> STM a

-- Running STM computations
atomically :: STM a -> IO a
retry :: STM a
orElse :: STM a -> STM a -> STM a

-- Transactional variables
data TVar a
newTVar :: a -> STM (TVar a)
readTVar :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()
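A small example shows how these operations fit together. The sketch below is not from the talk: the account-transfer scenario and the name `transfer` are hypothetical, and the imports assume the `Control.Concurrent.STM` module from the stm package that ships with GHC.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, retry, writeTVar)

-- Hypothetical example: move an amount between two balances held in TVars.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  if balance < amount
    then retry                   -- block until 'from' is changed by another transaction
    else do
      writeTVar from (balance - amount)
      toBalance <- readTVar to
      writeTVar to (toBalance + amount)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)   -- both writes become visible together
  finalA <- readTVarIO a
  finalB <- readTVarIO b
  print (finalA, finalB)
```

No other thread can observe the state in which the amount has left one account but not yet arrived in the other, because the whole `transfer` runs as one transaction.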

Transactional Memory

void GetEither() {
  atomic {
    do { i = Q1.Get(); }
    orelse { i = Q2.Get(); }
    R.Put( i );
  }
}

do {...this...} orelse {...that...} tries to run “this”.
If “this” retries, it runs “that” instead.
If both retry, the do-block retries. GetEither() will thereby wait for there to be an item in either queue.
[Diagram: queues Q1 and Q2 feeding into R]
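The same pattern can be written with Haskell's `orElse`. The sketch below makes several assumptions: each queue is modelled as a TVar holding a list, the names `Queue`, `getQ`, `putQ` and `getEither` are illustrative, and the imports come from the stm package that ships with GHC.

```haskell
import Control.Concurrent.STM

-- Assumption: a queue is a TVar holding a list; the talk does not specify
-- how Q1, Q2 and R are implemented.
type Queue a = TVar [a]

-- Take the first item, or retry if the queue is empty.
getQ :: Queue a -> STM a
getQ q = do
  items <- readTVar q
  case items of
    []     -> retry
    (x:xs) -> writeTVar q xs >> return x

-- Append an item at the end of the queue.
putQ :: Queue a -> a -> STM ()
putQ q x = modifyTVar' q (++ [x])

-- Try q1 first; if it retries, try q2; if both retry, the transaction blocks.
getEither :: Queue a -> Queue a -> Queue a -> STM ()
getEither q1 q2 r = do
  i <- getQ q1 `orElse` getQ q2
  putQ r i

main :: IO ()
main = do
  q1 <- newTVarIO []
  q2 <- newTVarIO [42 :: Int]
  r  <- newTVarIO []
  atomically (getEither q1 q2 r)
  readTVarIO r >>= print
```

Here `q1` is empty, so the first alternative retries and the item is taken from `q2` and placed in `r`.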

ThreadScope
GHC run-time can generate eventlogs.
Instrument:
– thread creation, start/stop, migration
– GCs
ThreadScope graphical viewer
Q: how to mine / understand the information?

Lots Unsaid
– xperf / VTune correlation
– Verification
– Debugging
– Parallel garbage collection

Summary
Three ways of writing parallel and concurrent programs in Haskell:
– `par` and `pseq` (semi-explicit parallelism)
– MVars (explicit concurrency)
– STM (explicit concurrency with transactions)
Implicit concurrency
Pure functional programming has pros and cons for parallel programming.
Can mainstream languages take advantage of the same techniques?
How can visualization help with performance tuning?
