Christopher Brown and Kevin Hammond School of Computer Science, University of St. Andrews July 2010 Ever Decreasing Circles: Implementing a skeleton for.


1 Christopher Brown and Kevin Hammond School of Computer Science, University of St. Andrews July 2010 Ever Decreasing Circles: Implementing a skeleton for general Parallel Orbit calculations for use in SymGrid-Par.

2 A General Orbit Calculation Explore a solution space given an initial set of values and a set of generators. Used in computational algebra: symmetry of solutions (chemistry, quantum physics, etc.); Rubik’s Cube (permutations). Sequential implementations already exist, but there are concerns about their performance.

3 The Orbit Calculation
  f :: Int -> Int
  f x = (x+1) `mod` 4
  Starting set: 1

4 The Orbit Calculation
  f 1 = 2
  Accumulating set: 1, 2

5 The Orbit Calculation
  f 2 = 3
  Accumulating set: 1, 2, 3

6 The Orbit Calculation
  f 3 = 0
  Accumulating set: 1, 2, 3, 0

7 The Orbit Calculation
  f 0 = 1, which is already in the set, so nothing new is added.
  Accumulating set: 1, 2, 3, 0
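The walk on slides 3 to 7 can be sketched in a few lines of Haskell. This is only an illustration for a single generator; the name `orbit1` is hypothetical and does not come from the slides.

```haskell
-- Generator from the slides.
f :: Int -> Int
f x = (x + 1) `mod` 4

-- Follow a single generator from a seed, accumulating each new image
-- until an image is already in the accumulated list.
-- (orbit1 is a hypothetical helper name, not from the slides.)
orbit1 :: Eq a => (a -> a) -> a -> [a]
orbit1 g seed = go [seed] seed
  where
    go acc x
      | img `elem` acc = reverse acc          -- nothing new: stop
      | otherwise      = go (img : acc) img   -- record and continue
      where img = g x
```

In GHCi, `orbit1 f 1` evaluates to `[1,2,3,0]`, matching the accumulating set on slide 7.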

8 State-of-the-art A sequential version already exists in GAP, but we need to be able to compute orbits that take millions of iterations. A parallel version exists in C using hash tables, but it is fine-tuned to a very specific problem (direct condensation) and may not be scalable. There is also a new parallel implementation in GAP (Shpectorov): a tuple-based implementation that uses SCSCP and dedicated hash-table servers. We need a general skeleton that can be used for arbitrary orbits.

9 SymGrid-Par

10

11 The Orbit - Sequential Version
To our knowledge, this is the first version ever implemented in Haskell.
  orbitMul :: (Ord a, Eq a) => [ a -> a ] -> [a] -> [a] -> [a]
  orbitMul gens [] set = set
  orbitMul gens (t:ts) set = orbitMul gens (ts++new) set'
    where (new, set') = applyGens gens [t] set []
  applyGens =...
  img =...
Arguments: gens is the list of generators, (t:ts) is the queue of tasks, and set is the accumulating set of results.

12 The Orbit - Sequential
  genimg :: Eq a => (a->a) -> [a] -> [a] -> ([a], [a])
  genimg g queue@(t:ts) set =
    if img `elem` set
      then ([], set)
      else (img : queue, img : set)
    where img = g t
img represents the generator applied to the task (the image of the generator application). We need to check for membership in the result set. If img is not yet a member, add it to the new task queue and to the set of results.

13 The Orbit - Sequential
  applyGens :: Eq a => [ a -> a ] -> [a] -> [a] -> [a] -> ([a],[a])
  applyGens [] q s q' = (q', s)
  applyGens (g:gs) queue set q' = applyGens gs queue set' (q'++queue')
    where (queue', set') = genimg g queue set
Recurse over the list of generators. Pass the result of an img into the next generator application.
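For readers who want something to load into GHCi, here is a minimal self-contained sketch of the same worklist idea. It is not the authors' code: it swaps the list-based `elem` test for `Data.Set`, folds over the generators instead of using `applyGens` and `genimg`, and the name `orbitSeq` is hypothetical.

```haskell
import qualified Data.Set as Set

-- Worklist orbit: the queue holds elements whose images are still to be
-- computed; 'seen' accumulates every element found so far.
-- (orbitSeq is a hypothetical name; Data.Set replaces the slides' lists.)
orbitSeq :: Ord a => [a -> a] -> [a] -> Set.Set a
orbitSeq gens seeds = go seeds (Set.fromList seeds)
  where
    go []       seen = seen
    go (t : ts) seen = go (ts ++ new) seen'
      where
        (new, seen') = foldl step ([], seen) gens
        -- Apply one generator to t; keep the image only if it is new.
        step (q, s) g
          | img `Set.member` s = (q, s)
          | otherwise          = (q ++ [img], Set.insert img s)
          where img = g t
```

For example, `Set.toList (orbitSeq [\x -> (x + 1) `mod` 4] [1])` gives `[0,1,2,3]`. The set makes each membership test logarithmic rather than linear, which matters once the orbit grows large.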

14 Parallel Orbit Need a queue to express tasks waiting to be processed. We need to distribute the queue over available PEs. We use a Task Farm (master/worker) approach

15 Task Farm (Master/Worker)

16 Extending to a Parallel Orbit The orbit is not quite a true farm, however. Results from workers must be accumulated and checked for duplicates… a set? Non-duplicates are released as new tasks. Moreover, we must be sure that the orbit will terminate!

17 Parallel Orbit

18 Parallel Orbit Calculation
  orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
  orbitPar orbitfun gens init = …
    workerProcs = [ process (concat . (Data.List.map (orbitfun gens))) | n <- [1..noPe] ]
    toWorker tasks = unshuffle noPe tasks
Here process is the process abstraction.

19 Parallel Orbit Calculation
  orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
  orbitPar orbitfun gens init = …
    addNewTask set (t:ts) c
      | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts ((c-1)+nGens)
      | c <= 1 = []
      | otherwise = addNewTask set ts (c - 1)
    workerProcs = …
    toWorker tasks = …
The argument c is the count of potential tasks.
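The termination counter can be tried out without Eden. The sketch below is a sequential stand-in, not the parallel skeleton: the worker processes are replaced by a lazy `concatMap`, `orbitLoop` is a hypothetical name, and it assumes at least one seed and at least one generator.

```haskell
import qualified Data.Set as Set

-- Sequential stand-in for the feedback loop: released tasks are fed back
-- lazily as candidate results. Assumes non-empty gens and seeds.
-- (orbitLoop is a hypothetical name; no Eden processes are involved.)
orbitLoop :: Ord a => [a -> a] -> [a] -> [a]
orbitLoop gens seeds = dat
  where
    nGens    = length gens
    -- Every released task comes back as nGens candidate results.
    newTasks = concatMap (\t -> map ($ t) gens) dat
    dat      = addNewTask Set.empty (seeds ++ newTasks) (length seeds)

    -- c counts the candidates still outstanding, including t itself.
    addNewTask set (t : ts) c
      | not (t `Set.member` set) =
          t : addNewTask (Set.insert t set) ts ((c - 1) + nGens)
      | c <= 1    = []                       -- last candidate was a duplicate
      | otherwise = addNewTask set ts (c - 1)
    addNewTask _ [] _ = []
```

With `orbitLoop [\x -> (x + 1) `mod` 4] [1]` the result is `[1,2,3,0]`. Each new element adds one candidate per generator and each consumed candidate removes one, so the stream closes exactly when the last outstanding candidate turns out to be a duplicate.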

20 Simple Test Case
A test case that gives similar (tunable) granularities and delivers a wide range of result values. The size of the result set is changed by setSize. All tests are seeded with 1.
  f1 s n = (fib ((n `mod` 20) + 10) + n) `mod` setSize
  f2 s n = (fib ((n `mod` 10) + 20) + n) `mod` setSize
  f3 s n = (fib ((n `mod` 19) + 10) + n - 1) `mod` setSize
  orbitOnList [] _ = []
  orbitOnList (g:gens) list = map g list : orbitOnList gens list
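A runnable version of these test generators might look like the following. Two points are assumptions rather than facts from the slides: `fib` is taken to be the naive recursive Fibonacci (the slides do not show it), and `setSize` is passed as the first argument because the slide leaves its binding implicit.

```haskell
-- Naive Fibonacci, used only to burn a tunable amount of work per task.
-- (Assumed definition; the slides do not show fib.)
fib :: Int -> Int
fib n
  | n < 2     = n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Generators in the style of the slide, with setSize passed explicitly
-- (an assumption: the slide does not say where setSize is bound).
f1, f2, f3 :: Int -> Int -> Int
f1 setSize n = (fib ((n `mod` 20) + 10) + n) `mod` setSize
f2 setSize n = (fib ((n `mod` 10) + 20) + n) `mod` setSize
f3 setSize n = (fib ((n `mod` 19) + 10) + n - 1) `mod` setSize

-- Apply every generator to the whole task list, one result list each.
orbitOnList :: [a -> a] -> [a] -> [[a]]
orbitOnList []         _    = []
orbitOnList (g : gens) list = map g list : orbitOnList gens list
```

For instance, `orbitOnList [f1 100, f2 100, f3 100] [1]` applies all three generators to the seed 1 with a result set of size 100.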

21 Measurement Framework Executed on an 8-core machine running at 2.66 GHz with 4 GB of RAM. Compiled with GHC 6.8.2 -O2. Runtimes are given as an average over 10 runs. Performance of the parallel version is measured against the parallel version running on a single core.

22 Farm Speedup Against Par 1 setSize

23 Farm - Trace (64000)

24 Evaluation of the Task Farm Good for regular and well-balanced tasks. Static round-robin distribution. May suffer from load imbalance. Does not distribute tasks in a request driven way.

25 A Workpool Approach Distributes tasks in a request-driven way: when a task completes, its processor is added to the queue of idle processors. Better for irregular and unbalanced tasks. Automatically deals with load imbalance. Still limited by the master/worker ratio.
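The difference between the two distribution schemes can be shown with pure list functions. These are illustrative sketches only, not the Eden library definitions, and they assume workers are numbered from 1.

```haskell
-- Static round-robin split into n sublists, as a task farm would use.
-- A pure sketch, not the Eden library definition.
unshuffle :: Int -> [a] -> [[a]]
unshuffle n xs = [ takeEach n (drop i xs) | i <- [0 .. n - 1] ]
  where
    takeEach _ []       = []
    takeEach k (y : ys) = y : takeEach k (drop (k - 1) ys)

-- Request-driven split: the i-th task goes to the worker named by the
-- i-th request. Workers are assumed to be numbered from 1.
distribute :: Int -> [a] -> [Int] -> [[a]]
distribute n tasks reqs =
  [ [ t | (r, t) <- zip reqs tasks, r == w ] | w <- [1 .. n] ]
```

`unshuffle 3 [1..7]` yields `[[1,4,7],[2,5],[3,6]]` regardless of how long each task takes, whereas `distribute 2 "abcd" [1,2,2,1]` yields `["ad","bc"]`: worker 2 asked for work twice in a row, so it received two consecutive tasks.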

26 Workpool

27 Workpool Speedup Against Par 1 setSize

28 Workpool - Trace (64000)

29 Conclusions Speedup appears almost linear, up to a factor of 8.29 on 8 cores for a set size of 64000. Workpool is more efficient and gives better speedups for larger set sizes. Workpool may incur a slight overhead, noticeable for small set sizes. Workpool is more balanced for larger set sizes.

30 Work in Progress Integrating the orbit skeleton into SymGrid-Par: use GAP to compute the computational algebra and Haskell to exploit parallelism. Application to larger problems, e.g. the braid orbit. Develop tool support to aid parallel development, e.g. using refactoring. First of a series of domain-specific parallel skeletons: duplicate elimination, completion algorithm, chain reduction, …

31 http://www.symbolic-computation.org

32 Future Work Complete SGP integration. Solve some real symbolic computing problems. Tool support for sequential -> parallel transformations? Implement more parallel skeletons: Parallel nub ?

33 Parallel Orbit Calculation
  orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
  orbitPar orbitfun gens init = dat
    where
      newTasks = merge (zipWith (#) workerProcs (toWorker dat))
      dat = (addNewTask empty (init' ++ newTasks) (length init'))
      init' = take noPe (cycle init)
      addNewTask set (t:ts) c
        | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts c'
        | c <= 1 = []
        | otherwise = addNewTask set ts (c - 1)
        where c' = (c-1) + nGens
      workerProcs = [ process (concat . (Data.List.map (orbitfun gens))) | n <- [1..noPe] ]
      toWorker tasks = unshuffle noPe tasks

34 Eden
Semi-explicit model of parallelism. Explicit process creation. Implicit thread creation:
  (unzip . streamf) :: Num a => [a] -> ([a],[a])
  uncurry zip ((process (unzip . streamf)) # [1..10])
    where streamf args = map worker args
          worker x = (factorial x, fibonacci x)
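To see what the Eden expression computes, it can help to read it sequentially. The sketch below drops `process` and `#` and applies the function directly, which should give the same values but none of the parallelism. The definitions of `factorial` and `fibonacci` are assumed, since the slide does not show them.

```haskell
-- Stand-ins for the slide's worker functions (definitions assumed).
factorial, fibonacci :: Integer -> Integer
factorial n = product [1 .. n]
fibonacci n = fst (iterate (\(a, b) -> (b, a + b)) (0, 1) !! fromIntegral n)

streamf :: [Integer] -> [(Integer, Integer)]
streamf = map worker
  where worker x = (factorial x, fibonacci x)

-- Sequential reading of the slide's expression: plain function
-- application takes the place of process creation and instantiation.
results :: [(Integer, Integer)]
results = uncurry zip ((unzip . streamf) [1 .. 10])
```

Here `take 3 results` is `[(1,1),(2,1),(6,2)]`. In the Eden version the two components of the unzipped pair travel back from the child process as separate streams, which is where the implicit threads come from.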

35 Questions? http://www.symbolic-computation.org/The_SCIEnce_Project

36 Workpool
  orbitPar :: ([ a->a ] -> [a] -> [ [a] ]) -> [a->a] -> [a] -> [a]
  orbitPar orbitfun gens init = dat
    where
      (newReqs, newTasks) = (unzip . merge) (zipWith (#) workerProcs (toWorker dat))
      dat = (addNewTask empty (init' ++ newTasks) (length init'))
      init' = take noPe (cycle init)
      addNewTask set (t:ts) c
        | not (t `member` set) = t : addNewTask (Data.Set.insert t set) ts c'
        | c <= 1 = []
        | otherwise = addNewTask set ts (c - 1)
        where c' = (c-1) + nGens
      workerProcs = [ process (zip [n,n..] . (concat . (Data.List.map (orbitfun gens)))) | n <- [1..noPe] ]
      toWorker tasks = distribute tasks requests
      requests = initialReqs ++ newReqs

