
1 F. Gava, HLPP 2005 Frédéric Gava A Modular Implementation of Parallel Data Structures in Bulk-Synchronous Parallel ML

2 F. Gava, HLPP 2005 Outline: Introduction; The BSML language; Implementation of parallel data structures in BSML: dictionaries, sets, load-balancing; Application; Conclusion and future works.

3 F. Gava, HLPP 2005 Introduction: Parallel computing for speed; Too complex for many non-computer scientists; Need for models/tools of parallelism. [Diagram: approaches to parallelism, from automatic parallelization to concurrent programming, with structured parallelism (algorithmic skeletons, BSP, data-structure skeletons) in between.]

4 F. Gava, HLPP 2005 Introduction (bis): Observations: data structures are as important as algorithms, and symbolic computation makes massive use of those data structures. Suggested solution, parallel implementations of data structures: interfaces as close as possible to the sequential ones; modular implementations for straightforward maintenance; load-balancing of the data.

5 F. Gava, HLPP 2005 BSML Outline: Introduction; BSML; Parallel Data Structures in BSML; Application; Conclusion and future works.

6 F. Gava, HLPP 2005 Bulk-Synchronous Parallelism + Functional Programming = BSML. Advantages of the BSP model: Portability; Scalability, deadlock freedom; Simple cost model, hence performance prediction. Advantages of functional programming: High-level features (higher-order functions, pattern-matching, concrete types, etc.); Safety of the environment; Program proofs (proofs of BSML programs using Coq).

7 F. Gava, HLPP 2005 The BSML Language: Confluent language: deterministic algorithms; Library for the « Objective Caml » language (called BSMLlib); Operations to access the BSP parameters; 5 primitives on a parallel data structure called parallel vector: mkpar (create a parallel vector), apply (parallel point-wise application), put (send values within a vector), proj (parallel projection), super (BSP divide-and-conquer).
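As a minimal sketch of how the first two primitives are used (assuming the classic BSMLlib signatures mkpar : (int -> 'a) -> 'a par and apply : ('a -> 'b) par -> 'a par -> 'b par; this is illustrative code, not taken from the talk):

  (* a parallel vector holding each processor identifier: <0, 1, ..., p-1> *)
  let this = mkpar (fun pid -> pid)

  (* apply an ordinary sequential function at every component of a parallel vector *)
  let parfun f v = apply (mkpar (fun _ -> f)) v

  (* example: the parallel vector <0, 2, 4, ...> *)
  let doubled = parfun (fun x -> 2 * x) this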

8 F. Gava, HLPP 2005 A BSML Program. [Figure: the structure of a BSML program, alternating a sequential part and a parallel part in which functions f0, f1, …, fp-1 and then g0, g1, …, gp-1 run on the p processors.]

9 Superthreads in BSML

10 F. Gava, HLPP 2005 Parallel Data Structures in BSML Outline: Introduction; BSML; Parallel Data Structures in BSML; Application; Conclusion and future works.

11 F. Gava, HLPP 2005 General Points: 5 modules: Set, Map, Stack, Queue, Hashtable; Interfaces: same as the O’Caml ones, with some specific parallel functions (skeletons) such as parallel reduction; Pure functional implementations (for functional data); Manual or automatic load-balancing.

12 Modules in O’Caml

Interface:

  module type Compare = sig
    type elt
    val compare : elt -> elt -> int
  end

Implementation:

  module CompareInt = struct
    type elt = int
    let tools = ...
    let compare = ...
  end

  module AbstractCompareInt = (CompareInt : Compare)

Functor:

  module Make (Ord : Compare) = struct
    type elt = Ord.elt
    type t = Empty | Node of t * elt * t * int
    let mem e s = ...
  end
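As a usage sketch, the functor above can then be instantiated to obtain a set module for integers (some_set below is a hypothetical value of the resulting type, used only for illustration):

  (* elements are integers, ordered by CompareInt.compare *)
  module IntSet = Make (CompareInt)

  (* membership test on a hypothetical set some_set : IntSet.t *)
  let present = IntSet.mem 3 some_set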

13 F. Gava, HLPP 2005 Parallel Dictionaries

A parallel map (dictionary) = a map on each processor:

  module Make (Ord : OrderedType) (Bal : BALANCE)
              (MakeLocMap : functor (Ord : OrderedType) -> Map.S with type key = Ord.t) =
  struct
    module Local_Map = MakeLocMap(Ord)
    type key = Ord.t
    type 'a t = ('a Local_Map.t par) * int * bool
    type 'a seq_t = 'a Local_Map.t
    (* operators as skeletons *)
  end

We need to re-implement all the operations (data skeletons).
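As an illustration of such a data skeleton, a membership test can stay purely local and then combine the p answers. A sketch inside the functor above, assuming the BSML primitives mkpar, apply and proj and the BSMLlib function bsp_p giving the number of processors (not the talk's actual code):

  (* ask every processor whether its local map holds the key, then project the p answers *)
  let mem k (vmap, _, _) =
    let answers = proj (apply (mkpar (fun _ lmap -> Local_Map.mem k lmap)) vmap) in
    let rec any i = i < bsp_p () && (answers i || any (i + 1)) in
    any 0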

14 F. Gava, HLPP 2005 Insert a Binding: add : key -> 'a -> 'a t -> 'a t. [Figure: the two execution paths of add, one if the structure is rebalanced, the other otherwise.]
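When no rebalancing is triggered, the insertion only touches the local map of one processor. A possible sketch, where the destination function dest and the size bookkeeping (which ignores an already-bound key) are simplifying assumptions, not the talk's code:

  (* insert the binding in the local map of the processor chosen by dest, leave the others unchanged *)
  let add k v (vmap, size, balanced) =
    let d = dest k in
    let vmap' =
      apply (mkpar (fun pid lmap -> if pid = d then Local_Map.add k v lmap else lmap)) vmap
    in
    (vmap', size + 1, balanced)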

15 F. Gava, HLPP 2005 Parallel Iterator

  let cardinal pmap = ParMap.fold (fun _ _ i -> i + 1) 0 pmap

Fold needs to respect the order of the keys; the parallel map would behave as a sequential map; too many communications…

  async_fold : (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b par

  let cardinal pmap = List.fold_left (+) 0 (total (ParMap.async_fold (fun _ _ i -> i + 1) pmap 0))
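One possible reading of async_fold is a purely local fold, one partial result per processor and no communication, the combination of the p partial results being left to the caller (a sketch, not the library's code):

  (* fold each local map independently; the partial results stay distributed in a parallel vector *)
  let async_fold f (vmap, _, _) seed =
    apply (mkpar (fun _ lmap -> Local_Map.fold f lmap seed)) vmap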

16 F. Gava, HLPP 2005 Parallel Sets: A sub-set on each processor; Insertion/iteration as for parallel maps, but with some binary skeletons; Load-balancing of pairs of parallel sets using the superposition.

17 F. Gava, HLPP 2005 Difference, 3 cases: Two normal parallel sets; One of the parallel sets has been rebalanced; The two parallel sets have been rebalanced, which implies a problem with duplicate elements.
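For the first case, where the two parallel sets follow the same distribution, the difference is a point-wise binary skeleton with no communication; the rebalanced cases require communication. A sketch assuming a local Local_Set module with the standard OCaml Set interface (not the talk's code):

  (* compute the set difference locally on every processor *)
  let diff_normal (v1, _, _) (v2, _, _) =
    apply (apply (mkpar (fun _ s1 s2 -> Local_Set.diff s1 s2)) v1) v2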

18 F. Gava, HLPP 2005 Difference (third case). [Figure: computing the difference of two rebalanced parallel sets S1 and S2.]

19 F. Gava, HLPP 2005 Load-Balancing (1): « Same sizes » for the local data structures; Better performance for parallel iterations; Load-balancing in 2 super-steps (M. Bamha and G. Hains) using a histogram.
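As a simplified illustration of the first super-step, every processor can make the size of its local structure visible to all the others in one communication phase; size_of and vdata below are placeholders, and the actual algorithm of M. Bamha and G. Hains builds a finer histogram of the data:

  (* one super-step: compute each local size, then proj exchanges the p sizes *)
  let sizes size_of vdata =
    proj (apply (mkpar (fun _ local -> size_of local)) vdata)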

20 F. Gava, HLPP 2005 Load-Balancing (2): generic code of the algorithm. [Slide shows the polymorphic type of the rebalance skeleton over parallel vectors, and a diagram of the algorithm: distributed data, histogram, selection of « n » messages, union of the received messages, rebalanced distributed data.]

21 F. Gava, HLPP 2005 Application Outline: Introduction; BSML; Parallel Data Structures in BSML; Application; Conclusion and future works.

22 F. Gava, HLPP 2005 Computation of the « nth » nearest-neighbor atoms in a molecule: Code from « Objective Caml for Scientists » (J. Harrop); Molecule as an infinitely-repeated graph of atoms; Computation of set differences (the neighbors); Replace « fold » with « async_fold »; Experiments with a silicate of 100,000 atoms on a cluster of 5/10 machines (Pentium IV, 2.8 GHz, Gigabit Ethernet cards).
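A purely illustrative, sequential-style sketch of the usual shell expansion; the module name ParSet, the bonds function and every detail below are assumptions (the interfaces mirror OCaml's Set), and the talk's version relies on the parallel set difference and on async_fold rather than fold:

  (* the next shell = atoms bonded to the current shell, minus the current and previous shells *)
  let rec shell n current previous =
    if n = 0 then current
    else
      let expanded =
        ParSet.fold
          (fun atom acc -> List.fold_left (fun s a -> ParSet.add a s) acc (bonds atom))
          current ParSet.empty
      in
      shell (n - 1) (ParSet.diff (ParSet.diff expanded current) previous) current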

23 Experiments (1)

24 Experiments (2)

25 Experiments (3)

26 F. Gava, HLPP 2005 Conclusion and Future Works Outline: Introduction; BSML; Parallel Data Structures in BSML; Application; Conclusion and future works.

27 F. Gava, HLPP 2005 Conclusion: BSML = BSP + ML; Implementation of some data structures; Modular, for simple development and maintenance; Pure functional implementation; Cost prediction with the BSP model; Generic load-balancing; Application.

28 F. Gava, HLPP 2005 Future Works: Proofs of the implementations (pure functional); Implementation of other data structures (trees, priority lists, etc.); Application to other scientific problems; Comparison with other parallel MLs (OCamlP3L, HirondML, OCaml-Flight, MSPML, etc.); Development of a modular and parallel graph library: edges as parallel maps, vertices as parallel sets.

29 F. Gava, HLPP 2005

