Jean-Sébastien Gay, LIP ENS Lyon, Université Claude Bernard Lyon 1, INRIA Rhône-Alpes, GRAAL Research Team. Joint work with the DIET team (Distributed Interactive Engineering Toolbox).


1 DIET Batch and Simbatch: a quick glance
Jean-Sébastien Gay
LIP ENS Lyon, Université Claude Bernard Lyon 1, INRIA Rhône-Alpes
GRAAL Research Team
Joint work with the DIET team
DIET: Distributed Interactive Engineering Toolbox

2 RPC and Grid Computing: GridRPC
[Diagram labels: Client; AGENT(s); servers S1, S2, S3, S4; Request; "S2 !"; Op(C, A, B); A, B, C; Answer (C)]

3 Outline
1. Introduction
2. Diet-Batch
3. Simbatch
4. Conclusion and perspectives

4 DIET Architecture
[Diagram labels: Client; MA (Master Agent); LA (Local Agent); Server front end; JXTA; FAST library; Application Modeling; System availabilities; LDAP; NWS]

5 Parallel and batch submissions - 1/2
[Diagram labels: MA, LA, SeD_parallel, SeD_batch, SeD_seq, Frontal, NFS, GLUE, LSF, PBS, Loadleveler, SGE, OAR]
- Parallel & sequential jobs → transparent for the user
- Submit a parallel job → system dependent
  - NFS: copy the code?
  - MPI: LAM, MPICH?
  - Batch system dependent
    - Numerous batch systems (homogenization?)
    - Batch schedulers behaviour (queues, scripts, etc.)
    - Information about the internal scheduling process
    - Monitoring & performance prediction

6 Parallel and batch submissions - 2/2
- 2 APIs
  - Client side
    - Request sequential or parallel (//) resolution, or let DIET choose the best
  - Server side
    - Script with generic mnemonics: DIET_NAME_FRONTALE, DIET_NB_NODES, DIET_BATCH_NODESFILE
    - A program that must end with a call to diet_submit_call()
- Experiments
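The slides do not show how the generic mnemonics are turned into concrete values. The following is a minimal sketch in C of one possible substitution step, under stated assumptions: the type `mnemonic_t`, the function `expand_mnemonics`, and the values used in the usage note are hypothetical and are not part of the DIET API. Only the mnemonic names come from the slide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pair: a generic mnemonic and the value that replaces it.
 * Only the mnemonic names (DIET_NB_NODES, ...) come from the slides. */
typedef struct {
    const char *name;
    const char *value;
} mnemonic_t;

/* Copy `script` into `out` (capacity `cap` bytes), replacing each occurrence
 * of a known mnemonic with its value.
 * Returns 0 on success, -1 if `out` is too small. */
int expand_mnemonics(const char *script, const mnemonic_t *tab, size_t n,
                     char *out, size_t cap)
{
    size_t len = 0;

    if (cap == 0)
        return -1;
    while (*script != '\0') {
        const char *piece = script; /* default: copy one literal character */
        size_t piece_len = 1;
        size_t skip = 1;

        for (size_t i = 0; i < n; i++) {
            size_t name_len = strlen(tab[i].name);
            if (strncmp(script, tab[i].name, name_len) == 0) {
                piece = tab[i].value;
                piece_len = strlen(piece);
                skip = name_len;
                break;
            }
        }
        if (len + piece_len >= cap) /* keep room for the final '\0' */
            return -1;
        memcpy(out + len, piece, piece_len);
        len += piece_len;
        script += skip;
    }
    out[len] = '\0';
    return 0;
}
```

For instance, with the table `{"DIET_NB_NODES", "4"}`, the script line `mpirun -np DIET_NB_NODES ./a.out` would expand to `mpirun -np 4 ./a.out`. A table-driven design keeps the script itself independent of the batch system, which is the point of the generic mnemonics.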

7 Performance prediction with batch system
- During the submission stage
  - Need to know when the task will begin/end
  - Need to decide how many processors will be used
  - Need performance prediction!
- Three means
  - Use a probabilistic tool
  - Ask the batch system (only available for MAUI and OAR 2.0)
  - Use a simulator

8 Batch scheduler overview
- Portable Batch System (PBS)
  - First Come First Served (FCFS)
- OAR (v. 1.6)
  - Conservative BackFilling (CBF)
- Torque + Maui
  - Only Torque: FCFS
  - Maui
    - 3 scheduling policies: BESTFIT, FIRSTFIT (CBF), GREEDY
- Sun Grid Engine (SGE)
  - FCFS
- Loadleveler
  - 3 scheduling policies: FCFS, CBF, GANG
  - Possibility to plug external schedulers
    - EASY
    - Maui (should soon become the standard scheduler)

9 Grid simulator overview
- Data replication:
  - ChicSim:
    - I. Foster
    - PARallel Simulation Environment for Complex Systems
  - OptorSim:
    - W. H. Bell, D. G. Cameron, R. Carvajal-Schiaffino
    - Java
- Grid-economy
  - GridSim:
    - R. Buyya (Nimrod/G)
    - Java
    - Quite similar to Simgrid
- Non-specialized toolkit
  - Simgrid
    - H. Casanova, A. Legrand and M. Quinson
    - C

10 … and their drawbacks
- Minimal support for batch schedulers
- Sometimes lack of functionalities to create them
- Often difficult to reuse
  - Example: OptorSim
- No parallel tasks available
  - Backfilling impossible
  - Lack of realism

11 Simbatch in a nutshell
- Goals
  - Cluster simulation for enhancing realism
  - Prediction tool for DIET
- API for clients
  - Description of the platform in XML files
  - Use of the API in the deployment.xml file
    - Example 1: Creating a batch process on the host « Frontal »
    - Example 2: Creating a resource
  - Each batch must be described in simbatch.xml
  - A specific load can be simulated for each batch
- API for developers
  - Algorithms are plug-ins
  - Reusable functions
    - Find the first matching slot in a Gantt chart
        slot_t * find_first_slot(cluster_t c, int nb_nodes, double start_time, double duration);
    - Empty queues and reschedule
        void generic_reschedule(cluster_t cluster, void (*schedule)(cluster_t cluster, m_task_t task));
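The slide gives only the prototype of find_first_slot(); the types `cluster_t` and `slot_t` and the function body are not shown. To illustrate the idea of searching a Gantt chart for the first matching slot, here is a minimal sketch in C under simplifying assumptions: the Gantt chart is modeled as a flat array of reservations, nodes are counted rather than identified, and `reservation_t` and `first_slot_start` are hypothetical names, not Simbatch's implementation.

```c
#include <stddef.h>

/* Hypothetical, simplified Gantt chart entry: `nb_nodes` nodes are busy
 * during the half-open interval [start, end). */
typedef struct {
    double start;
    double end;
    int nb_nodes;
} reservation_t;

/* Number of nodes busy at instant t. */
static int nodes_in_use(const reservation_t *r, size_t n, double t)
{
    int used = 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].start <= t && t < r[i].end)
            used += r[i].nb_nodes;
    return used;
}

/* Does a job of nb_nodes nodes fit during [t, t + duration)?
 * Usage only increases when a reservation starts, so it is enough to
 * check t itself and every reservation start inside the window. */
static int fits(const reservation_t *r, size_t n, int total_nodes,
                int nb_nodes, double t, double duration)
{
    if (nodes_in_use(r, n, t) + nb_nodes > total_nodes)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].start > t && r[i].start < t + duration &&
            nodes_in_use(r, n, r[i].start) + nb_nodes > total_nodes)
            return 0;
    return 1;
}

/* Earliest start time >= start_time at which the job fits, or -1.0 if it
 * asks for more nodes than the cluster has. Candidate start times are
 * start_time and the end of each existing reservation. */
double first_slot_start(const reservation_t *r, size_t n, int total_nodes,
                        int nb_nodes, double start_time, double duration)
{
    double best = -1.0;

    if (nb_nodes > total_nodes)
        return -1.0;
    if (fits(r, n, total_nodes, nb_nodes, start_time, duration))
        return start_time;
    for (size_t i = 0; i < n; i++) {
        double t = r[i].end;
        if (t > start_time && (best < 0.0 || t < best) &&
            fits(r, n, total_nodes, nb_nodes, t, duration))
            best = t;
    }
    return best;
}
```

Because the search never moves an existing reservation, a job placed this way cannot delay jobs already in the chart, which is the property that conservative backfilling relies on. The sketch is quadratic in the number of reservations; a real scheduler would likely keep the chart sorted.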

12 Experiment description
- 2 types of experiments
  - Validation by simulation: parameter variation
    - Topology, scheduling algorithm…
  - Comparison between simulated platform
- Task generation
  - Inter-arrival time: Poisson law, µ = 300 s
  - Resources number: U(1,5)
  - Run time: U(600,1800)
  - Wall time: run time × U(1.1, 1.3)
- Experiment platform
  - 5-node cluster
  - Star topology
  - OAR v. 1.6

13 Validation

14 Simulation precision
- Number of tasks: 100
- Makespan: 23 h
- Error rate on the flow metrics around 1%
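The slide does not define how the error rate on the flow metrics is computed. As a hedged illustration, the sketch below assumes the usual scheduling definition (the flow of a task is its completion time minus its submission time) and compares the sum of flows measured on the real platform with the sum obtained in simulation. `task_record_t`, `sum_flow` and `relative_error` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical per-task record: submission and completion dates in seconds. */
typedef struct {
    double submit;
    double completion;
} task_record_t;

/* Sum of flows, assuming flow = completion time - submission time. */
double sum_flow(const task_record_t *t, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += t[i].completion - t[i].submit;
    return total;
}

/* Relative error of the simulated value with respect to the measured one.
 * `measured` is assumed to be non-zero. */
double relative_error(double measured, double simulated)
{
    double diff = simulated - measured;
    if (diff < 0.0)
        diff = -diff;
    return diff / measured;
}
```

With this definition, a measured sum of flows of 300 s and a simulated sum of 303 s give a relative error of 0.01, that is, 1%.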

15 Conclusion and perspectives
- DIET-Batch
  - DIET is now able to handle batch schedulers
  - 3 SeD types: sequential, batch, parallel
  - Good performance improvements
- Simbatch
  - Standalone simulations show good results
  - Configuration file available to simulate Lyon's site
  - Excellent tool to replay load
- Next steps
  - Integrate Simbatch in DIET-Batch

16 Questions?
http://graal.ens-lyon.fr/DIET/
http://graal.ens-lyon.fr/simbatch/

