I2G CrossBroker Enol Fernández UAB

1 I2G CrossBroker Enol Fernández UAB
Dublin MPI Course, September 2007

2 Introduction
- CrossBroker does automatic scheduling in Grid environments:
  - Resource discovery
  - Resource selection
  - Job execution
- Jobs not treated by gLite:
  - Parallel jobs (MPI): run on more than one resource, in a coordinated fashion.
  - Interactive jobs: the user interacts with the application during its execution.
Dublin MPI Course, 10-11 September 2007

3 Architecture
[Diagram components: CrossBroker, Information Index, Migrating Desktop, Scheduling Agent, Resource Searcher, Replica Manager, Application Launcher, Condor-G, DAGMan, and two EGEE/Globus sites, each with a CE and WN]
Speaker note: comments on each module are on the next slide.

4 Architecture
- Scheduling Agent
  - Receives each job and keeps it in a persistent queue.
  - Contacts the Resource Searcher and gets a list of available resources.
  - Selects resources and passes them to the Application Launcher.
- Resource Searcher
  - Given a job description (JobAd), performs the matchmaking between job needs and available resources.
  - Uses the Condor ClassAd library, originally designed to match a single job with a single resource.
  - A set-matching extension has been developed to support matching a single job to a group of resources.
- Application Launcher
  - Responsible for providing a reliable submission service for parallel applications on the Grid.
  - Responsible for file staging at the remote site (executable and input/output files).
  - Uses the services of Condor-G.
Speaker note: there is not much of interest here; the previous slide is probably enough, but this one gives some explanation of each module.
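The hand-off between the three modules can be sketched as follows. This is a minimal illustration, not CrossBroker code: the function names, the dictionary fields (`free_cpus`, `rank`) and the in-memory queue are all assumptions made for the example (the real queue is persistent and the real launcher goes through Condor-G).

```python
from collections import deque

def resource_searcher(job, resources):
    """Stand-in for matchmaking: keep the resources that satisfy the job's needs."""
    return [r for r in resources if r["free_cpus"] >= job["cpus"]]

def application_launcher(job, resource):
    """Stand-in for submission: only reports which resource was chosen."""
    return f"{job['name']} -> {resource['name']}"

def scheduling_agent(jobs, resources):
    """Queue the jobs, ask the searcher for candidates, select one, launch."""
    queue = deque(jobs)  # hypothetical in-memory queue; the real one is persistent
    launched, pending = [], []
    while queue:
        job = queue.popleft()
        candidates = resource_searcher(job, resources)
        if not candidates:
            pending.append(job["name"])  # nothing matched this job
            continue
        best = max(candidates, key=lambda r: r["rank"])
        launched.append(application_launcher(job, best))
    return launched, pending

# Hypothetical data; free CPUs are not decremented in this simplified sketch.
resources = [
    {"name": "ce-a", "free_cpus": 10, "rank": 2000},
    {"name": "ce-b", "free_cpus": 3, "rank": 1000},
]
jobs = [
    {"name": "job1", "cpus": 5},
    {"name": "job2", "cpus": 2},
    {"name": "job3", "cpus": 20},
]
launched, pending = scheduling_agent(jobs, resources)
```

Here `job3` stays pending because no resource in the toy list offers 20 CPUs.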

5 Parallel Job Support
- Support for parallel jobs:
  - Open MPI
  - PACX-MPI
  - MPICH-P4
  - MPICH-G2
  - Plain (just the machines)
- Takes into account each site's capabilities.
- Ability to define starter scripts or processes to start the parallel job.
- mpi-start is configured automatically and used by default.

6 Parallel Job Support
Changes in JDL:
- JOBTYPE
  - Normal: sequential jobs, just one CPU
  - Parallel: more than one CPU
- SUBJOBTYPE
  - openmpi
  - pacx-mpi
  - mpich
  - mpich-g2
  - plain
- JOBSTARTER (if not defined, mpi-start is used)
- JOBSTARTERARGUMENTS
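How these attributes combine can be sketched as a small validation step. This is an illustration only: the function name is hypothetical, and treating a missing JobType as "Normal" and a missing SubJobType as an error are assumptions of the sketch, not behavior stated on the slide. The only default taken from the slide is that mpi-start is used when no job starter is defined.

```python
VALID_JOB_TYPES = {"normal", "parallel"}
VALID_SUBJOB_TYPES = {"openmpi", "pacx-mpi", "mpich", "mpich-g2", "plain"}

def resolve_parallel_attributes(jdl):
    """Return a copy of the job description with the parallel attributes checked."""
    job_type = jdl.get("JobType", "Normal")  # assumption: default to a sequential job
    if job_type.lower() not in VALID_JOB_TYPES:
        raise ValueError(f"unknown JobType: {job_type}")
    resolved = dict(jdl)
    resolved["JobType"] = job_type
    if job_type.lower() == "parallel":
        sub_type = jdl.get("SubJobType")
        if sub_type is None or sub_type.lower() not in VALID_SUBJOB_TYPES:
            raise ValueError(f"unknown SubJobType: {sub_type}")
        # From the slide: if no starter is defined, mpi-start is used.
        resolved.setdefault("JobStarter", "mpi-start")
    return resolved

pacx_job = resolve_parallel_attributes(
    {"JobType": "Parallel", "SubJobType": "pacx-mpi", "NodeNumber": 5}
)
custom_job = resolve_parallel_attributes(
    {"JobType": "Parallel", "SubJobType": "plain", "JobStarter": "my-starter.sh"}
)
```

In this sketch `pacx_job` picks up the default starter, while `custom_job` keeps the (hypothetical) `my-starter.sh`.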

7 Parallel Job Support
    Type = "Job";
    VirtualOrganisation = "imain";
    JobType = "Parallel";
    SubJobType = "pacx-mpi";
    NodeNumber = 5;
    Executable = "test-app";
    Arguments = "-v";
    InputSandbox = {"test-app", "inputfile"};
    OutputSandbox = {"std.out", "std.err"};
    StdErr = "std.err";
    StdOutput = "std.out";
    Rank = other.GlueHostBenchmarkSI00;
    Requirements = other.GlueCEStateStatus == "Production";
Speaker note: JDL example of a PACX-MPI job, which will be used later.

8 MPI Across Sites
- CrossBroker searches for and selects sets of resources for the jobs.
- There is no guarantee that all tasks of the same job will start at the same time.
- 1st choice: select only sites with free resources.
  - The job will run immediately.
  - Unfortunately, free resources are not always available.
- 2nd choice: allocate a resource temporarily and wait until all other tasks show up.
  - Timeshare the resource with a backfilling policy to avoid resource idleness.

9 MPI Across Sites
Available CEs (diagram legend: MPI enabled CE / non-MPI enabled CE):
- CE1 = zeus.cyf-kr.edu.pl: FreeCPUs = 2, AverageSI = 2000
- CE2 = aocegrid.uab.es: FreeCPUs = 10, Disk = 100, AverageSI = 4000
- CE3 = bee001.ific.uv.es: FreeCPUs = 3, AverageSI = 1000
- CE4 = xgrid.icm.edu.pl: FreeCPUs = 6, Disk = 100, AverageSI = 1000
- CE5 = lngrid02.lip.pt: FreeCPUs = 2, Disk = 100, AverageSI = 1000
Resource Searcher (RS) output:
    [Groups with 1 CEs]
    [Rank=2000] aocegrid.uab.es:2119/jobmanager-pbs-workq freeCPUs = 10
    [Groups with 2 CEs]
    [Rank=1500] zeus.cyf-kr.edu.pl:2119/jobmanager-pbs-workq freeCPUs = 2
                bee001.ific.uv.es:2119/jobmanager-pbs-workq freeCPUs = 3
    [Rank=1000] lngrid02.lip.pt:2129/jobmanager-pbs-workq freeCPUs = 2
                bee001.ific.uv.es:2119/jobmanager-pbs-workq freeCPUs = 3
Speaker notes: Example of resource selection for PACX-MPI. There are 5 sites, 1 of which is not MPI enabled. The user asks for 5 CPUs, so the possibilities shown in the animation are:
- 1st: 1 site for the whole application (in pink)
- 2nd: 2 sites whose free CPUs sum to 5
- 3rd: 2 sites that also sum to 5 CPUs
The broker will always try to execute using the lowest number of sites, to avoid the high-latency links between them.
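The grouping in this example can be reproduced with a short sketch. It is not the Resource Searcher's actual algorithm, and several points are assumptions: that xgrid.icm.edu.pl is the non-MPI CE (the slide does not name it), that a group's rank is the mean of its members' ranks (consistent with the 1500 and 1000 shown), and that groups containing a member they do not need are discarded. The rank used for aocegrid.uab.es is the 2000 printed in the RS output, although the slide lists AverageSI = 4000 for that CE.

```python
from itertools import combinations

def match_groups(ces, cpus_needed, max_sites=2):
    """List groups of MPI-enabled CEs that together offer enough free CPUs."""
    usable = [ce for ce in ces if ce["mpi"]]
    groups = []
    for size in range(1, max_sites + 1):
        for combo in combinations(usable, size):
            total = sum(ce["free_cpus"] for ce in combo)
            if total < cpus_needed:
                continue
            # Assumption: skip groups that still have enough CPUs without one member.
            if any(total - ce["free_cpus"] >= cpus_needed for ce in combo):
                continue
            rank = sum(ce["rank"] for ce in combo) / size  # assumed: mean rank
            groups.append((size, rank, [ce["name"] for ce in combo]))
    # Fewest sites first, then highest rank.
    groups.sort(key=lambda g: (g[0], -g[1]))
    return groups

ces = [
    {"name": "zeus.cyf-kr.edu.pl", "free_cpus": 2, "rank": 2000, "mpi": True},
    {"name": "aocegrid.uab.es", "free_cpus": 10, "rank": 2000, "mpi": True},
    {"name": "bee001.ific.uv.es", "free_cpus": 3, "rank": 1000, "mpi": True},
    {"name": "xgrid.icm.edu.pl", "free_cpus": 6, "rank": 1000, "mpi": False},  # assumed
    {"name": "lngrid02.lip.pt", "free_cpus": 2, "rank": 1000, "mpi": True},
]
groups = match_groups(ces, cpus_needed=5)
```

Sorting by group size before rank mirrors the broker's stated preference for the lowest number of sites.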

10 Time Sharing
[Diagram: Grid Resource, CrossBroker, LRMS, MPI job, Scheduling Agent, Condor-G]
Speaker notes: These slides are a schematic of what happens with the glidein. It is of little interest to users, since it is done transparently and without any notice to them. The sequence is easy to follow:
- The MPI job request arrives at the broker, which selects one resource.
- After starting the Application Launcher (in PACX-MPI this includes the startup server), the CrossBroker (CB) uses Condor-G to submit a job to the resource.
- This job is in fact an agent (glidein) that divides the WN into two virtual machines (VMs) that can be controlled from the CB.
- One of the VMs is used for the MPI job, which waits suspended until all the other parts of the job are ready.
- In the meantime, the CB can do backfilling and use the other VM to execute other jobs.
- When all the MPI tasks are ready, the MPI job is awakened and continues its execution.

11 Time Sharing
[Diagram: Grid Resource, CrossBroker, LRMS, MPI job, Scheduling Agent, Application Launcher, Condor-G]

12 Time Sharing
[Diagram: as slide 11, plus the Agent with VM1 and VM2]

13 Time Sharing
[Diagram: same components as slide 12]

14 Time Sharing
[Diagram: as slide 12, with an MPI task labelled "Waiting for rest of tasks"]

15 Time Sharing
[Diagram: as slide 14, plus a second job (JOB)]

16 Time Sharing
[Diagram: JOB and MPI task, labelled "Backfilling while the MPI waits"]

17 Time Sharing
[Diagram: MPI task and JOB, labelled "All tasks ready!"]
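The sequence in this animation can be modelled with a toy state machine. This is only a sketch of the idea, not the glidein implementation: the class, its attributes and the function name are hypothetical, and nothing here talks to Condor-G or an LRMS.

```python
class GlideinAgent:
    """Toy model of an agent that splits one worker node into two VMs."""

    def __init__(self):
        self.vm1_task = None       # VM1: slot for this node's MPI task
        self.vm1_state = "empty"   # "empty", "suspended" or "running"
        self.vm2_jobs = []         # VM2: jobs backfilled on this node

    def hold_mpi_task(self, task):
        """Place the MPI task on VM1, suspended until the whole job is ready."""
        self.vm1_task = task
        self.vm1_state = "suspended"

    def backfill(self, job):
        """Use VM2 for another job while the MPI task waits."""
        self.vm2_jobs.append(job)

    def resume_mpi_task(self):
        self.vm1_state = "running"

def start_if_all_ready(agents):
    """Wake every MPI task once each agent holds its part of the job."""
    if all(agent.vm1_state == "suspended" for agent in agents):
        for agent in agents:
            agent.resume_mpi_task()
        return True
    return False

agents = [GlideinAgent(), GlideinAgent()]
agents[0].hold_mpi_task("mpi-task-0")
agents[0].backfill("batch-job-1")             # backfilling while the MPI task waits
started_early = start_if_all_ready(agents)    # second task is not in place yet
agents[1].hold_mpi_task("mpi-task-1")
started = start_if_all_ready(agents)          # all tasks ready
```

The first call to `start_if_all_ready` leaves the MPI task suspended; only the second call, made once both agents hold a task, resumes the job.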

18 Interactive Job Support
- Scheduling priority
  - Interactive jobs are sent to sites with available machines.
  - If no machines are available, time sharing is used.
- Support for interactivity in all kinds of jobs
  - Sequential jobs and all the MPI flavors
- CrossBroker injects interactive agents that enable communication between user and job
  - Transparent to the user
  - Full integration with glogin & gvid
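The scheduling priority above amounts to a two-step fallback, sketched below. The field names (`free_cpus`, `shared_slots`), the tie-breaking rule and the return values are assumptions of this illustration, not CrossBroker's actual policy or data model.

```python
def place_interactive_job(sites, cpus_needed):
    """Prefer a site with free machines; otherwise fall back to time sharing."""
    free = [s for s in sites if s["free_cpus"] >= cpus_needed]
    if free:
        # Assumption: among free sites, take the one with the most free CPUs.
        best = max(free, key=lambda s: s["free_cpus"])
        return ("direct", best["name"])
    # Hypothetical field: slots on machines already running a time-sharing agent.
    shared = [s for s in sites if s["shared_slots"] >= cpus_needed]
    if shared:
        return ("time-sharing", shared[0]["name"])
    return ("queued", None)

sites = [
    {"name": "site-a", "free_cpus": 0, "shared_slots": 4},
    {"name": "site-b", "free_cpus": 2, "shared_slots": 0},
]
small_job = place_interactive_job(sites, cpus_needed=2)
larger_job = place_interactive_job(sites, cpus_needed=4)
```

With this toy data the 2-CPU job goes directly to `site-b`, while the 4-CPU job falls back to time sharing on `site-a`.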

19 Interactive Job Support
Changes in JDL:
- INTERACTIVE: true/false. Indicates that the job is interactive and that the broker should treat it with higher priority.
- INTERACTIVEAGENT
- INTERACTIVEAGENTARGUMENTS
  These two attributes specify the command (and its arguments) used to communicate with the user.

20 Interactive Job Support
    Type = "Job";
    VirtualOrganisation = "imain";
    JobType = "Parallel";
    SubJobType = "openmpi";
    NodeNumber = 11;
    Interactive = TRUE;
    InteractiveAgent = "glogin";
    InteractiveAgentArguments = "-r -p :23433";
    Executable = "test-app";
    InputSandbox = {"test-app", "inputfile"};
    OutputSandbox = {"std.out", "std.err"};
    StdErr = "std.err";
    StdOutput = "std.out";
    Rank = other.GlueHostBenchmarkSI00;
    Requirements = other.GlueCEStateStatus == "Production";
Speaker note: JDL example of the fusion application.

21 Time Sharing
[Diagram: Grid Resource, CrossBroker, LRMS, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G, interactive job (INT. JOB), batch job (BATCH)]
Speaker note: This is the same mechanism as for MPI, but with interactive and batch jobs; only the final part of the scenario is shown. The batch job can be suspended or have its priority lowered, and this is controlled by the broker.

22 Time Sharing
[Diagram: Grid Resource, CrossBroker, LRMS, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G, interactive job (INT. JOB), batch job (BATCH)]
- Startup-time reduction
- Only one layer involved

23 Other features
- Intelligent job retrial
  - Temporarily disables submission to failing sites
- Fast notification of job status
  - Better interaction with the application
- gLite interoperability
  - Accepts jobs from gLite's UI
  - Able to submit jobs to gLite resources (LCG-CE and gLite CE)
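One way to picture "disables submission to failing sites temporarily" is a time-limited blacklist, sketched below. The class, the 600-second penalty and the explicit `now` argument are assumptions chosen to keep the example deterministic; the slide does not say how CrossBroker implements retrial.

```python
class SiteBlacklist:
    """Remember failing sites and skip them until a penalty period expires."""

    def __init__(self, penalty_seconds=600.0):
        self.penalty_seconds = penalty_seconds  # hypothetical value
        self._blocked_until = {}

    def report_failure(self, site, now):
        """Block the site for penalty_seconds starting at time `now`."""
        self._blocked_until[site] = now + self.penalty_seconds

    def is_usable(self, site, now):
        return now >= self._blocked_until.get(site, 0.0)

def pick_site(candidates, blacklist, now):
    """Return the first candidate that is not currently blacklisted, if any."""
    for site in candidates:
        if blacklist.is_usable(site, now):
            return site
    return None

blacklist = SiteBlacklist(penalty_seconds=600.0)
blacklist.report_failure("ce-a", now=0.0)
retry_site = pick_site(["ce-a", "ce-b"], blacklist, now=10.0)    # ce-a is blocked
later_site = pick_site(["ce-a", "ce-b"], blacklist, now=600.0)   # penalty expired
```

A retried job therefore goes to `ce-b` while `ce-a` is blocked, and `ce-a` becomes eligible again once the penalty has expired.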

