
EGEE is a project funded by the European Union under contract IST-2003-508833 Job Submission José R. Valverde EGEE NA4 Biomed Applications CNB/CSIC EMBnet/CNB.


1 Job Submission
José R. Valverde
EGEE NA4 Biomed Applications, CNB/CSIC, EMBnet/CNB
Grid, Web Services and Workflows. Madrid, February 2007
EGEE is a project funded by the European Union under contract IST-2003-508833
www.eu-egee.org

2 The Grid
Is like a classical massively parallel supercomputer
  - Use a front-end to manage jobs
Differences
  - Many front-ends instead of just one
  - Physically: CPUs are grouped in closely coupled clusters (NoW); clusters are connected through WAN lines
  - Logically: CPUs are partitioned into logical sets (V.O.s)
  - Politically: shared and distributed ownership

3 Working with the Grid
Choose any front-end
  - Called User Interface
  - Log in with user and password
  - U.I.s cannot be trusted
Identify yourself to the Grid
  - Log in to the Grid (activate a proxy)
  - Authenticate with certificate and passphrase
Manage work
  - Submit jobs (executables and data)
  - Monitor jobs
  - Get results

4 Don't ask for miracles
A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity. A program should follow the 'Law of Least Astonishment'. What is this law? It is simply that the program should always respond to the user in the way that astonishes him least. A program, no matter how complex, should act as a single unit. The program should be directed by the logic within rather than by outward appearances. If the program fails in these requirements, it will be in a state of disorder and confusion. The only way to correct this is to rewrite the program.
-- Geoffrey James, "The Tao of Programming"

5 The User Interface
The Grid is like Hydra: one body and many heads
  - You can feed it through any of its many mouths
You feed jobs to the Grid using a User Interface host
  - Anybody can install one
  - You can choose any one to connect
Hence it cannot be trusted for user identification
  - You must authenticate twice: once to the chosen User Interface, once to the Grid
  - And then you can feed the Grid anything you want

6 Getting started
All work with the Grid must be done through a UI
We'll use the one at CNB
  - villon.cnb.uam.es
Connect using SSH
  - Use your own username and password
  - ssh -l username villon.cnb.csic.es (typical usage)
  - ssh username@villon.cnb.csic.es (shortcut)
  - ssh -X -Y username@villon.cnb.csic.es (if you want to use remote X-windows software)

7 Activating the Grid
Direct authentication
  - Creating a proxy certificate (grid-proxy-init)
  - Printing information on a proxy certificate (grid-proxy-info)
  - Destroying a proxy certificate (grid-proxy-destroy)
Indirect authentication
  - Creating a long-term proxy (myproxy-init)
  - Printing information on a long-term proxy (myproxy-info)
  - Destroying a long-term proxy (myproxy-destroy)

8 Authentication
An administrator must grant you access
  - You provide the administrator with your identity
  - S/he grants you access
But... how can s/he be sure your identity is real?
You need a certificate signed by a trusted notary to prove it
  - First you make a request
  - You send it to a Certification Authority (CA)
  - The CA checks personally with you that the certificate was actually generated by you and belongs to you
  - Then stamps the request with their signature
  - Now you have a certificate proving who you are

9 Who are you?
The administrator simply grants access to the Grid to anybody using your certificate
From there on you will be liable for anything done under this identity
You may just store it openly
  - Very easy to use (for you and for anybody!)
But you should protect it dearly
  - Using a passphrase: a looooong one, difficult to guess
  - Makes it a little cumbersome for you to use (you have to type it in)
  - And almost impossible for anybody else

10 voms-proxy-init
To activate the Grid, issue the command:
    # voms-proxy-init -voms biomed
    Your identity: /C=ES/O=DATAGRID-ES/O=CNB/CN=Joe Random User
    Enter GRID pass phrase:
    Creating proxy............................................. Done
    Your proxy is valid until Thu Feb 8 00:53:18 2007

11 voms-proxy-info
To print information about a proxy certificate, issue the command:
    # voms-proxy-info
    subject  : /C=ES/O=DATAGRID-ES/O=CNB/CN=Joe Random User/CN=proxy
    issuer   : /C=ES/O=DATAGRID-ES/O=CNB/CN=Joe Random User
    identity : /C=ES/O=DATAGRID-ES/O=CNB/CN=Joe Random User
    type     : proxy
    strength : 512 bits
    path     : /tmp/x509up_u14
    timeleft : 11:58:04
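Before submitting a long job it can be useful to check that the proxy will outlive it. The following is a minimal sketch, assuming the `timeleft : HH:MM:SS` format shown on this slide; the function names `timeleft_to_seconds` and `proxy_covers_hours` are hypothetical helpers, not part of the VOMS tools.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the VOMS client tools).

# Convert an HH:MM:SS "timeleft" value to a number of seconds.
timeleft_to_seconds() {
    # Split the argument on ':' for this read only.
    IFS=: read -r h m s <<EOF
$1
EOF
    # Strip one leading zero so that "08" is not taken as invalid octal.
    h=${h#0}; m=${m#0}; s=${s#0}
    echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

# Succeed if the timeleft ($1) covers at least the given number of hours ($2).
proxy_covers_hours() {
    [ "$(timeleft_to_seconds "$1")" -ge $(( $2 * 3600 )) ]
}

# Possible use on a UI (assumes the output format shown on the slide):
#   left=$(voms-proxy-info | sed -n 's/^timeleft *: *//p')
#   proxy_covers_hours "$left" 8 || echo "proxy too short, renew it"
```

With the value on the slide, `timeleft_to_seconds 11:58:04` prints 43084.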

12 voms-proxy-destroy
To destroy a proxy certificate, issue the command:
    # voms-proxy-destroy

13 myproxy-init (1/2)
To create and store a shadow or virtual certificate, useful when security is an issue, you use myproxy:
    # myproxy-init
    Your identity: /C=ES/O=DATAGRID-ES/O=CNB/CN=Joe Random User
    Enter GRID pass phrase for this identity:
    Creating proxy................................................. Done
    Proxy Verify OK
    Your proxy is valid until: Wed Feb 14 12:25:33 2007
    Enter MyProxy pass phrase:
    Verifying password - Enter MyProxy pass phrase:
    A proxy valid for 168 hours (7.0 days) for user jru now exists on myproxy.cern.ch.

14 Specifying lifetime of proxy
You can activate the Grid for a specified amount of time by stating so when using the activation commands:
    # voms-proxy-init -hours
    # myproxy-init -c
There are many more options that you can use. See the manual pages online for details:
    # man voms-proxy-init
    # man myproxy-init

15 myproxy-get-delegation
Myproxy does not activate the Grid itself. It generates a shadow or virtual certificate to be used instead of the actual one (to increase security).
To use this shadow or virtual certificate you use
    >> myproxy-get-delegation
    Enter MyProxy pass phrase:
    A proxy has been received for user jru in /tmp/x509up_u512
Note that now you do not need to use your real certificate passphrase!

16 myproxy-info
To retrieve information about a myproxy certificate, issue the command:
    >> myproxy-info
    username: jru
    owner: /C=ES/O=DATAGRID-ES/O=CNB/CN=Joe Random User
    timeleft: 167:59:06 (7.0 days)

17 myproxy-destroy
To destroy a myproxy certificate, issue the command:
    # myproxy-destroy

18 We're in!
Now that we know how to activate the Grid (using our own certificate directly or a virtual certificate indirectly) we can finally start working.

19 Job Management Commands
    edg-job-list-match  ->  glite-job-list-match
    edg-job-submit      ->  glite-job-submit
    edg-job-status      ->  glite-job-status
    edg-job-cancel      ->  glite-job-cancel
    edg-job-get-output  ->  glite-job-get-output

20 Job Submission
    >> edg-job-submit [-r ] [-c ] [--vo ] [-o ]
  - -r: the job is submitted directly to the computing element given as argument
  - -c: the given configuration file is used by the UI instead of the standard configuration file
  - --vo: the Virtual Organization (if the user is not happy with the default for the UI or the one in the JDL)
  - -o: the generated edg_jobId is written to the given file. Highly recommended!

21 Other Job commands (1/2)
edg-job-list-match
  - Lists resources matching a job description
  - Without submitting the job
edg-job-cancel -i (or edg_jobId)
  - Cancels a given job

22 Other Job commands (2/2)
edg-job-status -i (or edg_jobId)
  - Displays the status of the job
  - -i: the status information about the edg_jobIds contained in the given file is displayed
edg-job-get-output --dir -i
  - Returns the job output (the OutputSandbox files) to the user in the given directory
  - Creates a subdirectory named user_random-string

23 What is that sexy "job"?
A command to execute
Optional parameters
Input data
Auxiliary data
Output results
Special requirements
  - MPI
  - Interactive
  - Checkpointable
  - DAG...

24 Job description
In order for the WMS to assign your job the best resources, it needs to know what your job needs.
What you actually submit to the Grid is a job description
  - Using an easy Job Description Language (JDL)
  - Describing the needs of your job

25 A sample description
    Type = "job";
    JobType = "normal";
    VirtualOrganisation = "biomed";
    Executable = "/bin/sh";
    StdOutput = "output.txt";
    StdError = "error.txt";
    InputSandbox = { "job.sh", "executable", "data", "configuration" };
    OutputSandbox = { "output.txt", "error.txt", "results" };
    RetryCount = 7;
    Arguments = "job.sh -i data -o results";
    Environment = { "PATH=.:$PATH", "INSTALLDIR=." };
    Requirements = RegExp("cnb.uam.es", other.GlueCEUniqueId);
    Rank = 1000 * (other.GlueCEInfoTotalCPUs - other.GlueCEStateWaitingJobs) / other.GlueCEInfoTotalCPUs;
    # Don't worry, it will usually be a lot simpler!

26 JDL syntax
An attribute is a pair:
  - attribute = value;
In case of literal string values:
  - double quotes must be escaped with a backslash
    Arguments = " \"Hello\" 10";
  - the character "'" cannot be specified
  - special characters such as &, |, >, < are only allowed if specified inside a quoted string and preceded by a triple \
    Arguments = " -f file1\\\&file2 " ;
Comments must be preceded by a sharp character (#) or follow the C++ syntax
The JDL is sensitive to blanks at the end of a line
  - they should not follow the semicolon (;) at the end of a line
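The two escaping rules above are easy to get wrong by hand. Here is a minimal sketch of a helper that applies them with sed; `jdl_escape` is a hypothetical name, and the sketch only covers the double quote and the four special characters listed on the slide.

```shell
#!/bin/sh
# Hypothetical helper: prepare a string for use inside a JDL string literal.
#   "        becomes  \"      (one backslash)
#   & | < >  become   \\\&    (three backslashes), as stated on the slide
jdl_escape() {
    printf '%s\n' "$1" | sed -e 's/"/\\"/g' -e 's/[&|<>]/\\\\\\&/g'
}

# Possible use:
#   echo "Arguments = \"$(jdl_escape '-f file1&file2')\";"
```

The single quote is not handled because, per the slide, it cannot be specified at all.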

27 Job Description Language
The supported attributes are grouped in two categories:
  - Job Attributes: define the job itself
  - Resources: taken into account to choose the "best" resources for the job
    Computing Resources
      - Used to build expressions of Requirements and/or Rank attributes by the user
      - Have to be prefixed with "other."
    Data and Storage resources
      - Input data to process, SE where to store output data, protocols spoken by the application when accessing SEs

28 JDL: Relevant attributes
JobType
  - Normal (simple, sequential job), Interactive, MPICH, Checkpointable
Executable (mandatory)
  - The command name
Arguments (optional)
  - Command line arguments passed directly to the program (not through the shell)
StdInput, StdOutput, StdError (optional)
  - Standard input/output/error of the job
Environment (optional)
  - List of shell environment variables and their settings needed
InputSandbox (optional)
  - List of files on the UI needed to run the job: they will be copied to the remote resource without execution permission (except for the file named as "Executable")
OutputSandbox (optional)
  - List of files, among those generated by the job, which will be retrieved

29 Virtual Organisation
I lied to you!
  - The Grid is not like Hydra
  - Rather it is like a net full of oranges
  - And you are a worm!
A juicy net full of oranges
  - It's got a lot of holes (UI nodes) for you to get in and find your food
  - But you cannot eat it all, you must choose one
  - That is your Virtual Organization
  - Where you'll find all you need and enjoy a fruitful life!
  - You may munch (submit jobs) on many oranges (VOs), but only on one at a time

30 JDL: Relevant attributes
VirtualOrganisation (optional)
  - The name of the VO where the job should be run (whose fruity flesh/resources your job will consume)
May be specified on the command line when submitting the job
  - edg-job-submit --vo biomed
May be specified on the command line when starting the session
  - voms-proxy-init --voms biomed
May be accepted by default from the UI configuration
  - Risky business

31 JDL: Requirements
Requirements
  - Job special requirements from the resources
  - Specified using GLUE attributes of resources published in the Information Service
  - Only one Requirements attribute can be specified; if there is more than one, only the last one is taken into account
  - Its value is a boolean expression (you must combine several conditions into one)
  - If not specified, the default value defined in the UI configuration file is used
    Default: other.GlueCEStateStatus == "Production"
    Use reliable and working resources (sensible, isn't it?)

32 JDL: sample requirements
other.GlueCEInfoTotalCPUs > 1
  (the resource WNs must have at least two CPUs)
Member("CMSIM-133", other.GlueHostApplicationSoftwareRunTimeEnvironment)
  (a particular experiment software has to run on the resource and this information is published on the resource environment)
  The Member operator tests if its first argument is a member of its second argument
RegExp("cern.ch", other.GlueCEUniqueId)
  (the job has to run in the domain cern.ch)
Member("VO-biomed", other.GlueHostApplicationSoftwareRunTimeEnvironment) && (other.GlueCEPolicyMaxWallClockTime > 8600)
  (the resource must have some packages of the VO-biomed installed and must allow the job to run for more than 8600 minutes)

33 JDL: Ranks
  - Expresses preference (how to rank resources that have already met the Requirements expression) as a floating-point number
    The resource with the highest rank is the one selected
  - Specified using GLUE attributes
  - If not specified, the default value defined in the UI configuration file is used
    Default: - other.GlueCEStateEstimatedResponseTime (the lowest estimated traversal time)
    Default: other.GlueCEStateFreeCPUs (the highest number of free CPUs)
  - Example: (other.GlueCEStateWaitingJobs == 0 ? other.GlueCEStateFreeCPUs : - other.GlueCEStateWaitingJobs)
    Check the number of waiting jobs: if there are none, rank resources with more free CPUs higher; if there are waiting jobs, rank the resource lower as the number of waiting jobs increases
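To see what the example Rank expression does, it can be evaluated by hand for a few resources. The sketch below mimics it in plain shell; the function names and the CE names ce1, ce2 and ce3 with their figures are made up for illustration, and the real ranking is of course done by the WMS, not on the UI.

```shell
#!/bin/sh
# Hypothetical re-implementation of the example Rank expression:
#   (WaitingJobs == 0 ? FreeCPUs : - WaitingJobs)
# $1: number of waiting jobs, $2: number of free CPUs
rank() {
    if [ "$1" -eq 0 ]; then
        echo "$2"
    else
        echo $(( 0 - $1 ))
    fi
}

# Read lines of the form "name waiting free" on standard input and
# print the name of the resource with the highest rank.
best_ce() {
    best=""
    best_rank=""
    while read -r ce waiting free; do
        r=$(rank "$waiting" "$free")
        if [ -z "$best" ] || [ "$r" -gt "$best_rank" ]; then
            best=$ce
            best_rank=$r
        fi
    done
    echo "$best"
}

# Possible use (made-up figures):
#   printf 'ce1 0 4\nce2 2 10\nce3 0 7\n' | best_ce
```

In this made-up case the ranks are 4, -2 and 7, so ce3 is selected: ce2 has the most free CPUs but is penalised for its two waiting jobs.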

34 Essential JDL
At least you have to specify the following attributes:
  - Executable: the name of the executable
  - StdOutput and StdError: the files where to write the standard output and standard error of the job
  - Arguments: the arguments to the executable, if needed
  - InputSandbox and OutputSandbox: the files that must be transferred from the UI to the WN and vice versa
    Executable = "/bin/ls";
    Arguments = "-al";
    StdError = "stderr.log";
    StdOutput = "stdout.log";
    OutputSandbox = {"stderr.log", "stdout.log"};
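A description this small can be generated instead of typed. The following is a minimal sketch, assuming the essential attributes listed above; `make_jdl` is a hypothetical helper and the log file names are simply the ones used on the slide.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal JDL description on standard output.
# $1: executable, $2: arguments (may be empty)
make_jdl() {
    # No line ends with a blank: the JDL is sensitive to trailing blanks.
    cat <<EOF
Type = "job";
JobType = "normal";
Executable = "$1";
Arguments = "$2";
StdOutput = "stdout.log";
StdError = "stderr.log";
OutputSandbox = { "stdout.log", "stderr.log" };
EOF
}

# Possible use:
#   make_jdl /bin/ls -al > ls.jdl
```

Note that the executable and its arguments are kept in separate attributes, since Arguments are passed directly to the program.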

35 Simple JDL file
    Type = "job";
    JobType = "normal";
    VirtualOrganisation = "biomed";
    Executable = "hostname";
    StdOutput = "where";
    StdError = "horror";
    OutputSandbox = { "where" };
Note that you may specify an executable installed on the remote execution system (the Worker Node) and therefore do not need to include it in the InputSandbox.
    edg-job-submit -o job.id job.jdl

36 Ready for fun?

37 Job Submission
edg-job-submit
  - --vo: the Virtual Organisation (if you want to override defaults)
  - -o: the generated job identifier is written to the given file
    STRONGLY RECOMMENDED: it is useful for other commands, e.g.: edg-job-status -i (or edg_jobId)
  - -r: the job is submitted directly to the computing element given as argument
  - -c: use the given configuration file instead of the default UI configuration file
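Since saving the job identifier is strongly recommended, it may help to wrap the commands so that the identifier file is always created and reused. This is only a sketch: the function names and the convention of storing the identifier in a file named after the JDL file are assumptions, and the VO option is written in the `--vo` form used on the VirtualOrganisation slide.

```shell
#!/bin/sh
# Hypothetical wrappers around the EDG commands described in the slides.
# Convention assumed here: the identifier of the job described in
# "name.jdl" is kept in "name.id".

# Submit a job. $1: JDL file, $2: virtual organisation (default: biomed)
submit_job() {
    edg-job-submit --vo "${2:-biomed}" -o "${1%.jdl}.id" "$1"
}

# Show the status of the job submitted from the given JDL file.
job_status() {
    edg-job-status -i "${1%.jdl}.id"
}

# Possible use on a UI with an active proxy:
#   submit_job job.jdl
#   job_status job.jdl
```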

38 A JDL with arguments
    Type = "job";
    JobType = "normal";
    VirtualOrganisation = "biomed";
    Executable = "/bin/ls";
    StdOutput = "listing";
    StdError = "horror";
    OutputSandbox = {"listing"};
    Arguments = "-l";
Note that the executable will be run by a call to exec, i.e. the arguments will be passed directly to the program and will not be processed by a shell. In other words, no I/O redirection or pipelining can be specified as Arguments.

39 Running Clustal
    Type = "job";
    JobType = "normal";
    VirtualOrganisation = "biomed";
    Executable = "clustal.sh";
    StdOutput = "output";
    StdError = "error";
    InputSandbox = { "clustal.sh", "clustalw", "sequences", "input" };
    OutputSandbox = { "output", "error", "sequences.dnd", "sequences.aln" };

40 The clustal script
Please note that
  - The script itself will be copied with execution permissions
  - Clustalw will not carry over its execution permissions: we need to set them in the script
  - Clustalw is an interactive program: we must use I/O redirection to feed its input non-interactively
  - We cannot specify I/O redirection or pipes as Arguments: the script takes care of I/O redirection itself
    ls -l
    chmod 755 clustalw
    ./clustalw < input
    ls -l

41 Running Clustal revisited
You are smart guys! And surely must have noticed that ClustalW reads from its standard input...
  - So, why not say so?
    Type = "job";
    JobType = "normal";
    VirtualOrganisation = "biomed";
    Executable = "clustalw";
    StdInput = "input";
    StdOutput = "output";
    StdError = "error";
    InputSandbox = { "clustalw", "sequences", "input" };
    OutputSandbox = { "output", "error", "sequences.dnd", "sequences.aln" };

42 Cheating the system
So, you do not want to learn about JDL... You don't need to:
  - Use a generic JDL
  - which launches a generic script
  - Have the script do all the work
    Type = "job";
    JobType = "normal";
    VirtualOrganisation = "biomed";
    Executable = "job.sh";
    StdOutput = "out";
    StdError = "err";
    InputSandbox = { "job.sh", "job.tgz" };
    OutputSandbox = { "output.tgz", "out", "err" };

43 Sample script to run TINKER
    #!/bin/sh
    tar -zxvf job.tgz    # extract contents of job with appropriate perms
    # set up the environment to use shipped shared libraries
    export LD_LIBRARY_PATH=/lib:/usr/lib:./tinker/lib:$LD_LIBRARY_PATH
    export PATH=./tinker/bin:$PATH
    # do the work
    pdbxyz coordinates.pdb
    minimize coordinates < ./min.in
    anneal coordinates < ./ann.in
    analyze coordinates < ./ana.in
    xyzpdb coordinates
    # pack only interesting results
    tar -zcvf output.tgz coordinates.pdb* analyze.out anneal.out minimize.out
Note that we must include all the needed dynamic libraries in the .tar.gz archive.

44 Including Dynamic Libraries
    #!/bin/sh
    # Save an executable under ./bin and all the libraries it depends on
    # under ./lib
    # Use as: get_exec path_to_executable
    # (C) José R. Valverde, 2006
    mkdir ./bin
    cp "$1" bin
    mkdir ./lib
    ldd "$1" | cut -d' ' -f3 | \
    while read line ; do
        if [ -e "$line" ] ; then
            cp "$line" ./lib
        fi
    done
This script will
  - Install an executable (first argument) under ./bin
  - Run ldd to get a list of dynamic libraries needed
  - Parse ldd output and extract the path to them
  - Copy all needed dynamic libraries to ./lib
It can be greatly enhanced, of course!
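One possible enhancement is to make the parsing step independent of the exact spacing of the ldd output. The sketch below assumes the usual `name => /path (address)` line format of GNU ldd and keeps only entries that resolve to an absolute path; `ldd_paths` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on standard input and print the
# absolute path of every library that was resolved to a file.
# Lines without "=>" or without a path (virtual libraries) are skipped.
ldd_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Possible use inside the script above, in place of the cut command:
#   ldd "$1" | ldd_paths | while read -r lib ; do cp "$lib" ./lib ; done
```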

45 Job resubmission
If something goes wrong, the Grid tries to reschedule and resubmit the job
  - possibly on a different resource
Maximum number of resubmissions: min(RetryCount, MaxRetryCount)
  - RetryCount: JDL attribute
  - MaxRetryCount: attribute in the "RB" configuration file
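As a worked example of the formula: the sample description earlier asks for RetryCount = 7, so if the RB were configured with MaxRetryCount = 3 (a made-up value), the job would be resubmitted at most 3 times. A trivial sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: effective limit min(RetryCount, MaxRetryCount).
# $1: RetryCount from the JDL, $2: MaxRetryCount from the RB configuration
max_resubmissions() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Possible use (MaxRetryCount = 3 is a made-up value):
#   max_resubmissions 7 3
```

Asking for more retries in the JDL than the RB allows therefore has no effect.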

46 Other relevant UI commands
edg-job-list-match
  - Lists resources matching a job description
  - Performs the matchmaking without submitting the job
edg-job-cancel
  - Cancels a given job
edg-job-status
  - Displays the status of the job
edg-job-get-output
  - Returns the job output (the OutputSandbox files) to the user
edg-job-get-logging-info
  - Displays logging information about submitted jobs (all the events "pushed" by the various components of the WMS): useful for debugging purposes

47 Let us peep into the Grid
We need to know how our jobs proceed
  - To know when they finish
  - To know if something goes wrong
  - To satisfy our curiosity
edg-job-status -i job.id
  - Learn where your job is at any time
  - What it is doing
  - And if it is done

48 And get it over with
edg-job-get-output -i job.id --dir .
  - Note the dot at the end
  - Retrieves the job OutputSandbox into the specified directory
  - Or into /tmp/jobOutput if none is specified
  - Note that it does not risk overwriting files: a new subdirectory is created, named with your username and a unique string derived from the job ID

49-64 Job Submission
[Animated diagram, one step per slide. Components: UI; RB node (Network Server, Workload Manager, Job Controller - CondorG, RB storage); RLS; Information Service, which receives CE and SE characteristics and status; Computing Element; Storage Element. A Job Status column grows from "submitted" to "cleared".]
49 The components.
50 UI: allows users to access the functionalities of the WMS (via command line, GUI, C++ and Java APIs). Job status: submitted.
51 edg-job-submit myjob.jdl. The Job Description Language (JDL) is used to specify job characteristics and requirements. myjob.jdl:
    JobType = "Normal";
    Executable = "$(CMS)/exe/sum.exe";
    InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
    OutputSandbox = {"sim.err", "test.out", "sim.log"};
    Requirements = other.GlueHostOperatingSystemName == "linux" && other.GlueCEPolicyMaxWallClockTime > 10000;
    Rank = other.GlueCEStateFreeCPUs;
52 NS: network daemon responsible for accepting incoming requests. Shown in transit: the job and its Input Sandbox files. Job status: waiting.
53 WM: responsible for taking the appropriate actions to satisfy the request.
54 Match-Maker/Broker: where must this job be executed?
55 Matchmaker: responsible for finding the "best" CE to submit a job to.
56 Where (on which SEs) are the needed data? What is the status of the Grid?
57 CE choice.
58 JA (Job Adapter): responsible for the final "touches" to the job before performing submission (e.g. creation of wrapper script, etc.).
59 JC: responsible for the actual job management operations (done via CondorG). Job status: ready.
60 Shown in transit: the job and its Input Sandbox files. Job status: scheduled.
61 "Grid enabled" data transfers/accesses by the job. Job status: running.
62 Shown in transit: the Output Sandbox files. Job status: done.
63 edg-job-get-output.
64 Shown in transit: the Output Sandbox files. Job status: cleared.

65 Job monitoring
[Diagram: UI; RB node with Network Server, Workload Manager, Job Controller - CondorG and Log Monitor; Logging & Bookkeeping; Computing Element.]
LM: parses the CondorG log file (where CondorG logs info about jobs) and notifies the LB
LB: receives and stores job events (the log of job events); processes the corresponding job status
edg-job-get-logging-info shows the log of job events; edg-job-status shows the job status

66 Possible job states

67 Interactive Jobs
An interactive job is a job whose standard streams are forwarded to the submitting client
The user has to set the JDL JobType attribute to interactive
When an interactive job is submitted, the edg-job-submit command
  - starts a Grid console shadow process in the background that listens on a port assigned by the Operating System
    The port can be forced through the ListenerPort attribute in the JDL
  - opens a new window where the incoming job streams are forwarded
    The DISPLAY environment variable has to be set correctly, because an X window is opened
    The user can specify the --nogui option, which makes the command provide a simple standard non-graphical interaction with the running job
It is not necessary to specify the OutputSandbox attribute in the JDL

68 Logical Checkpointing Job
A checkpointing job is a job that can be decomposed into several steps
In every step the job state can be saved in the LB and retrieved later in case of failures
The job state is a set of pairs defined by the user
The job can start running from a previously saved state instead of from the beginning again
The user has to set the JDL JobType attribute to checkpointable

69 Other useful UI commands
edg-job-attach
  - Starts an interactive session for previously submitted interactive jobs
  - Starts a listener process on the UI machine
edg-job-get-chkpt
  - Allows the user to retrieve one or more checkpoint states saved by a previously submitted job

70 MPI Job
An MPI job is run in parallel on several processors
The user has to set the JDL JobType attribute to MPICH and specify in the NodeNumber attribute the required number of CPUs
When an MPI job is submitted, the UI adds
  - in the Requirements attribute
    Member("MpiCH", other.GlueHostApplicationSoftwareRunTimeEnvironment)
    (the MPICH runtime environment must be installed on the CE)
    other.GlueCEInfoTotalCPUs >= NodeNumber
    (the number of CPUs must be at least equal to the required number of nodes)
  - in the Rank attribute
    other.GlueCEStateFreeCPUs
    (the CE with the largest number of free CPUs is chosen)

71 MPI Job
    JobType = "MPICH";
    NodeNumber = 4;
    Executable = "MPItest.sh";
    Arguments = "cpi 4";
    InputSandbox = {"MPItest.sh", "cpi"};
    OutputSandbox = "executable.out";
    Requirements = other.GlueCEInfoLRMSType == "PBS" || other.GlueCEInfoLRMSType == "LSF";
The NodeNumber entry is the number of threads of the MPI job
The MPItest.sh script only works if PBS or LSF is the local job manager
If you want to submit your MPI programs you have to compile them

72 DAG Job
A DAG job is a Directed Acyclic Graph job
The sub-jobs are scheduled only when the corresponding DAG node is ready
The user has to set the JDL JobType attribute to dag, a nodes attribute that contains the description of the nodes, and a dependencies attribute
NOTE:
  - A plug-in has been implemented to map an EGEE DAG submission to a Condor DAG submission
  - Some improvements have been applied to the ClassAd API to better address WMS needs

73 DAG Job
    nodes = {
      cmkin1 = [ file = "bckg_01.jdl" ; ],
      cmkin2 = [ file = "bckg_02.jdl" ; ],
      ……
      cmkinN = [ file = "bckg_0N.jdl" ; ]
    };
    dependencies = {
      {cmkin1, cmkin2},
      {cmkin2, cmkin3},
      {cmkin2, cmkin5},
      {{cmkin4, cmkin5}, cmkinN}
    }
[Diagram: graph of the nodes cmkin1, cmkin2, cmkin3, cmkin4, cmkin5 and cmkinN]
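Each dependency pair means "the first node must finish before the second may start". One way to check that a set of dependencies is consistent, and to see one valid execution order, is the standard tsort utility. This is only an illustration run on the UI or any Unix host, not something the WMS does for you; the function names are hypothetical, and the nested pair {{cmkin4, cmkin5}, cmkinN} is expanded by hand into two pairs.

```shell
#!/bin/sh
# Hypothetical illustration: print one valid execution order for the
# dependencies listed on the slide, one node per line.
dag_order() {
    tsort <<EOF
cmkin1 cmkin2
cmkin2 cmkin3
cmkin2 cmkin5
cmkin4 cmkinN
cmkin5 cmkinN
EOF
}

# Print the position (1-based) of a node in that order.
position() {
    dag_order | grep -n -x "$1" | cut -d: -f1
}

# Possible use:
#   dag_order
#   position cmkinN
```

Several orders are valid, so the exact output may differ between systems, but cmkin1 always comes before cmkin2, and cmkinN after both cmkin4 and cmkin5. tsort also reports an error if the dependencies contain a cycle, which a DAG must not have.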

74 GANGA
While not yet a standard feature of middleware...
  - It is Open Source
  - Allows you to manage large numbers of jobs easily, from the command line or from a GUI
  - Takes care of all the nuisances of managing large numbers of jobs: submission, monitoring, resubmission, output recovery

75 DIANE
Based on GANGA
Implements a "pull" process model on the Grid
  - Processor jobs are started at working nodes
  - Data is fed to these jobs as they become ready
Based on Python
  - Uses a Python-derived syntax
Open Source
Not yet part of the official middleware

76 PHP::Grid
Allows easy development of web interfaces to grid-based services using PHP
  - Runs on a WWW server or cluster (not on the UI)
  - Connects to a remote UI to launch jobs
  - Takes care of job management
  - Deals with the complexity of submitting large numbers of jobs
  - Evolving to a standard DRMAA implementation
Open Source

77 GridWay and DRMAA
GridWay is an extension to Globus that has been incubating and is now about to become mainstream
  - It is scheduled to be included in EGEE as well (not yet)
  - Provides several services to the Grid
DRMAA is a standard for Grid programming
  - With bindings in a number of languages (C, C++, Java...)
  - Supported on popular infrastructures (SGE, Condor, Globus, EGEE...)
  - Used by several software application packages already
  - Takes care of job submission, monitoring and output retrieval
Open Source

78 The Single Most Important Slide
You can always get support from us
  - Check http://bioportal.cnb.uam.es/sbg/
  - Consider joining the discussions
  - And the team!
And from EGEE people
  - ROC provide regional operations support
  - Applications Support helps developers (that's us!)
  - Training provides knowledge and on-site courses
http://www.eu-egee.org/

