Slide 1: Stork: An Introduction
Condor Project, Computer Sciences Department, University of Wisconsin-Madison
http://www.cs.wisc.edu/condor
Condor Week 2006, Milan

Slide 2: Two Main Ideas
 o Make data transfers a "first class citizen" in Condor
 o Reuse items in the Condor toolbox

Slide 3: The Tools
 o ClassAds
 o Matchmaking
 o DAGMan

Slide 4: The Data Transfer Problem
Process large data sets at sites on the grid. For each data set:
 o stage in data from a remote server
 o run the CPU data processing job
 o stage out data to a remote server

Slide 5: Simple Data Transfer Job

    #!/bin/sh
    globus-url-copy source dest

Often works fine for short, simple data transfers, but…

Slide 6: What Can Go Wrong?
 o Too many transfers at one time
 o Service down; need to try later
 o Service down; need to try an alternate data source
 o Partial transfers
 o Time out; not worth waiting anymore
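A shell wrapper can cover some of these cases by hand, but it grows unwieldy and still provides no queue, throttling, or shared logging. A minimal sketch of such a wrapper, where the retry count, back-off interval, and alternate source URL are illustrative assumptions:

    #!/bin/sh
    # Hand-rolled retry logic around globus-url-copy (illustrative only).
    PRIMARY=gsiftp://server/path        # assumed primary source
    ALTERNATE=gsiftp://mirror/path      # assumed alternate source
    DEST=file:///dir/file
    for attempt in 1 2 3 4 5; do        # assumed retry limit
        if globus-url-copy "$PRIMARY" "$DEST"; then
            exit 0
        fi
        sleep 60                        # assumed back-off interval
    done
    # After repeated failures, fall back to the alternate source.
    exec globus-url-copy "$ALTERNATE" "$DEST"

Stork absorbs exactly this kind of policy into its job queue, as the following slides show.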

Slide 7: Stork
What Schedd is to CPU jobs, Stork is to data placement jobs:
 o job queue
 o flow control
 o failure-handling policies
 o event log

Slide 8: Supported Data Transfers
 o local file system
 o GridFTP
 o FTP
 o HTTP
 o SRB
 o NeST
 o SRM
 o other protocols via simple plugin

Slide 9: Stork Commands
 o stork_submit - submit a job
 o stork_q - list the job queue
 o stork_status - show completion status
 o stork_rm - cancel a job
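A typical command-line session might look like the sketch below: submit the job, check its status, list the queue, and remove it if it is no longer needed. The submit file name and job id are taken from the examples on the following slides; the exact argument conventions and output are assumptions and are elided here:

    # stork_submit stage-in.stork
    Request assigned id: 1
    # stork_status 1
    # stork_q
    # stork_rm 1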

Slide 10: Creating a Submit Description File
A plain ASCII text file that tells Stork about your job:
 o source/destination
 o alternate protocols
 o proxy location
 o debugging logs
 o command-line arguments

Slide 11: Simple Submit File

    // C++-style comment lines
    [
        dap_type = "transfer";
        src_url = "gsiftp://server/path";
        dest_url = "file:///dir/file";
        x509proxy = "default";
        log = "stage-in.out.log";
        output = "stage-in.out.out";
        err = "stage-in.out.err";
    ]

Note: this is a different format from Condor submit files.

Slide 12: Sample stork_submit

    # stork_submit stage-in.stork
    using default proxy: /tmp/x509up_u19100
    ================
    Sending request: [
        dest_url = "file:///dir/file";
        src_url = "gsiftp://server/path";
        err = "path/stage-in.out.err";
        output = "path/stage-in.out.out";
        dap_type = "transfer";
        log = "path/stage-in.out.log";
        x509proxy = "default"
    ]
    ================
    Request assigned id: 1    # returned job id

Slide 13: Sample Stork User Log

    000 (001.-01.-01) 04/17 19:30:00 Job submitted from host: ...
    001 (001.-01.-01) 04/17 19:30:01 Job executing on host: ...
    008 (001.-01.-01) 04/17 19:30:01 job type: transfer ...
    008 (001.-01.-01) 04/17 19:30:01 src_url: gsiftp://server/path ...
    008 (001.-01.-01) 04/17 19:30:01 dest_url: file:///dir/file ...
    005 (001.-01.-01) 04/17 19:30:02 Job terminated.
        (1) Normal termination (return value 0)
            Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
            Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
            Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
            Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
        0 - Run Bytes Sent By Job
        0 - Run Bytes Received By Job
        0 - Total Bytes Sent By Job
        0 - Total Bytes Received By Job
    ...

Slide 14: Who Needs Stork?
SRM exists. It provides a job queue, logging, etc. Why not use that?

Slide 15: Use Whatever Makes Sense!
Another way to view Stork: glue between DAGMan and a data transport or transport scheduler, so that one DAG can describe a workflow that includes both data movement and computation steps.

Slide 16: Stork Jobs in a DAG
A DAG is defined by a text file, listing each job and its dependents:

    # data-process.dag
    Data   IN      in.stork
    Job    CRUNCH  crunch.condor
    Data   OUT     out.stork
    Parent IN      Child CRUNCH
    Parent CRUNCH  Child OUT

Each node will run the Condor or Stork job specified by its accompanying submit file. (Diagram: IN -> CRUNCH -> OUT)
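For illustration, the stage-out node's submit file might mirror the stage-in example from slide 11 with source and destination reversed, and crunch.condor would be an ordinary Condor submit description file. The URLs, executable name, and file names below are assumptions, not part of the original example:

    // out.stork - illustrative stage-out data placement job
    [
        dap_type = "transfer";
        src_url = "file:///dir/result";        // assumed local result file
        dest_url = "gsiftp://server/result";   // assumed remote destination
        x509proxy = "default";
        log = "stage-out.log";
    ]

    # crunch.condor - illustrative Condor submit file for the CRUNCH node
    universe   = vanilla
    executable = crunch                        # assumed executable name
    log        = crunch.log
    output     = crunch.out
    error      = crunch.err
    queue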

Slide 17: Important Stork Parameters
 o STORK_MAX_NUM_JOBS - limits the number of active jobs
 o STORK_MAX_RETRY - limits job attempts before a job is marked as failed
 o STORK_MAXDELAY_INMINUTES - specifies the "hung job" threshold
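In a Condor-style configuration file these are plain name = value entries; the values below are illustrative assumptions, not recommendations:

    # Illustrative values only; tune to site policy.
    # Allow at most 10 data placement jobs to run at once.
    STORK_MAX_NUM_JOBS = 10
    # Mark a job as failed after 5 unsuccessful attempts.
    STORK_MAX_RETRY = 5
    # Treat a transfer as hung after 60 minutes.
    STORK_MAXDELAY_INMINUTES = 60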

Slide 18: Features in Development
Matchmaking:
 o job ClassAd matched with site ClassAd
 o global max transfers and per-site limits
 o load balancing across sites
 o dynamic reconfiguration of sites
 o coordination of multiple instances of Stork
A working prototype has been developed with the Globus GridFTP team.
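To make the matchmaking idea concrete, a data placement job ad and a site ad might express their constraints roughly as follows. Since these features were still in development, the attribute names and limits here are purely illustrative assumptions:

    // Illustrative job ClassAd for a data placement job
    [
        dap_type = "transfer";
        dest_url = "gsiftp://siteA.example.org/path";
        Requirements = other.SiteName == "siteA";        // assumed attribute
    ]

    // Illustrative site ClassAd advertising transfer capacity
    [
        SiteName = "siteA";                              // assumed attribute
        MaxActiveTransfers = 20;                         // assumed limit
        Requirements = other.dap_type == "transfer";
    ]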

Slide 19: Further Ahead
 o automatic startup of a personal Stork server on demand
 o fair sharing between users
 o fit into the new pluggable scheduling framework, a la schedd-on-the-side

Slide 20: Summary
 o Stork manages a job queue for data transfers.
 o A DAG may describe a workflow containing both data movement and processing steps.

Slide 21: Additional Resources
 o http://www.cs.wisc.edu/condor/stork/
 o Condor Manual, Stork sections
 o stork-announce@cs.wisc.edu list
 o stork-discuss@cs.wisc.edu list

