Presentation is loading. Please wait.

Presentation is loading. Please wait.

Intermediate HTCondor: More Workflows Monday pm Greg Thain Center For High Throughput Computing University of Wisconsin-Madison.

Similar presentations


Presentation on theme: "Intermediate HTCondor: More Workflows Monday pm Greg Thain Center For High Throughput Computing University of Wisconsin-Madison."— Presentation transcript:

1 Intermediate HTCondor: More Workflows Monday pm Greg Thain Center For High Throughput Computing University of Wisconsin-Madison

2 OSG Summer School 2015 Before we begin… Any questions on the lectures or exercises up to this point? 2

3 OSG Summer School 2015 Advanced DAGMan Tricks Throttles DAGMan Variables Retries DAGs without dependencies Sub-DAGs Pre and Post scripts: editing your DAG SPLICes: DAGs as subroutines 3

4 OSG Summer School 2015 Throttles Throttles to control job submissions  Max jobs idle  condor_submit_dag -maxidle XX work.dag  Max scripts running  condor_submit_dag -maxpre XX -maxpost XX Useful for “big bag of tasks”  Schedd holds everything in memory 4

5 OSG Summer School 2015 DAGMan variables 5 # Diamond dag Job A a.sub Job B b.sub Job C c.sub Job D d.sub Parent A Child B C Parent B C Child D

6 OSG Summer School 2015 DAGMan variables (Cont) 6 # Diamond dag Job A a.sub Job B a.sub Job C a.sub Job D a.sub VARS A OUTPUT=“A.out” VARS B OUTPUT=“B.out” VARS C OUTPUT=“C.out” VARS D OUTPUt=“D.out” Parent A Child B C Parent B C Child D

7 OSG Summer School 2015 DAGMan variables (cont) 7 # a.sub Universe = vanilla Executable = gronk output = $(OUTPUT) queue

8 OSG Summer School 2015 Retries Failed nodes can be automatically retried a configurable number of times  Helps when jobs randomly crash 8 Job A a.sub Job B b.sub Job C c.sub Job D d.sub RETRY D 100 Parent A Child B C Parent B C Child D

9 OSG Summer School 2015 DAGs without dependencies Submit DAG with:  200,000 nodes  No dependencies Use DAGMan to throttle the job submissions:  HTCondor is scalable, but it will have problems if you submit 200,000 jobs simultaneously  DAGMan can help you with scalability even if you don’t have dependencies A1A1 A2A2 A3A3 … 9

10 OSG Summer School 2015 Shishkabob DAG Used for breaking long jobs into short Easier for scheduling 10 A1A1 A2A2 A3A3

11 OSG Summer School 2015 Sub-DAG Idea: any given DAG node can be another DAG  SUBDAG External Name DAG-file DAG node will not complete until sub-dag finishes Interesting idea: A previous node could generate this DAG node Why?  Simpler DAG structure  Implement a fixed-length loop  Modify behavior on the fly 11

12 OSG Summer School 2015 Sub-DAG A BC D VW Z X Y 12

13 OSG Summer School 2015 Subdag Syntax 13 # Diamond dag Job A a.sub Job B b.sub SUBDAG EXTERNAL C c.dag Job D d.sub Parent A Child B C Parent B C Child D

14 OSG Summer School 2015 DAGMan scripts DAGMan allows pre & post scripts  Run before (pre) or after (post) job  Run on the same computer you submitted from  Don’t have to be scripts: any executable Syntax: JOB A a.sub SCRIPT PRE A before-script $JOB SCRIPT POST A after-script $JOB $RETURN 14

15 OSG Summer School 2015 So What? Pre script can make decisions  Where should my job run? (Particularly useful to make job run in same place as last job.)  What should my job do?  Generate Sub-DAG Post script can change return value  DAGMan decides job failed in non-zero return value  Post-script can look at {error code, output files, etc} and return zero or non-zero based on deeper knowledge. 15

16 OSG Summer School 2015 SPLICEs: DAGs as subroutines 16 A BC D

17 OSG Summer School 2015 SPLICEs: DAGs as subroutines 17 A BC D E

18 OSG Summer School 2015 18 A BC D A BC D A BC D E

19 OSG Summer School 2015 SPLICE Syntax SPLICE name dagfile.dag Creates new node with dag as node CHILD / PARENT / etc all work on ndoes 19

20 OSG Summer School 2015 Example JOB E E.submit SPLICE DIAMOND diamond.dag PARENT DIAMOND CHILD E 20

21 OSG Summer School 2015 Let’s try it out! Exercises with DAGMan. 21

22 OSG Summer School 2015 Questions? Questions? Comments? Feel free to ask me questions later: 22


Download ppt "Intermediate HTCondor: More Workflows Monday pm Greg Thain Center For High Throughput Computing University of Wisconsin-Madison."

Similar presentations


Ads by Google