
1 Task Farming on HPCx
David Henty, HPCx Applications Support
d.henty@epcc.ed.ac.uk

2 What is Task Farming?
Many independent programs (tasks) running at once
– each task can be serial or parallel
– "independent" means they don't communicate directly
Common approach for using spare cycles in a loosely connected cluster
– how does it relate to HPCx and Capability Computing?
Often needed for pre- or post-processing
Tasks may contribute to a single, larger calculation
– parameter searches or optimisation
– enhanced statistical sampling
– ensemble modelling

3 Classical Task Farm
A single parallel code (e.g. written in MPI)
– one process is designated as the controller
– the rest are workers
[Diagram: the controller takes the input, farms tasks out to Workers 1 to 4, and collects their output]
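
A minimal sketch of such a controller/worker code in MPI. The task count and the demonstration "work" (a write statement) are placeholders, not part of any real harness:

   program classic_taskfarm
     use mpi
     implicit none
     integer, parameter :: ntasks = 100   ! illustrative number of tasks
     integer :: rank, nproc, ierr
     integer :: task, nexttask, stoptask, nstopped, dummy
     integer :: status(MPI_STATUS_SIZE)

     call MPI_Init(ierr)
     call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
     call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

     if (rank == 0) then
        ! controller: hand out task numbers on demand, then send
        ! -1 to each worker once the work has run out
        nexttask = 1
        nstopped = 0
        stoptask = -1
        do while (nstopped < nproc-1)
           call MPI_Recv(dummy, 1, MPI_INTEGER, MPI_ANY_SOURCE, 0, &
                         MPI_COMM_WORLD, status, ierr)
           if (nexttask <= ntasks) then
              call MPI_Send(nexttask, 1, MPI_INTEGER, &
                            status(MPI_SOURCE), 0, MPI_COMM_WORLD, ierr)
              nexttask = nexttask + 1
           else
              call MPI_Send(stoptask, 1, MPI_INTEGER, &
                            status(MPI_SOURCE), 0, MPI_COMM_WORLD, ierr)
              nstopped = nstopped + 1
           end if
        end do
     else
        ! worker: repeatedly ask for work until told to stop
        do
           call MPI_Send(rank, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, ierr)
           call MPI_Recv(task, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, &
                         status, ierr)
           if (task < 0) exit
           write(*,*) 'worker', rank, 'processing task', task   ! real work here
        end do
     end if

     call MPI_Finalize(ierr)
   end program classic_taskfarm

Because workers only receive a task after asking for one, faster workers automatically pick up more tasks, which is where the load balancing comes from.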

4 Characteristics
Pros
– load balanced for sufficiently many tasks
– can use all of HPCx (using MPI)
Cons
– must write a new parallel code
– potential waste of a CPU if the controller is not busy
– each task must be serial, i.e. use a single CPU
Approach
– find an existing task-farm harness on the WWW

5 Shared Counter
Tasks are numbered 1, 2, ..., maxTask
– shared counter requires no CPU time
[Diagram: Workers 1 to 5 each fetch the next task number from a shared counter and write their own output]

6 Characteristics
Pros
– load-balanced
– don't have to designate a special controller
Cons
– very much a shared-memory model: easy to scale up to a frame (32 CPUs) with OpenMP, but harder to scale to all of HPCx
– need to write a new parallel program
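
Within a single frame the shared counter maps naturally onto OpenMP. A minimal sketch, with the counter protected by a critical section; the task count and the demonstration "work" are placeholders:

   program counter_taskfarm
     use omp_lib
     implicit none
     integer, parameter :: maxtask = 150   ! illustrative number of tasks
     integer :: counter, task

     counter = 0        ! shared between all threads
   !$omp parallel private(task)
     do
        ! atomically claim the next task number
   !$omp critical (taskcounter)
        counter = counter + 1
        task    = counter
   !$omp end critical (taskcounter)
        if (task > maxtask) exit
        write(*,*) 'thread', omp_get_thread_num(), 'doing task', task   ! real work here
     end do
   !$omp end parallel
   end program counter_taskfarm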

7 Task Farming Existing Code
Imagine you have a pre-compiled executable
– and you simply want to run P copies on P processors
– common in parameter searching or ensemble studies
– can be done via poe but is non-portable
Possible to launch a simple MPI harness
– each process does nothing but run the executable
– easy to do via "system(commandstring)"
Have written a general-purpose harness
– called taskfarm
– see /usr/local/packages/bin/
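
A minimal sketch of such a harness, with the command string hard-wired for brevity (the real taskfarm takes it from the command line); note that system() is a compiler extension rather than standard Fortran:

   program taskfarm_harness
     use mpi
     implicit none
     integer :: rank, ierr
     character(len=256) :: command

     call MPI_Init(ierr)
     call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

     ! hard-wired here for brevity; the real harness takes this
     ! from its command-line arguments
     command = 'echo hello from a worker'

     ! "system" is a widespread compiler extension (IBM XL Fortran
     ! provides it); Fortran 2008 later standardised
     ! execute_command_line instead
     call system(trim(command))

     call MPI_Finalize(ierr)
   end program taskfarm_harness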

8 Controlling the Task Farm
Need to allow the tasks to do different things
– each task is assigned a unique MPI rank: 0, 1, 2, ..., P-2, P-1
– I have hijacked the C "%d" printf syntax:

   taskfarm "echo hello from task %d"

– the command string is run as-is on each processor
– except with %d replaced by the MPI rank
On 3 CPUs:

   hello from task 0
   hello from task 1
   hello from task 2
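
A sketch of how the harness might perform that substitution; substitute_rank is a hypothetical helper for illustration, not part of taskfarm's documented interface:

   ! replace every "%d" in the command string with this
   ! process's MPI rank (hypothetical helper)
   subroutine substitute_rank(command, rank)
     implicit none
     character(len=*), intent(inout) :: command
     integer, intent(in) :: rank
     character(len=16) :: rankstr
     integer :: pos

     write(rankstr, '(i0)') rank
     do
        pos = index(command, '%d')
        if (pos == 0) exit
        ! note: the result is truncated if it outgrows len(command)
        command = command(1:pos-1) // trim(rankstr) // command(pos+2:)
     end do
   end subroutine substitute_rank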

9 Verbose Mode

   taskfarm -v "echo hello from task %d"

   taskfarm: called with 5 arguments: echo hello from task %d
   taskfarm: process 0 executing "echo hello from task 0"
   taskfarm: process 1 executing "echo hello from task 1"
   taskfarm: process 2 executing "echo hello from task 2"
   hello from task 0
   hello from task 1
   hello from task 2
   taskfarm: return code on process 0 is 0
   taskfarm: return code on process 1 is 0
   taskfarm: return code on process 2 is 0

Could also report where each task is running
– i.e. the name of the HPCx frame
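
Reporting the frame name could be done with MPI_Get_processor_name; a small hypothetical helper:

   ! hypothetical helper: report which host (HPCx frame) a task runs on
   subroutine report_node(rank)
     use mpi
     implicit none
     integer, intent(in) :: rank
     character(len=MPI_MAX_PROCESSOR_NAME) :: node
     integer :: namelen, ierr

     call MPI_Get_processor_name(node, namelen, ierr)
     write(*,*) 'taskfarm: process', rank, 'running on ', node(1:namelen)
   end subroutine report_node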

10 Use in Practice
Need tasks to use different input and output files:

   taskfarm "cd rundir%d; serialjob > output.log"

– or

   taskfarm "serialjob > output.%d.log"

Pros
– no new coding, and taskfarm is also relatively portable
Cons
– no load balancing: exactly one task per CPU per run
Extensions
– do more tasks than CPUs, aiming for load balance?
– a dedicated controller makes this potentially messy

11 Implement Shared Counter in MPI-2
Could be accessed as a library function:

   task = gettask()
   do while (task .ge. 0)
      call serialjob(task)
      task = gettask()
   end do

– or via an extended harness:

   taskfarm -n 150 "serialjob > output.%d.log"

Would run serial jobs on all available processors until all 150 had been completed
– potential for load balancing with more tasks than processors
– work in progress!
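
MPI-2 itself has no atomic fetch-and-add, which is part of why this was work in progress; the fetch-and-op routine that later appeared in MPI-3 shows the idea. A hypothetical sketch of gettask, assuming the counter lives in an MPI window (win) created elsewhere with MPI_Win_create on process 0 and initialised to 1:

   ! sketch only: MPI_Fetch_and_op is an MPI-3 routine, not MPI-2;
   ! win is assumed to hold a single integer counter on process 0,
   ! initialised to 1
   integer function gettask(win, maxtask)
     use mpi
     implicit none
     integer, intent(in) :: win, maxtask
     integer :: one, task, ierr
     integer(kind=MPI_ADDRESS_KIND) :: disp

     one  = 1
     disp = 0

     ! atomically read the counter on process 0 and add one to it
     call MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win, ierr)
     call MPI_Fetch_and_op(one, task, MPI_INTEGER, 0, disp, &
                           MPI_SUM, win, ierr)
     call MPI_Win_unlock(0, win, ierr)

     if (task <= maxtask) then
        gettask = task     ! tasks are numbered 1, 2, ..., maxtask
     else
        gettask = -1       ! no work left
     end if
   end function gettask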

12 Multiple Parallel MPI Jobs
What is the issue on HPCx?
– poe picks up the number of MPI processes directly from the LoadLeveler script
– can only have a single global MPI job running at once
Cannot do:

   mpirun mpijob1 -nproc 32 &
   mpirun mpijob2 -nproc 32 &
   wait

– unlike on many other systems such as Sun, T3E, Altix, ...

13 Using taskfarm
taskfarm is a harness implemented in MPI
– cannot use it to run MPI jobs
– but can run jobs parallelised with some other method, e.g. threads
To run 4 copies of a 32-way OpenMP job:

   export OMP_NUM_THREADS=32
   taskfarm "openmpjob > output.%d.log"

Controlling the OpenMP parallelism
– how to ensure that each OpenMP job runs on a separate frame?
– need to request 4 MPI tasks but place only one on each node:

   #@ cpus=4
   #@ tasks_per_node=1

14 Real Example: MOLPRO
An ab initio quantum chemistry package
– parallelised using the Global Array (GA) Tools library
– on HPCx, the normal version of GA Tools uses LAPI
– LAPI requires poe: same problems for taskfarm as with MPI
But...
– there is an alternative implementation of GA Tools
– it uses the TCGMSG message-passing library...
– which is implemented using Unix sockets, not MPI
Not efficient over the switch
– but probably fine within a node, i.e. up to 32 processors

15 Running MOLPRO as a Parallel Task Farm
TCGMSG parallelism is specified on the command line
– to run 6 MOLPRO jobs, each using 16 CPUs, i.e. 2 jobs per frame on a total of 3 frames:

   #@ cpus=6
   #@ tasks_per_node=2

   taskfarm "molpro -n 16 > output.%d.out"

Completely analogous to task-farming OpenMP jobs
MOLPRO can now be used to solve many different problems simultaneously
– which may not individually scale very well

16 Multiple Parallel MPI Jobs
So far we have seen ways of running the following (where "simple" means no load balancing):
– a general serial task farm, requiring new parallel code
– a simple serial task farm of existing program(s)
– potential for a general serial task farm of existing program(s)
– simple parallel (non-MPI) task farms with existing program(s)
What about task farming parallel MPI jobs?
– e.g. four 64-way MPI jobs in a 256-CPU partition
– requires some changes to the source code
– but potentially not very many

17 Communicator Splitting
(Almost) every MPI routine takes a communicator
– usually MPI_COMM_WORLD, but it can be a subset of processes

Original code:

   call MPI_Init(ierr)
   comm = MPI_COMM_WORLD
   call MPI_Comm_size(comm, ...)
   call MPI_Comm_rank(comm, ...)
   if (rank .eq. 0) &
      write(*,*) 'Hello world'
   ! now do the work...
   call MPI_Finalize(ierr)

Task-farmed code (four sub-farms):

   call MPI_Init(ierr)
   bigcomm = MPI_COMM_WORLD
   comm = split(bigcomm, 4)
   call MPI_Comm_size(comm, ...)
   call MPI_Comm_rank(comm, ...)
   if (rank .eq. 0) &
      write(*,*) 'Hello world'
   ! now do the work...
   call MPI_Finalize(ierr)

18 Issues
Each group of 64 processors lives in its own world
– each has ranks 0 to 63 and its own master, rank 0
– must never directly reference MPI_COMM_WORLD
Need to allow for different input and output files
– use different directories for minimum code changes
– can arrange for each parallel task to run in a different directory using clever scripts
How to split the communicator appropriately?
– can be done by hand with MPI_Comm_split (see the sketch below)
– the MPH library gives users some help
If you're interested, submit a query!
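
The split() on the previous slide is not a standard MPI routine; a minimal sketch of how it could be written by hand with MPI_Comm_split, assuming njobs divides the total process count:

   ! sketch: divide the processes in bigcomm into njobs equal
   ! groups and return this process's sub-communicator
   ! (assumes njobs divides the size of bigcomm)
   integer function split(bigcomm, njobs)
     use mpi
     implicit none
     integer, intent(in) :: bigcomm, njobs
     integer :: rank, nproc, colour, comm, ierr

     call MPI_Comm_size(bigcomm, nproc, ierr)
     call MPI_Comm_rank(bigcomm, rank, ierr)

     ! processes with the same colour join the same sub-communicator
     colour = rank / (nproc/njobs)

     ! key = rank preserves the original rank ordering in each group
     call MPI_Comm_split(bigcomm, colour, rank, comm, ierr)
     split = comm
   end function split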

19 Summary
Like any parallel computer, HPCx can run parallel task-farm programs written by hand
However, the usual requests are:
– multiple runs of an existing serial program
– multiple runs of an existing parallel program
These can both be done with the taskfarm harness
– limitations on the tasks' parallelism (must be non-MPI)
– currently no load balancing
Task farming MPI code requires source changes
– but can be quite straightforward in many cases
– e.g. ensemble modelling with the Unified Model

