Scheduling › Need a “Dedicated Scheduler” › “Dedicated” has a specific Condor meaning › Nodes running MPI require a dedicated scheduler › A given machine can have many opportunistic schedulers... › ...but only one dedicated scheduler
DedicatedScheduler surprises › DedicatedScheduler co-opts the normal negotiation cycle › Preemption and scheduling work differently than in opportunistic scheduling › DedicatedScheduler schedules first-fit, sorted by UserJobPrio › condor_q -analyze can be a mystery!
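The first-fit pass named on this slide can be sketched roughly as follows. This is a minimal illustration, not Condor's actual code: the `Job` class, the `first_fit` function, and the machine names are hypothetical, and it assumes that a larger priority value runs first and that a job which does not fit is simply skipped in this pass.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    prio: int   # hypothetical stand-in for the job's priority
    nodes: int  # number of machines the job needs


def first_fit(jobs, free_machines):
    """Return {job name: [machines]} for the jobs that can start now."""
    free = list(free_machines)
    placed = {}
    # Assumption: higher prio goes first. sorted() is stable, so jobs with
    # equal priority keep their submission order.
    for job in sorted(jobs, key=lambda j: -j.prio):
        if job.nodes <= len(free):
            # First fit: take the first machines still free.
            placed[job.name] = free[:job.nodes]
            free = free[job.nodes:]
        # Otherwise the job is skipped and the next one is tried.
    return placed


jobs = [Job("a", 0, 3), Job("b", 5, 2), Job("c", 0, 2)]
schedule = first_fit(jobs, ["n1", "n2", "n3", "n4"])
print(schedule)
```

In this sketch job "b" is placed first because of its priority, job "a" no longer fits in the two machines left, and the smaller job "c" slips in behind it. That kind of ordering is one reason a dedicated schedule can look surprising from the outside.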
Job startup › Same file transfer, etc. as the vanilla universe › One shadow, many starters › Starter runs sshd on all machines, does key exchange › Starter runs the executable on the first machine (head node, rank 0)
Your Script Here › Script on the head node has a contact file › We provide samples for LAM, MPICH › We try to mimic “by hand” startup › Use condor_ssh to start remote jobs › When the script exits, Condor cleans up
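As an illustration of what a head-node script might do with the contact file, the sketch below turns its contents into an ordered machine list of the kind an MPI launcher expects. The layout is an assumption (one line per node, starting with the node number and then the host name), and `machines_from_contact` and the sample data are hypothetical. Check the sample LAM and MPICH scripts for the real format.

```python
def machines_from_contact(contact_text):
    """Return host names ordered by node number.

    Assumed layout (not confirmed by the slides): each line begins with
    "<node number> <host name>", and any further fields are ignored.
    """
    rows = []
    for line in contact_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        rows.append((int(fields[0]), fields[1]))
    rows.sort()  # node 0 (the head node) comes first
    return [host for _, host in rows]


# Hypothetical contact file contents; the lines may arrive in any order.
sample = (
    "1 node-b 4022 alice /scratch/dir_1\n"
    "0 node-a 4011 alice /scratch/dir_0\n"
)
machines = machines_from_contact(sample)
print("\n".join(machines))
```

A script would typically write such a list to a machine file and then start the MPI job the same way one would by hand, with condor_ssh standing in for ssh.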