1
The eMinerals minigrid and the National Grid Service: a user's perspective
NGS169 (A. Marmier)
2
Objectives
1. User profile
2. Two real resources: the eMinerals Minigrid and the National Grid Service
3. Practical difficulties
4. Amateurish rambling (discussion/suggestions)
3
User Profile 1
Atomistic modelling community: chemistry/physics/materials science
Potentially big users of eScience (CPU intensive, NOT data intensive)
VASP, SIESTA, DL_POLY, CASTEP …
Want to run parallel codes
4
User Profile 2
Relative proficiency with Unix, mainframes, etc.
Scripting and parallel programming
Note of caution: the speaker might be biased
Want to run parallel codes
5
eMinerals: Virtual Organisation, NERC
The eMinerals project brings together simulation scientists, applications developers and computer scientists to develop UK eScience/grid capabilities for molecular simulations of environmental issues.
Grid prototype: the minigrid
6
eMinerals: Minigrid
3 clusters of 16 Pentiums
UCL Condor pool
Cambridge Earth Sciences Condor pool
SRB vaults
SRB manager at Daresbury
7
eMinerals: Minigrid philosophy
Globus 2
No login possible (except one debug/compile cluster)
No easy file transfer (have to use SRB, see later and the sketch below)
Feels very 'gridy', but not painless
Promotes Condor-G and home-grown wrappers
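For illustration only (not from the original slides): moving files to and from the SRB vaults is typically done with the SRB Scommands; the session below is a minimal sketch, and the choice of files mirrors the VASP example on the next slide.
Sinit          # start an SRB session using the credentials in ~/.srb
Sput INCAR     # copy a local input file into the current SRB collection
Sls            # list the contents of the collection
Sget OUTCAR    # retrieve an output file from the vault
Sexit          # close the session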
8
eMinerals: Minigrid example (my_condor_submit script)
Universe = globus
Globusscheduler = lake.bath.ac.uk/jobmanager-pbs
Executable = /home/arnaud/bin/vasp-lam-intel
Notification = NEVER
transfer_executable = true
Environment = LAMRSH=ssh -x
GlobusRSL = (job_type=mpi)(queue=workq)(count=4)(mpi_type=lam-intel)
Sdir = /home/amr.eminerals/run/TST.VASP3
Sget = INCAR,POTCAR,POSCAR,KPOINTS
Sget = OUTCAR,CONTCAR
SRBHome = /home/srbusr/SRB3_3_1/utilities/bin
log = vasp.log
error = vasp.err
output = vasp.out
Queue
9
NGS: What?
VERY NICE PEOPLE who offer access to LOVELY clusters
A real GRID approximation
10
NGS: Resources
"Data" clusters: 20 compute nodes with dual Intel Xeon 3.06 GHz CPUs, 4 GB RAM
  grid-data.rl.ac.uk (RAL)
  grid-data.man.ac.uk (Manchester)
"Compute" clusters: 64 compute nodes with dual Intel Xeon 3.06 GHz CPUs, 2 GB RAM
  grid-compute.leeds.ac.uk (WRG Leeds)
  grid-compute.oesc.ox.ac.uk (Oxford)
Plus other nodes: HPCx, Cardiff, Bristol …
11
NGS: Setup
grid-proxy-init
gsi-ssh …
Then, a "normal" machine:
Permanent fixed account (NGS169)
Unix queuing system
With gsi-ftp for file transfer
(A sketch of a typical session follows below.)
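A minimal sketch of such a session (not from the original slides): the commands are the standard Globus/GSI clients, the host name is taken from the Resources slide, and the account name and file paths are assumptions.
grid-proxy-init                                     # create a short-lived proxy from the user certificate
gsissh grid-compute.oesc.ox.ac.uk                   # GSI-authenticated login to an NGS head node
globus-url-copy file:///home/ngs169/INCAR \
  gsiftp://grid-compute.oesc.ox.ac.uk/home/ngs169/INCAR   # stage a file over GridFTP (paths hypothetical)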
12
NGS: example
globus-job-run grid-compute.oesc.ox.ac.uk/jobmanager-fork /bin/ls
globusrun -b grid-compute.oesc.ox.ac.uk/jobmanager-pbs example1.rsl

example1.rsl:
& (executable=DLPOLY.Y)
(jobType=mpi)
(count=4)
(environment=(NGSMODULES intel-math:gm:dl_poly))
13
Interlude
14
Difficulty 1: Access
Well-known problems:
- Certificate
- Globus-enabled machine
- SRB account (2.0)
15
Difficulty 2: Usability
How do I submit a job?
Directly (gsi-ssh …) or remotely (Globus, Condor-G)
Direct: login, check queue, submit, (kill), logout
Different batch queuing systems (PBS, Condor, LoadLeveler …); a comparison sketch follows below
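For illustration (not from the original slides), the same direct submit/check/kill cycle looks different under each batch system; the script names are hypothetical, the commands themselves are the standard clients.
# PBS
qsub job.pbs          # submit
qstat -u $USER        # check the queue
qdel <jobid>          # kill
# Condor
condor_submit job.sub
condor_q
condor_rm <jobid>
# LoadLeveler
llsubmit job.cmd
llq
llcancel <jobid>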
16
Usability 2
Usually requires a "script"
Almost nobody writes their own scripts from scratch; they work by inheritance and adaptation
At the moment eScience forces the user to learn the syntax of the batch queuing system (a minimal PBS example is sketched below)
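A minimal sketch of such a script, assuming PBS; the queue name, node layout and binary name are assumptions (the binary echoes the VASP example earlier), and the MPI launcher details vary per site.
#!/bin/bash
#PBS -q workq                # queue name is an assumption
#PBS -l nodes=2:ppn=2        # 4 CPUs in total
#PBS -l walltime=12:00:00
cd $PBS_O_WORKDIR            # run from the directory the job was submitted from
mpirun -np 4 ./vasp          # launch the parallel code
Submitted directly with: qsub job.pbs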
17
Usability 3
Remote submission, e.g. example1.rsl:
& (executable=DLPOLY.Y)
(jobType=mpi)
(count=4)
(environment=(NGSMODULES intel-math:gm:dl_poly))
Ignores file transfer
Ignores more complex submit structures
18
Usability 4
Ignores more complex submit structures:
abinit < inp.txt
cpmd.x MgO.inp
=> User has to learn Globus syntax :o/ (environment and RSL; see the sketch below)
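For illustration only (not from the original slides): GRAM RSL does have attributes for these cases, namely stdin redirection and command-line arguments. The executables and input files below are the ones from the slide; the job type and CPU count are assumptions.
& (executable=abinit)
(stdin=inp.txt)
(jobType=mpi)
(count=4)

& (executable=cpmd.x)
(arguments=MgO.inp)
(jobType=mpi)
(count=4)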
19
Finally
At the moment there is no real incentive to submit remotely
A mechanism to reward the early adopters?
Access to special queues
Longer walltime?
More CPUs?
20
CONCLUSION
Submission scripts are very important and useful pieces of information
Easily accessible examples would save a lot of time
A mechanism to encourage remote submission (access to better queues)