Plans for the National NERC HPC services UM vn 6.1 installations and performance UM vn 6.6 and NEMO(?) plans.


1 Plans for the National NERC HPC services; UM vn 6.1 installations and performance; UM vn 6.6 and NEMO(?) plans

2 National HPC Facilities [timeline figure, 2007 to 2011]. Rows: HPCx (NERC ~10% share); HECToR (NERC ~20% share); UKMO Shared (NERC <10% share). Labels on the timeline: Phase 1, Phase 2, Phase 3, Phase 4, Black Widow (Vector).

3 UM version 6.1 on HPCx, phase 2a: IPCC-like STASH and with climate meaning [performance figure]. 1 node on HPCx = 16 processors. UM atmosphere model resolutions, from low to high: N48 -> N96 -> N144 -> N216.

4 When will the NCAS service on HECToR be available?
1. HECToR service started on 16th October 2007.
2. NERC will provide initial HECToR allocation during the NERC HPC steering panel to be held 22nd November 2007.
3. NCAS service, via the PUMA UMUI, will start with UM versions 4.5 and 6.1.
4. NCAS service for UM version 6.6 may begin at Easter 2008, depending on Met Office delivery of new versions.

5 What is HECToR phase 1 service? A Cray XT4 with 11,328 cores, each of which acts as a single CPU, on which NERC has ~20% share of the allocation. The processors are 2.8 GHz AMD Opterons. HECToR has a total of 32 Tbytes of memory and a peak speed of 59 Tflops. The machine is run by Edinburgh (EPCC) and Daresbury, and so has the same administration process as HPCx, using SAFE (Service Administration From EPCC), and the same look and feel as HPCx. High level support is provided by NAG, which will cause a significant culture change for NCAS.
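The headline figures on this slide can be turned into per-core numbers, which are often more useful when sizing a UM job. The following is a rough sketch only: it assumes decimal units (1 Tbyte = 10^12 bytes, 1 Tflops = 10^12 flops) and treats the ~20% NERC share as an equivalent number of cores, which is an illustration rather than how the allocation is actually accounted.

```python
# Figures quoted on the slide for HECToR phase 1.
CORES = 11_328
MEMORY_BYTES = 32e12   # 32 Tbytes, assuming decimal units
PEAK_FLOPS = 59e12     # 59 Tflops, assuming decimal units
NERC_SHARE = 0.20      # "~20%", treated here as exactly 20%

# Derived per-core quantities.
mem_per_core_gb = MEMORY_BYTES / CORES / 1e9
peak_per_core_gflops = PEAK_FLOPS / CORES / 1e9

# Illustrative only: the NERC share expressed as a number of cores.
nerc_core_equivalent = round(NERC_SHARE * CORES)

print(f"memory per core : {mem_per_core_gb:.2f} Gbytes")
print(f"peak per core   : {peak_per_core_gflops:.2f} Gflops")
print(f"NERC share      : about {nerc_core_equivalent} cores")
```

Under these assumptions each core has a little under 3 Gbytes of memory, which is the number to keep in mind when choosing a decomposition for the higher resolutions.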

6 What is the HECToR service like compared to HPCx?
- it runs SUSE Linux (so we may need some script changes)
- it uses MPICH2 for the processor interconnect (so we need to look at the UM scalability issues)
- it has a new file system (so we need to explore UM I/O issues)
- it doesn't (yet?) have an archive system (this is being discussed with NERC, HECToR and EPSRC)
- it has 3 different compilers: PGI, Pathscale and GNU (there are many UM issues to explore with all these options)
- system software is controlled by modules (so we need to make changes to the UM setvars)
- job submission uses PBS (so we will make changes to UM scripts and the UMUI)
- parallel jobs are launched with aprun, not mpirun (so we have to change the UM scripts)
- no serial queue (yet!) (so we may have to change the way we compile the UM, and what about the simple models?)
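To give a feel for the script changes implied by the PBS and aprun points above, here is a minimal sketch in Python. The function names, the job name and the executable path are hypothetical and are not taken from the UM scripts. The `mppwidth` resource request is an assumption about how PBS is configured on a Cray XT system and should be checked against the site documentation.

```python
def to_aprun(command):
    """Rewrite a simple 'mpirun -np N exe ...' launch line as 'aprun -n N exe ...'.

    Only the launcher name and the first '-np' option are changed; any other
    mpirun options would need handling case by case.
    """
    parts = command.split()
    if parts and parts[0] == "mpirun":
        parts[0] = "aprun"
        if "-np" in parts:
            parts[parts.index("-np")] = "-n"
    return " ".join(parts)


def pbs_job_script(job_name, n_cores, walltime, executable):
    """Return the text of a minimal PBS job script that launches with aprun.

    'mppwidth' is assumed to be the resource name for the number of cores.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l mppwidth={n_cores}",
        f"#PBS -l walltime={walltime}",
        "cd $PBS_O_WORKDIR",
        f"aprun -n {n_cores} {executable}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical usage: a 32-core test job.
print(to_aprun("mpirun -np 32 ./model.exe"))
print(pbs_job_script("umtest", 32, "01:00:00", "./model.exe"))
```

Generating the launch line in one place, rather than editing each script by hand, would make it easier to keep HPCx and HECToR variants of the same job in step.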

7 UM Compiler issues
Results from a UM version 4.5 code sample. Survey from Polyhedron Software.
Columns: Compiler; Compiler warning; a =
- Intel 7.1.0.42; "Initialisation of variable A more than once is an extension to standard Fortran 95"; 1.00000000000000 2.00000000000000 1.00000000000000
- Intel 8.0.0.46, Intel 8.1.0.28, Intel 9.0.0.21; (no warning shown); -1.00000000000000 1.00000000000000 1.00000000000000
- IBM xlf 8.1 and 10.0.02; "Variable a is initialized more than once"; 1.00000000000000000 2.00000000000000000 1.00000000000000000
- Pathscale 2.0.20; "Warning: Multiple DATA initialization of storage for A. Some initializations ignored"; 1. 2. 1.
- PGI PGF90/x86-64 Linux 7.0-4; "PGF90-W-0164-Overlapping data initializations of a"; 1.000000
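The point of the slide above is that the same code gives different values of a under different compilers. One simple way to make that visible during porting is to group compilers by the result they produce. The sketch below uses the values reported on the slide. The PGI row is left out because only one value survives in this transcript, and the function name is hypothetical.

```python
from collections import defaultdict

# Values of a reported on the slide, keyed by compiler.
# PGI is omitted: only a single value is legible in the transcript.
reported = {
    "Intel 7.1.0.42": (1.0, 2.0, 1.0),
    "Intel 8.0.0.46 / 8.1.0.28 / 9.0.0.21": (-1.0, 1.0, 1.0),
    "IBM xlf 8.1 and 10.0.02": (1.0, 2.0, 1.0),
    "Pathscale 2.0.20": (1.0, 2.0, 1.0),
}


def group_by_result(results):
    """Group compiler names by the tuple of values they produced."""
    groups = defaultdict(list)
    for compiler, values in results.items():
        groups[values].append(compiler)
    return dict(groups)


groups = group_by_result(reported)
for values, compilers in groups.items():
    print(values, "<-", ", ".join(compilers))

# More than one group means the code sample is not portable as written.
print("portable:", len(groups) == 1)
```

With the slide's data this gives two groups, with the later Intel releases on their own, which is the kind of disagreement the validation runs need to catch.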

8 Other NCAS UM issues
Compilers: f77/ftn, with the PGI compiler or the Pathscale compiler chosen by module switch.
Basic UM PGI options now selected after rounding problems
- we now need to look at portability/reproducibility
- do some validation runs
We are currently testing both compilers.
UM vn 4.5: 1) Hadam3 + Hadam3P 2) Hadcm3 + preind; QUEST? L64, Stochem, Moses 2.1, 2.2, Famous/QUEST, PRECIS, Hadrm3, Hadam4
UM vn 6.1: 1) Hadgem -> Hadgem1a 2) Higem -> Higem2 3) NUGAM…… 4) Weather jobs? 5) UKCA?
UM vn 6.3, 6.6……..

9

10 NCAS Plans for Porting UM to HECToR
Set up central UM userid hum
Install and test UM vn 6.1 and 4.5
Focus on portability, performance and scalability issues
- there are currently many different queues but we need to provide advice to users at different resolutions
Work out disk space strategy
- how are we going to manage users' personal archives?
- what do we need to do with ECMWF and UKMO data?
Design the FCM build system for HECToR for UM vn 6.3, 6.6
- timetable of the UK Met Office
- timetable for UKCA, CASCADE, Higem, GSUM, QUESM

11

12 Time spent (secs) for I/O - UM atmosphere N216 L38 [chart; labels: 3 Gbyte files, 1.6 Gbyte files, On HECToR]. I/O is an issue on different computers, hence GSUM will optimise I/O as well as provide a tuneable I/O strategy.

13 Current Issues
- Robustness of the system
- hardware still not that reliable but improving
- Lustre file system still having teething problems
- support rather 'green'
- No management committee in place to drive improvements
- No long term storage solution
- UM installation (vn 4.5 and vn 6.1) complete but validation is still not complete
- Higem run still running
- UKCA, chemistry solvers are taking 31 x HPCx !
- UMCET (ensemble framework) needs re-working
- UM vn 6.6 using FCM should be installed by Easter 2008

