1 The NERSC Global File System, NERSC, June 12th, 2006

2 Overview
NGF: What/Why/How
NGF Today
– Architecture
– Who's using it
– Problems/Solutions
NGF Tomorrow
– Performance improvements
– Reliability enhancements
– New file systems (/home)

3 What is NGF?

4 NERSC Global File System - what
What do we mean by a global file system?
– Available via standard APIs for file system access on all NERSC systems:
  POSIX
  MPI-IO
– We plan to extend that access to remote sites via future enhancements.
– High performance: NGF is seen as a replacement for our current file systems and is expected to meet the same high performance standards.
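To make the access model above concrete, here is a minimal MPI-IO sketch in C. The path under /project and the directory name are made up for the example; the point is only that every rank on any NGF-mounted system can open the same file through the standard MPI-IO API and write to its own offset.

    /* Minimal MPI-IO sketch: each rank writes its own block of one shared file.
     * The /project path below is hypothetical; any directory on the global
     * file system would behave the same way from every NERSC system.
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        MPI_File fh;
        int rank;
        char buf[64];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Fixed-width format so every rank writes the same number of bytes. */
        snprintf(buf, sizeof(buf), "hello from rank %6d\n", rank);

        /* Every rank opens the same file on the shared file system. */
        MPI_File_open(MPI_COMM_WORLD, "/project/demo/hello.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Each rank writes at a disjoint offset, so no coordination is needed. */
        MPI_File_write_at(fh, (MPI_Offset)rank * (MPI_Offset)strlen(buf),
                          buf, (int)strlen(buf), MPI_CHAR, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }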

5 NERSC Global File System - why
Increase user productivity
– Reduce users' data management burden.
– Enable/simplify workflows involving multiple NERSC computational systems.
– Accelerate the adoption of new NERSC systems: users have access to all of their data, source code, scripts, etc. the first time they log into the new machine.
Enable more flexible/responsive management of storage
– Increase capacity/bandwidth on demand.

6 NERSC Global File System - how
– Parallel network/SAN heterogeneous access model
– Multi-platform (AIX/Linux for now)

7 NGF Today

8 NGF current architecture
– NGF is a GPFS file system using GPFS multi-cluster capabilities.
– Mounted on all NERSC systems as /project.
– External to all NERSC computational clusters: a small Linux server cluster managed separately from the computational systems.
– 70 TB user-visible storage, 50+ million inodes, 3 GB/s aggregate bandwidth.

9 NGF Current Configuration

10 /project
– Limited initial deployment: no homes, no /scratch.
– Projects can include many users, potentially using multiple systems (mpp, vis, …), and seemed to be prime candidates to benefit from the NGF shared data access model.
– Backed up to HPSS bi-weekly; will eventually receive nightly incremental backups.
– Default project quota: 1 TB and 250,000 inodes.
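As a rough sketch of how overall capacity and inode usage of a mount point like /project can be read, here is a small C program using the standard POSIX statvfs() call. This reports file-system-wide figures (as quoted on the next slide), not the per-project GPFS quotas, which are enforced and inspected with GPFS's own quota tools and are not shown here.

    /* Sketch: report capacity and inode usage of a mount point such as
     * /project via POSIX statvfs().  File-system-wide usage only; per-project
     * quota accounting lives in GPFS, not in this interface.
     */
    #include <stdio.h>
    #include <sys/statvfs.h>

    int main(void)
    {
        struct statvfs vfs;

        if (statvfs("/project", &vfs) != 0) {
            perror("statvfs");
            return 1;
        }

        double total_tb = (double)vfs.f_blocks * vfs.f_frsize / 1e12;
        double avail_tb = (double)vfs.f_bavail * vfs.f_frsize / 1e12;

        printf("capacity: %.1f TB, available: %.1f TB\n", total_tb, avail_tb);
        printf("inodes:   %llu total, %llu available\n",
               (unsigned long long)vfs.f_files,
               (unsigned long long)vfs.f_favail);
        return 0;
    }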

11 /project – 2
Current usage:
– 19.5 TB used (28% of capacity)
– 2.2 M inodes used (5% of capacity)
NGF /project is currently mounted on all major NERSC systems (1240+ clients):
– Jacquard, LNXI Opteron system running SLES 9
– Da Vinci, SGI Altix running SLES 9 Service Pack 3, with direct storage access
– PDSF, IA32 Linux cluster running Scientific Linux
– Bassi, IBM Power5 running AIX 5.3
– Seaborg, IBM SP running AIX 5.2

12 /project – problems & solutions
/project has not been without its problems:
– Software bugs
  2/14/06 outage due to Seaborg gateway crash: problem reported to IBM; new PTF with fix installed.
  GPFS on AIX 5.3 ftruncate() error on compiles: problem reported to IBM; efix now installed on Bassi.
– Firmware bugs
  Fibre Channel switch bug: firmware upgraded.
  DDN firmware bug (triggered on rebuild): firmware upgraded.
– Hardware failures
  Dual disk failure in RAID array: more exhaustive monitoring of disk health, including soft errors, is now in place.

13 NGF – solutions
General actions taken to improve reliability:
– Pro-active monitoring: see the problems before they're problems.
– Procedural development: decrease time to problem resolution; perform maintenance without outages.
– Operations staff activities: decrease time to problem resolution.
– PMRs filed and fixes applied: prevent problem recurrence.
– Replacing old servers: remove hardware with demonstrated low MTBF.
NGF availability since 12/1/05: ~99% (total down time: 2439 minutes)

14 Current Project Information
Projects using the /project file system (46 projects to date):
– narccap: North American Regional Climate Change Assessment Program – Phil Duffy, LLNL
  Currently using 4.1 TB.
  Global model with fine resolution in 3D and time; will be used to drive regional models.
  Currently using only Seaborg.
– mp107: CMB Data Analysis – Julian Borrill, LBNL
  Currently using 2.9 TB.
  Concerns about quota management and performance.
  16 different file groups.

15 Current Project Information
Projects using the /project file system (cont.):
– incite6: Molecular Dynameomics – Valerie Daggett, UW
  Currently using 2.1 TB.
– snaz: Supernova Science Center – Stan Woosley, UCSC
  Currently using 1.6 TB.

16 Other Large Projects
Project     PI                   Usage
snap        Saul Perlmutter      922 GB
aerosol     Catherine Chuang     912 GB
acceldac    Robert Ryne          895 GB
vorpal      David Bruhwiler      876 GB
m526        Peter Cummings       759 GB
gc8         Martin Karplus       629 GB
incite7     Cameron Geddes       469 GB

17 NGF Performance
– Many users have reported good performance for their applications (little difference from /scratch).
– Some applications show variability of read performance (MADCAP/MADbench); we are actively investigating this.

18 MADbench Results
Operation                Min     Max     Mean    StdDev
Bassi Home Read          12.3    35.3    22.0    3.5
Bassi Home Write         28.2    46.5    32.9    1.8
Bassi Scratch Read       2.6     27.1    3.3     1.6
Bassi Scratch Write      1.2     8.5     2.0     0.5
Bassi Project Read       10.9    245.2   56.7    58.0
Bassi Project Write      8.5     21.7    9.8     0.9
Seaborg Home Read        33.8    103.9   41.3    6.9
Seaborg Home Write       17.8    22.9    19.3    0.9
Seaborg Scratch Read     24.8    56.5    37.8    2.4
Seaborg Scratch Write    4.9     14.0    10.4    1.8
Seaborg Project Read     34.9    261.2   56.2    34.7
Seaborg Project Write    13.9    135.5   17.1    7.9

19 Bassi Read Performance

20 Bassi Write Performance

21 Current Architecture Limitations
NGF performance is limited by the architecture of current NERSC systems:
– Most NGF I/O uses the GPFS TCP/IP storage access protocol; only Da Vinci can access NGF storage directly via FC.
– Most NERSC systems have limited IP bandwidth outside of the cluster interconnect:
  Jacquard has 1 gig-e per I/O node, each compute node uses only 1 I/O node for NGF traffic, and 20 I/O nodes feed into one 10 Gb Ethernet link.
  Seaborg has 2 gateways with 4x gig-e bonds; again, each compute node uses only 1 gateway.
  Bassi nodes each have 1-gig interfaces, all feeding into a single 10 Gb Ethernet link.

22 NGF tomorrow (and beyond …)

23 Performance Improvements
– NGF client system performance upgrades: increase client bandwidth to NGF via hardware and routing improvements.
– NGF storage fabric upgrades: increase the bandwidth and port count of the NGF storage fabric to support future systems.
– Replace old NGF servers: new servers will be more reliable and 10 Gb Ethernet capable.
– New systems will be designed to support high performance access to NGF.

24 NGF /home
We will deploy a shared /home file system in 2007:
– Initially the home file system for only one system; it may be mounted on others.
– New systems thereafter will all have home directories on NGF /home.
– It will be a new file system with tuning parameters configured for small file accesses.

25 /home layout – decision slide
Two options:
1. A user's login directory is the same for all systems:
   /home/matt/
2. A user's login directory is a different subdirectory of the user's directory for each system:
   /home/matt/seaborg
   /home/matt/jacquard
   /home/matt/common
   /home/matt/seaborg/common -> ../common

26 One directory for all
Users see exactly the same thing in their home directory every time they log in, no matter what machine they're on.
Problems:
– Programs sometimes change the format of their configuration files (dotfiles) from one release to another without changing the file's name.
– Setting $HOME affects all applications, not just the one that needs different config files.
– Programs have been known to use getpwnam() to determine the user's home directory and look there for config files rather than in $HOME.
– Setting $HOME essentially emulates the effect of having separate home dirs for each system.
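To make the getpwnam() point above concrete, here is a small C sketch. It assumes a typical login environment where $USER holds the login name; a program that resolves the home directory through the password database always sees the real home directory, so overriding $HOME does not redirect it.

    /* Why overriding $HOME is not enough: getpwnam() ignores the environment
     * and returns the home directory recorded in the password database, so
     * the two paths printed here can differ when $HOME has been pointed at a
     * per-system subdirectory.
     */
    #include <sys/types.h>
    #include <pwd.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *env_home = getenv("HOME");   /* honors a user override */
        const char *user = getenv("USER");       /* login name, assumed to be set */
        struct passwd *pw = user ? getpwnam(user) : NULL;

        printf("$HOME:            %s\n", env_home ? env_home : "(unset)");
        printf("passwd home dir:  %s\n", pw ? pw->pw_dir : "(lookup failed)");
        return 0;
    }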

27 One directory per system
– By default, users start off in a different directory on each system.
– Dotfiles are different on each system unless the user uses symbolic links to make them the same.
– All of a user's files are accessible from all systems, but a user may need to "cd ../seaborg" to get at files he created on Seaborg if he's logged into a different system.

28 NGF /home conclusion
We currently believe that the multiple-directories option will result in fewer problems for the users, but we are actively evaluating both options. We would welcome user input on the matter.

29 NGF /scratch
We plan to deploy a shared /scratch for NERSC-5 sometime in 2008.

