
1 National Energy Research Scientific Computing Center
– Established 1974, the first unclassified supercomputer center
– Original mission: to enable computational science as a complement to magnetically controlled plasma experiments
NERSC's mission today: accelerate scientific discovery at the DOE Office of Science through high performance computing and extreme data analysis

2 NERSC: Production Computing for the DOE Office of Science
Diverse workload:
– 4,500 users, 600 projects
– 700 codes; hundreds of users daily
Allocations controlled primarily by DOE:
– 80% DOE Annual Production awards (ERCAP): from 10K hours to ~10M hours; proposal-based, DOE chooses
– 10% DOE ASCR Leadership Computing Challenge
– 10% NERSC reserve ("NISE")

3 DOE View of Workload: NERSC 2013 Allocations by DOE Office
– ASCR: Advanced Scientific Computing Research
– BER: Biological & Environmental Research
– BES: Basic Energy Sciences
– FES: Fusion Energy Sciences
– HEP: High Energy Physics
– NP: Nuclear Physics

4 Science View of Workload: NERSC 2013 Allocations by Science Area

5 ASCR Facilities

6 NERSC High End Computing and Storage Capabilities
Large-Scale Computing Systems
– Hopper (NERSC-6): early Cray Gemini system; 6,384 compute nodes, 153,216 cores; 144 Tflop/s on applications, 1.3 Pflop/s peak
– Edison (NERSC-7): early Cray Aries system (2013); over 200 Tflop/s on applications, 2 Pflop/s peak; 333 TB of memory, 6.4 PB of disk
HPSS Archival Storage
– 240 PB capacity; 5 tape libraries; 200 TB disk cache
NERSC Global Filesystem (NGF)
– Uses IBM's GPFS; 8.5 PB capacity; 15 GB/s of bandwidth
Midrange (275 Tflop/s peak)
– Carver: IBM iDataPlex cluster; 132 TF
– PDSF (HEP/NP): ~2,300-core cluster; 30 TF
– GenePool (JGI): ~8,200-core cluster; 113 TF; 2.1 PB Isilon file system
Analytics & Testbeds
– IBM x3850 with 1 TB and 2 TB nodes
– Dirac: 50 NVIDIA GPU nodes
– Jesup: IBM iDataPlex for data analytics and HTC

7 JGI Historical Usage
Repo m342 (PI: Eddy Rubin), usage by year and % charged*: 1.6 million (81%); 451,000 (95%); 5.6 million (86%); 10.9 million (92%); 1.4 million (<60%; 40% of the allocation taken away thus far)
Repo m1045 (PI: Victor Markowitz), usage by year and % charged*: 127,265 (99%); 10.4 million (97%); 20.7 million (94%); 1.1 million (<40%; 60% of the allocation taken away thus far)
*Note: % charged may differ from usage/allocation due to refunds

8 Why are we giving hours back?
System Instability
– JGI clusters were consolidated into Genepool; the system went through a period of instability (June-Sept 2012), and users relied more heavily on other systems and on checkpointing
– Some analysis jobs couldn't run on Genepool: Hadoop-style jobs and MPP work (RAxML, Ray) needed to be run on Carver/Hopper
System Expansion / Configuration
– Genepool doubled in compute power at the end of 2012; nodes are configured for both MPP and traditional JGI workloads (e.g. Hadoop jobs now run here)
System Stability Improvements
– Major improvements to the scheduler, the file systems, and user workloads have improved Genepool stability/availability; fewer jobs are being rerun
JGI's available hours before the expansion: 31,536,000 (GP1) + 4,905,600 (highmem) + 20,000,000 (NERSC) = 56,441,600
JGI's available hours after the expansion: 31,536,000 (GP1) + 30,835,200 (GP2) + 6,937,920 (highmem) + 20,000,000 (NERSC) = 89,309,120

9 Not everything can run on Hopper/Edison/Carver
Analysis that CAN run (and has been run) on Hopper or Carver: single large analysis runs, codes that can run at scale, jobs that tolerate long queue waits
– Phylogenetic tree reconstruction
– USEARCH
– HMMER
– BLAST (inefficient)
– Metagenome assembly research (MPI-based assemblers like Ray)
Analysis that is critical to JGI's mission and CANNOT currently run on Hopper or Carver: high-throughput, automated, slot-scheduled jobs/pipelines that require large-memory nodes, local disk, and external database access
– Hadoop
– Illumina pipeline
– SMRTPortal
– Fungal annotation pipeline
– RQC pipeline
– Jigsaw
– Large memory assemblies
– IMG production runs
Goals: determine which workflows could migrate to Hopper/Edison (e.g. Jigsaw); improve efficiency (particularly I/O) in existing workflows

10 Genepool is sufficient today, but what about next year? (IN PROGRESS)
Step 1: Collect data
– On Genepool (jobs run by program, type of analysis, time to complete, queue wait time): procmon
– On the file systems (amount of data created per job, access patterns of the job): NGF scripts
Step 2: Analyze the data
– Predict compute time needed per sample sequenced; define acceptable queue wait and turnaround times (math)
– Predict space needed per project (math)
Step 3: Sanity-check predictions
– Add columns to the LIMS system giving PMs the ability to enter predictions for compute and disk space needs
Goal: accurately predict that, given X compute nodes, we can analyze Y samples over the course of Z months (a rough capacity model is sketched below).
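As a rough illustration of the Step 2 math, the sketch below estimates sample throughput from a node count and a measured core-hours-per-sample figure. The constants (cores per node, core-hours per sample, the 80% utilization factor, the 30-day month) are illustrative assumptions, not measured JGI values; in practice they would come from the procmon and NGF data collected in Step 1.

```c
/* Minimal capacity-model sketch (illustrative assumptions only):
 * given X compute nodes, estimate how many samples (Y) can be
 * analyzed over Z months from a measured core-hours-per-sample. */
#include <stdio.h>

/* Assumed inputs -- replace with values measured by procmon / NGF scripts. */
#define CORES_PER_NODE        16.0   /* assumption: cores per Genepool node */
#define CORE_HOURS_PER_SAMPLE 250.0  /* assumption: from Step 1 data        */
#define UTILIZATION           0.80   /* assumption: usable fraction after
                                        queue wait, reruns, downtime        */

static double samples_per_period(double nodes, double months)
{
    double hours = months * 30.0 * 24.0;                /* wall-clock hours */
    double usable_core_hours = nodes * CORES_PER_NODE * hours * UTILIZATION;
    return usable_core_hours / CORE_HOURS_PER_SAMPLE;   /* predicted Y      */
}

int main(void)
{
    double nodes = 400.0;   /* X: compute nodes available */
    double months = 12.0;   /* Z: planning horizon        */
    printf("~%.0f samples in %.0f months on %.0f nodes\n",
           samples_per_period(nodes, months), months, nodes);
    return 0;
}
```

The same structure extends naturally to the disk-space prediction: swap core-hours-per-sample for bytes-per-sample and compare against available NGF capacity.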

11 Edison Quick Facts
First petaflop system with Intel "Ivy Bridge" processors and the Cray Aries high-speed network
– Nodes: 5,200 dual-socket, 64 GB memory each
– Processors: Intel "Ivy Bridge", 12-core, 2.4 GHz
– Network: Cray "Aries", Dragonfly topology
– Scratch disk: 6.4 PB with >140 GB/s bandwidth
– Peak / sustained: 2.4 PF / 260 TF
– Global network bandwidth: >11 TB/s
– Node memory bandwidth: >100 GB/s
High-impact results on day one: NERSC's users started running production codes immediately on Edison. Edison is very similar to Hopper, but with 2-5 times the performance per core on most codes. 408M MPP hours delivered in 2013 through Oct. 16.
NERSC-8 benchmark performance (chart)
Top projects: carbon sequestration, artificial photosynthesis, complex novel materials, cosmic background radiation analysis

12 Edison is the premier production computing platform for the DOE Office of Science
– New Cray XC30 with Intel Ivy Bridge processors and the Aries interconnect
– Designed to support HPC and data-intensive work
– Performs 2-4x Hopper per node on real applications
– Outstanding scalability for massively parallel apps
– Easy adoption for users: runs current apps unmodified
– Ambient cooled for extreme energy efficiency
System totals: 5,200 compute nodes; 124.5K processing cores; 333 terabytes of memory; 2.4 petaflops peak; 530 TB/s memory bandwidth; 11 TB/s global bandwidth; 1.3 MW per PF; 6.4 PB scratch at 140 GB/s

13 System Comparison: Hopper, Edison, Mira, Titan
– Hopper: 1.3 PF peak; 152,408 CPU cores; 217 TB total memory; 2 PB filesystem at 70 GB/s
– Edison: 2.4 PF peak; 124,800 CPU cores at 2.4 GHz; 333 TB total memory (64 GB per node); 530 TB/s aggregate memory bandwidth (STREAM), >100 GB/s per node; 6.4 PB filesystem at 140 GB/s
– Mira: 10 PF peak; 786,432 CPU cores; 35 PB filesystem at 240 GB/s
– Titan: 299,008 CPU cores plus 18,688 GPUs (0.7 GHz); 21.8 PF GPU peak; 32 GB CPU memory and 6 GB GPU memory per node (112 TB GPU memory total); 3,270 TB/s aggregate GPU memory bandwidth (STREAM), 175 GB/s per GPU; 10 PB filesystem at 240 GB/s
Also compared: peak bisection bandwidth (TB/s), floor space (sq ft), and power (MW running Linpack)

14 Supported Programming Languages, Models and Compilers
– Languages: Fortran; C, C++; UPC; Python, Perl, shells; Java; Chapel
– Programming models: MPI; OpenMP; Cray SHMEM; POSIX threads; POSIX shared memory; UPC; Coarray Fortran; Chapel
– Compilers: Intel (default); Cray; GNU
An example using one of these models is shown below.
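To make the list concrete, here is a minimal example of one supported model (OpenMP) in one supported language (C). It is a generic illustration, not a NERSC-provided code; the compile flag mentioned in the comments depends on which compiler module is loaded (for example, -fopenmp with the GNU compilers).

```c
/* Minimal OpenMP example in C: parallel sum of an array.
 * Generic illustration of one supported programming model; the build flag
 * depends on the loaded compiler (e.g. -fopenmp for GNU). */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N];
    double sum = 0.0;

    /* Fill the array and reduce the sum across threads. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        sum += a[i];
    }

    printf("threads available: %d, sum = %.1f\n",
           omp_get_max_threads(), sum);
    return 0;
}
```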

15 How to compile
Hopper and Edison are Cray supercomputers and have specialized compilers/compiler wrappers that are optimized for these systems (demo; a minimal example follows).
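As a stand-in for the live demo, the sketch below shows the kind of program the Cray compiler wrappers are meant for. The wrapper names (cc, CC, ftn) and the aprun launcher are standard on Cray systems of this generation; the specific compile and run lines in the comments are illustrative assumptions, not an official NERSC recipe.

```c
/* hello_mpi.c -- minimal MPI program for trying out the Cray wrappers.
 *
 * On Hopper/Edison the wrappers (cc for C, CC for C++, ftn for Fortran)
 * add the MPI and math libraries automatically, e.g. (illustrative):
 *     cc hello_mpi.c -o hello_mpi
 * and a parallel run inside a batch job would look something like:
 *     aprun -n 48 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's ID    */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total MPI processes  */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```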

16 How to compile (continued)
– Use modules to find available software (same as on Genepool)
– For the Cray and GNU programming environments, all Cray scientific and math libraries are available (compile as you would on Genepool)
– For the Intel programming environment, some libraries are different (contact the NERSC consultants if you have trouble with your build)
A library-linking sketch follows.
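The point about the Cray scientific and math libraries can be illustrated with a small BLAS call: with the Cray or GNU programming environment loaded, the wrapper is expected to pull in Cray LibSci (which provides BLAS/LAPACK) without extra link flags. This is a sketch under that assumption; the matrices, the manual dgemm_ prototype, and the compile line in the comment are just for illustration.

```c
/* libsci_demo.c -- tiny BLAS (dgemm) call to illustrate automatic library
 * linking. With PrgEnv-cray or PrgEnv-gnu loaded, compiling with the
 * wrapper, e.g. `cc libsci_demo.c -o libsci_demo` (illustrative), is
 * expected to link Cray LibSci's BLAS with no extra flags. */
#include <stdio.h>

/* Fortran BLAS routine provided by LibSci (column-major, by-reference). */
extern void dgemm_(const char *transa, const char *transb,
                   const int *m, const int *n, const int *k,
                   const double *alpha, const double *a, const int *lda,
                   const double *b, const int *ldb,
                   const double *beta, double *c, const int *ldc);

int main(void)
{
    /* 2x2 matrices in column-major order: C = A * I should equal A. */
    int n = 2;
    double alpha = 1.0, beta = 0.0;
    double a[4] = {1.0, 3.0, 2.0, 4.0};   /* A = [1 2; 3 4] */
    double b[4] = {1.0, 0.0, 0.0, 1.0};   /* B = identity   */
    double c[4] = {0.0, 0.0, 0.0, 0.0};

    dgemm_("N", "N", &n, &n, &n, &alpha, a, &n, b, &n, &beta, c, &n);

    printf("C = [%.1f %.1f; %.1f %.1f]\n", c[0], c[2], c[1], c[3]);
    return 0;
}
```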

