
1 Real Science at the Petascale
Radhika S. Saksena [1], Bruce Boghosian [2], Luis Fazendeiro [1], Owain A. Kenway, Steven Manos [1], Marco Mazzeo [1], S. Kashif Sadiq [1], James L. Suter [1], David Wright [1] and Peter V. Coveney [1]
1. Centre for Computational Science, UCL, UK
2. Tufts University, Boston, USA

2 Contents
New era of petascale resources
Scientific applications at petascale:
– Unstable periodic orbits in turbulence
– Liquid crystalline rheology
– Clay-polymer nanocomposites
– HIV drug resistance
– Patient-specific haemodynamics
Conclusions

3 New era of petascale machines
Ranger (TACC) - NSF-funded Sun cluster
0.58 petaflops (theoretical) peak: ~10 times HECToR (59 Tflops); bigger than all other TeraGrid resources combined
Linpack speed 0.31 petaflops; 123 TB memory
Architecture: 82 racks; 1 rack = 4 chassis; 1 chassis = 12 nodes; 1 node = Sun Blade x6420 (four quad-core AMD Opteron processors, 16 cores per node); 3,936 nodes = 62,976 cores
Intrepid (ALCF) - DOE-funded Blue Gene/P
0.56 petaflops (theoretical) peak
163,840 cores; 80 TB memory
Linpack speed 0.45 petaflops
Fastest machine available for open science and third fastest overall
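As a quick sanity check, Ranger's quoted node and core counts multiply out exactly as stated, and the peak figure follows too. A minimal sketch, assuming the 2.3 GHz quad-core Barcelona Opterons Ranger shipped with, at 4 flops/cycle/core:

    # Ranger configuration as quoted on the slide
    racks, chassis_per_rack, nodes_per_chassis = 82, 4, 12
    cores_per_node = 16  # four quad-core AMD Opterons per Sun Blade x6420

    nodes = racks * chassis_per_rack * nodes_per_chassis
    cores = nodes * cores_per_node
    print(nodes, cores)  # 3936 62976

    # Assumption: 2.3 GHz clock, 4 flops/cycle/core
    peak_pflops = cores * 2.3e9 * 4 / 1e15
    print(round(peak_pflops, 2))  # ~0.58, matching the quoted theoretical peak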

4 New era of petascale machines
US firmly committed to the path to petascale (and beyond)
NSF: Ranger (5 years, $59 million award)
University of Tennessee: to build a system with just under 1 PF peak performance ($65 million, 5-year project)
Blue Waters will come online in 2011 at NCSA ($208 million grant), using IBM technology, to deliver 10 Pflops peak performance (~200K cores, 10 PB of disk)

5 New era of petascale machines
We wish to do new science at this scale - not just incremental advances
Applications that scale linearly up to tens of thousands of cores (large system sizes, many time steps) - capability computing at petascale
High throughput for intermediate-scale applications (in the 128-512 core range)

6 Intercontinental HPC grid environment
[Diagram: AHE-mediated access linking UK NGS sites (Leeds, Manchester, Oxford, RAL), HPCx and HECToR to the US TeraGrid (PSC, SDSC, NCSA, TACC/Ranger, ANL/Intrepid) and DEISA over Lightpaths, supporting massive data transfers, advanced reservation/co-scheduling, and emergency/pre-emptive access]

7 Lightpaths - dedicated 1 Gb UK/US network
JANET Lightpath is a centrally managed service which supports large research projects on the JANET network by providing end-to-end connectivity, from 100s of Mb up to whole fibre wavelengths (10 Gb).
Typical usage:
– Dedicated 1 Gb network to connect to national and international HPC infrastructure
– Shifting TB datasets between the UK and US (see the worked estimate below)
– Real-time visualisation
– Interactive computational steering
– Cross-site MPI runs (e.g. between NGS2 Manchester and NGS2 Oxford)
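For a sense of why a dedicated link matters at these data volumes, a back-of-the-envelope estimate, assuming the full 1 Gb/s is sustained (real transfers rarely achieve line rate):

    # Time to move 1 TB (decimal, 1e12 bytes) over a saturated 1 Gb/s link
    dataset_bits = 1e12 * 8
    link_bps = 1e9
    hours = dataset_bits / link_bps / 3600
    print(round(hours, 1))  # ~2.2 hours per TB; a ~9 TB dataset is ~20 hours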

8 Advanced reservations
Plan in advance to have access to the resources - the process of reserving multiple resources for use by a single application
– HARC - Highly Available Resource Co-Allocator
– GUR - Grid Universal Remote
Can reserve the resources:
– For the same time: distributed MPIg/MPICH-G2 jobs, distributed visualization, booking equipment (e.g. visualization facilities)
– Or some coordinated set of times: computational workflows
Urgent computing and pre-emptive access (SPRUCE)
A toy co-allocation sketch follows.
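The co-allocation problem HARC and GUR address can be illustrated with a toy scheduler: given each resource's free windows, find the earliest slot available on all of them at once. This is purely illustrative and is not HARC's actual API:

    # Toy co-allocator: earliest common window of `length` hours across resources.
    # free[r] is a list of (start, end) hours when resource r is available.
    def earliest_common_slot(free, length):
        # The earliest common start always coincides with some window's start,
        # so it suffices to test those candidates in order.
        starts = sorted(s for windows in free.values() for s, _ in windows)
        for t in starts:
            if all(any(s <= t and t + length <= e for s, e in windows)
                   for windows in free.values()):
                return t
        return None

    free = {"Ranger": [(0, 6), (10, 24)],
            "Intrepid": [(4, 12), (18, 24)],
            "HPCx": [(5, 20)]}
    print(earliest_common_slot(free, 2))  # 10: all three are free in [10, 12)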

9 Advanced reservations
Also available via the HARC API - can be easily built into Java applications.
Deployed on a number of systems:
– LONI (ducky, bluedawg, zeke, neptune IBM p5 clusters)
– TeraGrid (NCSA, SDSC IA64 clusters, Lonestar, Ranger(?))
– HPCx
– North West Grid (UK)
– UK National Grid Service (NGS) - Manchester, Oxford, Leeds

10 Application Hosting Environment
Middleware which simplifies access to distributed resources and manages workflows
Wrestling with middleware can't be a limiting step for scientists - AHE hides the complexities of the grid from the end user
Applications are stateful web services
An application can consist of a coupled model, parameter sweep, steerable application, or a single executable
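The user-facing lifecycle can be sketched as follows. Every name here is hypothetical, invented for illustration; it mirrors the prepare/submit/monitor/retrieve pattern of hosted applications rather than AHE's real WSRF interface:

    # Hypothetical hosted-application client (NOT the real AHE API).
    # The service holds the state; the user never touches schedulers directly.
    import time

    def run_hosted_app(client, app, inputs, resource):
        job = client.prepare(app, resource)   # service creates a stateful app instance
        client.stage_in(job, inputs)          # inputs moved to the compute resource
        client.submit(job)                    # service talks to the local scheduler
        while client.status(job) not in ("DONE", "FAILED"):
            time.sleep(60)                    # user polls state held by the service
        return client.stage_out(job)          # results pulled back to the user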

11 HYPO4D [1] (Hydrodynamic periodic orbits in 4D)
Scientific goal: to identify and characterize periodic orbits in turbulent fluid flow, from which time averages can be computed exactly
Uses the lattice-Boltzmann method: highly scalable (linear scaling up to at least 33K cores on Intrepid and close to linear up to 65K)
[Scaling plots: a) Ranger, b) Intrepid + Surveyor (Blue Gene/P)]
1. L. Fazendeiro et al., A novel computational approach to turbulence, AHM08
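The lattice-Boltzmann method scales so well because each step is a purely local collision followed by nearest-neighbour streaming. A minimal single-phase D2Q9 BGK sketch, not HYPO4D's production kernel (which is 3D and MPI-parallel):

    import numpy as np

    # D2Q9 lattice: discrete velocities and weights
    c = np.array([(0,0),(1,0),(0,1),(-1,0),(0,-1),(1,1),(-1,1),(-1,-1),(1,-1)])
    w = np.array([4/9] + [1/9]*4 + [1/36]*4)
    tau = 0.6                                   # BGK relaxation time

    def lb_step(f):                             # f has shape (9, nx, ny)
        rho = f.sum(axis=0)                     # macroscopic density
        u = np.tensordot(c.T, f, axes=1) / rho  # macroscopic velocity (2, nx, ny)
        cu = np.tensordot(c, u, axes=1)         # c_i . u for each direction
        usq = (u**2).sum(axis=0)
        feq = rho * w[:, None, None] * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)
        f += (feq - f) / tau                    # local collision (BGK)
        for i, (cx, cy) in enumerate(c):        # streaming to nearest neighbours
            f[i] = np.roll(np.roll(f[i], cx, axis=0), cy, axis=1)
        return f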

12 HYPO4D [1] (Hydrodynamic periodic orbits in 4D)
Novel approach to turbulence studies: efficiently parallelizes time as well as space
The algorithm is extremely memory-intensive: full spacetime trajectories are numerically relaxed to a nearby minimum (an unstable periodic orbit)
Ranger is the ideal resource for this work (123 TB of RAM)
During the early-user period, millions of time steps for different systems were simulated and then compared for similarities: ~9 TB of data
1. L. Fazendeiro et al., A novel computational approach to turbulence, AHM08
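The relaxation idea is easiest to see in a toy setting: hold an entire length-T trajectory in memory at once and run gradient descent on the total closure error until it becomes a true periodic orbit. A minimal sketch for the logistic map (HYPO4D does the analogous thing with full Navier-Stokes spacetime fields, hence the enormous memory footprint):

    import numpy as np

    f  = lambda x: 4 * x * (1 - x)   # chaotic logistic map
    df = lambda x: 4 - 8 * x

    def relax_to_orbit(x, lr=5e-3, iters=200_000):
        # x is a guess trajectory of length T;
        # minimise sum_t (f(x_t) - x_{(t+1) mod T})^2 over the whole trajectory
        for _ in range(iters):
            r = f(x) - np.roll(x, -1)        # closure residual at each time step
            grad = 2 * r * df(x) - 2 * np.roll(r, 1)
            x -= lr * grad                   # relax the whole trajectory at once
        return x

    x = np.array([0.9, 0.35])                # rough guess at a period-2 orbit
    print(relax_to_orbit(x))                 # ~[0.9045, 0.3455]: the period-2 UPO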

13 LB3D [1]
LB3D - a three-dimensional lattice-Boltzmann solver for multi-component fluid dynamics, in particular amphiphilic systems
Mature code - 9 years in development; it has been extensively used on the US TeraGrid, UK NGS, HECToR and HPCx machines
The largest model simulated to date needs Ranger
1. R. S. Saksena et al., Petascale lattice-Boltzmann simulations of amphiphilic liquid crystals, AHM08

14 Cubic phase rheology results
Gyroidal system with multiple domains
Recent results include the tracking of large time-scale defect dynamics on large lattice-site systems; only possible on Ranger, due to the sustained core count and disk storage requirements
Regions of high stress magnitude are localized in the vicinity of defects
1. R. S. Saksena et al., Petascale lattice-Boltzmann simulations of amphiphilic liquid crystals, AHM08

15 LAMMPS [1]
Fully atomistic simulations of clay-polymer nanocomposites on Ranger
More than 85 million atoms simulated
Clay mineral studies, with ~3 million atoms, are 2-3 orders of magnitude greater than any previous study
Prospects: to include the edges of the clay (non-periodic boundaries) and model realistic-sized systems of at least 100 million atoms (~2 weeks wall clock using 4,096 cores; see the estimate below)
1. J. Suter et al., Grid-Enabled Large-Scale Molecular Dynamics of Clay Nano-materials, AHM08
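That prospective run translates into a substantial allocation; a quick estimate from the numbers on the slide:

    # 100M-atom clay model: ~2 weeks wall clock on 4,096 cores
    cores = 4096
    wall_hours = 2 * 7 * 24        # 336 hours
    print(cores * wall_hours)      # ~1.38 million core-hours per run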

16 HIV-1 drug resistance [1]
Goal: to study the effect of anti-retroviral inhibitors (targeting proteins in the HIV lifecycle, such as the viral protease and reverse-transcriptase enzymes)
High-end computational power to confer clinical decision support
On Ranger, up to 100 replicas (configurations) simulated for the first time, in some cases going to 100 ns: 6 microseconds in four weeks, with AHE-orchestrated workflows
3.5 TB of trajectory and free energy analysis
[Figure: binding energy differences compared with experimental results for wildtype and MDR proteases with inhibitors LPV and RTV, using 10 ns trajectories]
1. K. Sadiq et al., Rapid, Accurate and Automated Binding Free Energy Calculations of Ligand-Bound HIV Enzymes for Clinical Decision Support using HPC and Grid Resources, AHM08
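The replica-based workflow can be caricatured in a few lines: binding free energies are estimated per snapshot, then averaged over frames and replicas, so the many independent Ranger runs tighten the error bars. A schematic sketch only, with made-up numbers; the real calculation uses MM/PBSA-style energy decompositions:

    import numpy as np

    # energies[replica, frame] = per-snapshot estimate (kcal/mol) of
    # G_complex - G_protease - G_inhibitor from a 10 ns trajectory (toy data)
    rng = np.random.default_rng(0)
    energies = rng.normal(-12.0, 2.0, size=(100, 500))   # 100 replicas

    per_replica = energies.mean(axis=1)                  # average within each replica
    dg = per_replica.mean()                              # binding free energy estimate
    err = per_replica.std(ddof=1) / np.sqrt(len(per_replica))
    print(f"dG_bind = {dg:.2f} +/- {err:.2f} kcal/mol")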

17 GENIUS project [1]
Grid Enabled Neurosurgical Imaging Using Simulation (GENIUS)
Scientific goal: to perform real-time, patient-specific medical simulation
Combines blood flow simulation with clinical data
Fitting the computational time scale to the clinical time scale: capture the clinical workflow
How fast must results arrive to influence clinical decisions: 1 day? 1 week? GENIUS targets 15 to 30 minutes
1. S. Manos et al., Surgical Treatment for Neurovascular Pathologies Using Patient-specific Whole Cerebral Blood Flow Simulation, AHM08

18 GENIUS project [1]
Blood flow is simulated using the lattice-Boltzmann method (HemeLB)
A parallel ray tracer performs real-time in situ visualization: sub-frames are rendered on each MPI rank and composited before being sent over the network to a lightweight viewing client (see the sketch below)
Adding volume rendering reduces the scalability of the fluid solver because of the global communication it requires
Even so, datasets are rendered at more than 30 frames per second (pixel resolution)
1. S. Manos et al., Surgical Treatment for Neurovascular Pathologies Using Patient-specific Whole Cerebral Blood Flow Simulation, AHM08
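The compositing step can be sketched with mpi4py: each rank renders only its subdomain into a full-frame colour+depth buffer, and the root keeps, per pixel, the sample nearest the camera before shipping one small image to the viewing client. A toy version, not HemeLB's actual ray tracer:

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    h, w = 256, 256

    # Each rank fills colour where its subdomain projects; depth is inf elsewhere
    rgb = np.zeros((h, w, 3), dtype=np.float32)
    depth = np.full((h, w), np.inf, dtype=np.float32)
    # ... ray-trace this rank's piece of the fluid domain into (rgb, depth) ...

    frames = comm.gather((rgb, depth), root=0)    # sub-frames converge on rank 0
    if comm.rank == 0:
        out = np.zeros((h, w, 3), dtype=np.float32)
        best = np.full((h, w), np.inf, dtype=np.float32)
        for c, d in frames:
            closer = d < best                     # keep the nearest sample per pixel
            out[closer], best[closer] = c[closer], d[closer]
        # `out` is the composited frame sent to the lightweight viewing client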

19 Conclusions
A wide range of scientific research activities was presented that makes effective use of the new petascale resources available in the USA
These demonstrate the emergence of new science not possible without access to resources at this scale
Some existing techniques still hold, however: as several of these applications have shown, MPI scales linearly up to at least tens of thousands of cores
Future prospects: we are well placed to move onto the next machines coming online in the US and Japan

20 Acknowledgements
JANET/David Salmon
NGS staff
TeraGrid staff
Simon Clifford (CCS)
Jay Boisseau (TACC)
Lucas Wilson (TACC)
Pete Beckman (ANL)
Ramesh Balakrishnan (ANL)
Brian Toonen (ANL)
Prof. Nicholas Karonis (ANL)

