Present and Future Computing Requirements for Simulation and Analysis of Reacting Flow
John Bell, CCSE, LBNL
NERSC ASCR Requirements for 2017
January 15, 2014, LBNL

1. Simulation and Reacting Flow Overview
John Bell, Ann Almgren, Marc Day, Andy Nonaka / LBNL

Integrated approach to algorithm development for multiphysics applications
- Mathematical formulation: exploit the structure of the problems
- Discretization: match the discretization to the mathematical properties of the underlying processes
- Software / solvers: evolving development of a software framework to enable efficient implementation of applications
- Prototype applications: real-world testing of approaches

Current focus
- Coupling strategies for multiphysics
- Higher-order discretizations
- AMR for next-generation architectures

Application areas
- Combustion
- Environmental flows
- Astrophysics
- Micro/mesoscale fluid simulation

1. Simulation and Reacting Flow Overview -- cont'd

Target future applications
- Combustion: complex oxygenated fuels at high pressure; integration of simulation and experimental data to improve predictive capability
- Environmental flows: high-fidelity simulation of cloud physics using a low Mach number stratified flow model with detailed microphysics
- Astrophysics: 3D simulation of X-ray bursts on the surface of neutron stars with detailed nucleosynthesis; high-fidelity cosmological simulations
- Microfluidics: mesoscale modeling of non-ideal multicomponent complex fluids

2. Computational Strategies -- Overview

Core algorithm technology
- Finite volume discretization methods
- Geometric multigrid
- Block-structured AMR

Implemented in the BoxLib framework
- Class structure to support development of structured AMR methods
- Manages data distribution and load balancing
- Efficient metadata manipulation

Hybrid parallelization strategy (sketched in code below)
- Distribute patches to nodes using MPI
- Thread operations on patches using OpenMP

[Figure: Simulation of NOx emissions in a low swirl burner fueled by hydrogen. Effective resolution is …]
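The following is a minimal sketch of that hybrid strategy, not BoxLib's actual API: Patch, advancePatch, and the patch sizes are hypothetical stand-ins. Each MPI rank owns a subset of the AMR patches, and OpenMP threads share the patch-level work within a rank.

#include <mpi.h>
#include <omp.h>
#include <cstddef>
#include <vector>

struct Patch {
    std::vector<double> data;   // cell-centered field data for one box
};

// Placeholder for a finite-volume advance on a single patch.
static void advancePatch(Patch& p, double dt) {
    for (double& v : p.data) v += dt * 0.0;
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // In the real framework the metadata layer decides which boxes this
    // rank owns; here each rank simply builds a few local patches.
    std::vector<Patch> myPatches(
        4, Patch{std::vector<double>(64 * 64 * 64, static_cast<double>(rank))});

    const double dt = 1.0e-3;
    // Threads share the work on the patches local to this rank.
    #pragma omp parallel for schedule(dynamic)
    for (std::size_t i = 0; i < myPatches.size(); ++i)
        advancePatch(myPatches[i], dt);

    // Ghost-cell exchange between ranks (MPI communication) would follow here.
    MPI_Finalize();
    return 0;
}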

2. Computational Strategies -- Combustion

Formulation
- Low Mach number model derived from asymptotic analysis
- Removes acoustic wave propagation
- Retains compressibility effects due to thermal processes

Numerics
- Adaptive projection formulation
- Spectral deferred correction to couple processes
- Dynamic estimation of chemistry work for load balancing (see the sketch below)

Code: LMC

[Figure: Dimethyl ether jet at 40 atm.]
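The "dynamic estimation of chemistry work" item refers to using measured chemistry cost to redistribute boxes across ranks. The sketch below is a generic cost-weighted greedy assignment shown only to illustrate the idea; it is not the LMC or BoxLib load balancer, and assignBoxes is a hypothetical name.

#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// chemCost[b] = measured chemistry time for box b on the previous step.
// Returns owner[b] = rank assigned to box b.
std::vector<int> assignBoxes(const std::vector<double>& chemCost, int nranks) {
    std::vector<int> owner(chemCost.size(), 0);

    // Min-heap of (accumulated cost, rank): always give the next box to the
    // least loaded rank.
    using Load = std::pair<double, int>;
    std::priority_queue<Load, std::vector<Load>, std::greater<Load>> ranks;
    for (int r = 0; r < nranks; ++r) ranks.push({0.0, r});

    // Visit boxes from most to least expensive so large boxes are placed first.
    std::vector<std::size_t> order(chemCost.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return chemCost[a] > chemCost[b]; });

    for (std::size_t b : order) {
        auto [load, r] = ranks.top();
        ranks.pop();
        owner[b] = r;
        ranks.push({load + chemCost[b], r});
    }
    return owner;
}

In practice the measured costs change as the flame evolves, so a redistribution of this kind would be re-run periodically during the simulation.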

2. Computational Strategies -- Stratified flows

Formulation
- Low Mach number formulation with an evolving base state (schematic constraint below)
- General equation of state

Numerics
- Unsplit PPM
- Multigrid
- AMR

Code: MAESTRO

Application areas
- Astrophysics
- Atmospheric flows

[Figure: Simulation of convection leading up to ignition in a Chandrasekhar-mass white dwarf.]
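For reference, the velocity constraint in this kind of low Mach number model with an evolving base state can be written schematically as below. The notation is generic rather than quoted from the MAESTRO papers: here $\beta_0(r,t)$ is a density-like base-state factor and $S$ collects the local sources of compressibility (heating, reactions, compositional change).

\nabla \cdot \left( \beta_0 \, \mathbf{U} \right) = \beta_0 \, S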

2. Computational Strategies -- Astrophysics

Formulation
- CASTRO
  - Compressible flow formulation
  - Self-gravity
  - Models for turbulent flame propagation
  - Multigroup flux-limited diffusion
- NYX
  - CASTRO + collisionless particles to represent dark matter (particle-to-mesh step sketched below)

Numerics
- Unsplit PPM
- Multigrid
- AMR

[Figure: Baryonic matter from a Nyx simulation.]
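The "CASTRO + collisionless particles" line implies a particle-to-mesh step so that dark matter contributes to the self-gravity solve. Nyx itself uses smoother (cloud-in-cell style) deposition; the nearest-grid-point version below is only a simplified, hypothetical stand-in for that step.

#include <cmath>
#include <cstddef>
#include <vector>

struct Particle { double x, y, z, mass; };

// Deposit particle mass onto a flattened nx*ny*nz density field with cell
// size dx (nearest grid point; production codes use higher-order deposition).
void depositNGP(const std::vector<Particle>& parts, std::vector<double>& rho,
                int nx, int ny, int nz, double dx) {
    const double cellVol = dx * dx * dx;
    for (const Particle& p : parts) {
        const int i = static_cast<int>(std::floor(p.x / dx));
        const int j = static_cast<int>(std::floor(p.y / dx));
        const int k = static_cast<int>(std::floor(p.z / dx));
        if (i < 0 || i >= nx || j < 0 || j >= ny || k < 0 || k >= nz) continue;
        rho[(static_cast<std::size_t>(k) * ny + j) * nx + i] += p.mass / cellVol;
    }
}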

2. Computational Strategies -- Future directions

We expect our computational approach and/or codes to change (or not) by 2017 in these ways:
- Higher-order discretizations
- Spectral deferred corrections for multiphysics coupling
- Alternative time-stepping strategies
- More sophisticated hybrid programming model
- Integration of analysis with simulation
- Combining simulation with experimental data

3. Current HPC Usage (see slide notes)

Machines currently used (NERSC or elsewhere)
- Hopper, Edison, Titan

Hours used in … (list different facilities)
- NERSC: 21 million hours (MP111); OLCF: ??

Typical parallel concurrency and run time, number of runs per year
- 20K+ cores; one study is … hours; many smaller runs

Data read/written per run
- 1-2 TB / hour

Memory used per (node | core | globally)
- Hopper: 12 GB per node | 0.5 GB per core | 12 TB globally

Necessary software, services or infrastructure
- MPI / OpenMP / C++ and F90 / HPSS / parallel file system / VisIt / htar / hypre / PETSc

Data resources used (/scratch, HPSS, NERSC Global File System, etc.) and amount of data stored
- /scratch, HPSS, /project; … TB stored

4. HPC Requirements for 2017
(Key point is to directly link NERSC requirements to science goals.)

Compute hours needed (in units of Hopper hours)
- 500 M hours

Changes to parallel concurrency, run time, number of runs per year
- Increase of a factor of … in concurrency; longer run times; 2x the number of runs per year

Changes to data read/written
- 5x increase in data read/written

Changes to memory needed per (core | node | globally)
- We tend to select the number of cores based on available memory so that the problem will fit
- Need to maintain a reasonable level of memory per core and per node

Changes to necessary software, services or infrastructure
- Improved thread support
- A programming model that supports mapping data to cores within a node (respect NUMA)
- Tools to query how a job is mapped onto the machine so we can optimize communication
- Improved performance analysis tools

5. Strategies for New Architectures (1 of 2)

Does your software have CUDA/OpenCL directives; if yes, are they used, and if not, are there plans for this?
- No; some limited plans to move the chemistry integration to GPUs

Does your software run in production now on Titan using the GPUs?
- No

Does your software have OpenMP directives now; if yes, are they used, and if not, are there plans for this?
- We routinely use OpenMP for production runs

Does your software run in production now on Mira or Sequoia using threading?
- No

Is porting to, and optimizing for, the Intel MIC architecture underway or planned?
- Yes. We have developed a tiling-based implementation of one of our codes that achieved a speedup of a factor of 86 on a 61-core MIC (a generic tiling sketch follows below).
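For context, the tiling idea is roughly the following: instead of giving each thread a whole box, the box is broken into small tiles that fit in cache, and threads work over tiles. This sketch is illustrative only (smoothTiled and the placeholder kernel are hypothetical); it is not the code that produced the quoted speedup.

#include <omp.h>
#include <algorithm>
#include <cstddef>
#include <vector>

// Apply a placeholder kernel over an nx*ny*nz box in tile-sized chunks so
// that each thread's working set stays cache-resident.
void smoothTiled(std::vector<double>& phi, int nx, int ny, int nz, int tile) {
    auto idx = [=](int i, int j, int k) {
        return (static_cast<std::size_t>(k) * ny + j) * nx + i;
    };
    #pragma omp parallel for collapse(2) schedule(dynamic)
    for (int k0 = 0; k0 < nz; k0 += tile)
        for (int j0 = 0; j0 < ny; j0 += tile)
            for (int k = k0; k < std::min(k0 + tile, nz); ++k)
                for (int j = j0; j < std::min(j0 + tile, ny); ++j)
                    for (int i = 0; i < nx; ++i)
                        phi[idx(i, j, k)] *= 0.5;   // placeholder stencil/kernel
}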

5. Strategies for New Architectures (2 of 2)

Have there been, or are there now, other funded groups or researchers engaged to help with these activities?
- CS researchers funded by the ExaCT Co-design Center and various X-Stack projects have interacted with us on these activities

If you answered "no" to the questions above, please explain your strategy for transitioning your software to energy-efficient, manycore architectures
- N/A

What role should NERSC play in the transition to these architectures?
- Provide the high-quality tools needed to make this transition
- Support development of the new programming models needed to effectively implement algorithms on these types of architectures

What role should DOE and ASCR play in the transition to these architectures?
- Continue to fund applied math research groups working to develop algorithms for these architectures
- Provide support for software developed by these groups to facilitate availability of libraries / frameworks on new architectures

Other needs, considerations, or comments on the transition to manycore:

5. Special I/O Needs

Does your code use checkpoint/restart capability now?
- Yes

Do you foresee that a burst buffer architecture would provide significant benefit to you or users of your code?
- A burst buffer would be useful in two ways:
  1. Stage the latest checkpoint to the burst buffer before the job begins
  2. Write more frequent checkpoints to the burst buffer and migrate the last complete checkpoint to rotating disk at the end of the run (see the sketch below)

Scenarios for possible burst buffer use are in NERSC8-use-case-v1.2a.pdf
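As a rough illustration of use case 2 (the directory layout and function name are hypothetical, not part of our codes): frequent checkpoints are written to the burst buffer during the run, and only the last complete one is copied back to rotating disk at the end.

#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Copy the last complete checkpoint directory from the burst buffer to
// persistent storage; earlier burst-buffer checkpoints are simply dropped.
void migrateLastCheckpoint(const fs::path& burstBufferDir,
                           const fs::path& scratchDir,
                           const std::string& lastCompleteChk) {
    fs::copy(burstBufferDir / lastCompleteChk, scratchDir / lastCompleteChk,
             fs::copy_options::recursive | fs::copy_options::overwrite_existing);
}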

6. Summary

What new science results might be afforded by improvements in NERSC computing hardware, software and services?
- A significant increase in multiphysics simulation capability (speaking with the math hat on)

Recommendations on NERSC architecture, system configuration and the associated service requirements needed for your science
- Maintain system balance as much as possible
- Keep (at least) memory per node fairly large
- Aggressively pursue new programming models to facilitate intranode, fine-grained parallelization
- Aggressively pursue programming model support for in situ analysis

NERSC generally refreshes systems to provide on average a 2x performance increase every year. What significant scientific progress could you achieve over the next 5 years with access to 32x your current NERSC allocation?
- Higher-order AMR capability for target applications such as those discussed above
- Integration of simulation and experimental data to improve predictive capability

What "expanded HPC resources" are important for your project?
- ???

General discussion