
1 I/O for Structured-Grid AMR Phil Colella Lawrence Berkeley National Laboratory Coordinating PI, APDEC CET

2 Block-Structured Local Refinement (Berger and Oliger, 1984) Refined regions are organized into rectangular patches. Refinement performed in time as well as in space.
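
As a concrete illustration of this organization, the sketch below (hypothetical types, not Berger-Oliger's or Chombo's actual classes) represents a hierarchy of levels, each a union of rectangular patches carrying its own refinement ratio and time step, so that refinement happens in time as well as in space.

```cpp
#include <vector>

// Illustrative sketch of a block-structured AMR hierarchy; the names
// (Patch, AMRLevel, AMRHierarchy) are hypothetical, not Chombo classes.
struct Patch {
  int lo[3];   // index of the lower corner of the rectangular patch
  int hi[3];   // index of the upper corner (inclusive)
};

struct AMRLevel {
  std::vector<Patch> patches;  // refined regions organized into rectangular patches
  int refRatio;                // refinement ratio relative to the next coarser level
  double dt;                   // level time step: refined in time as well as in space
};

struct AMRHierarchy {
  std::vector<AMRLevel> levels;  // levels[0] is the coarsest grid
};
```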

3 Stakeholders SciDAC projects: Combustion, astrophysics (cf. John Bell’s talk). MHD for tokamaks (R. Samtaney). Wakefield accelerators (W. Mori, E. Esarey). AMR visualization and analytics collaboration (VACET). AMR elliptic solver benchmarking / performance collaboration (PERI, TOPS). Other projects: ESL edge plasma project - 5D gridded data (LLNL, LBNL). Cosmology - AMR Fluids + PIC (F. Miniati, ETH). Systems biology - PDE in complex geometry (A. Arkin, LBNL). Larger structured-grid AMR community: Norman (UCSD), Abel (SLAC), Flash (Chicago), SAMRAI (LLNL)… We all talk to each other and have common requirements.

4 Chombo: a Software Framework for Block-Structured AMR Requirement: support a wide variety of block-structured AMR applications with a common software framework. Mixed-language model: C++ for higher-level data structures, Fortran for regular single-grid calculations. Reusable components: component design based on mapping mathematical abstractions to classes. Built on public domain standards: MPI, HDF5, VTK. Previous work: BoxLib (LBNL/CCSE), KeLP (Baden et al., UCSD), FIDIL (Hilfinger and Colella).
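
To make the mixed-language model concrete, here is a hedged sketch of the usual pattern: C++ owns the data structures and hands raw arrays plus bounds to a Fortran kernel for the regular single-grid work. The routine name `addscalar_` and its argument list are hypothetical; Chombo generates such bindings with its own preprocessing machinery, which is not shown here.

```cpp
#include <vector>

// Corresponding (hypothetical) Fortran kernel, compiled separately:
//   subroutine addscalar(data, nx, ny, scale)
//     integer nx, ny
//     double precision data(nx, ny), scale
//     data = data + scale
//   end subroutine
// Most Fortran compilers export this as `addscalar_` with all arguments
// passed by reference, which is what the declaration below assumes.
extern "C" void addscalar_(double* data, const int* nx, const int* ny,
                           const double* scale);

int main()
{
  const int nx = 16, ny = 16;
  std::vector<double> patch(nx * ny, 0.0);  // one rectangular patch
  const double scale = 2.0;
  // Higher-level bookkeeping stays in C++; the regular single-grid
  // arithmetic is done in the Fortran routine.
  addscalar_(patch.data(), &nx, &ny, &scale);
  return 0;
}
```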

5 Layered Design Layer 1. Data and operations on unions of boxes - set calculus, rectangular array library (with interface to Fortran), data on unions of rectangles, with SPMD parallelism implemented by distributing boxes over processors. Layer 2. Tools for managing interactions between different levels of refinement in an AMR calculation - interpolation, averaging operators, coarse-fine boundary conditions. Layer 3. Solver libraries - AMR-multigrid solvers, Berger-Oliger time-stepping. Layer 4. Complete parallel applications. Utility layer. Support and interoperability libraries - API for HDF5 I/O, visualization package implemented on top of VTK, C APIs.

6 Distributed Data on Unions of Rectangles Provides a general mechanism for distributing data defined on unions of rectangles onto processors, and for communication between processors. Metadata of which all processors have a copy: a BoxLayout is a collection of Boxes and their processor assignments. template<class T> LevelData<T> and other container classes hold data distributed over multiple processors. For each k = 1 ... nGrids, an “array” of type T corresponding to the box B_k is located on processor p_k. Straightforward APIs for copying, exchanging ghost-cell data, and iterating over the arrays on your processor in an SPMD manner.
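
A minimal usage sketch of these abstractions follows, written against Chombo-style class names; exact constructor signatures vary between releases, so treat it as illustrative rather than canonical.

```cpp
// Sketch of the Chombo-style usage described above; treat the signatures
// as approximate rather than an exact reflection of any Chombo release.
#include "Box.H"
#include "DisjointBoxLayout.H"
#include "FArrayBox.H"
#include "IntVect.H"
#include "LevelData.H"
#include "Vector.H"

void fillAndExchange(const Vector<Box>& boxes, const Vector<int>& procIDs)
{
  // Metadata replicated on every processor: the boxes plus their
  // processor assignments.
  DisjointBoxLayout grids(boxes, procIDs);

  // Data container: one FArrayBox per box B_k, stored on processor p_k,
  // with one component and one ghost cell in each direction.
  LevelData<FArrayBox> phi(grids, 1, IntVect::Unit);

  // SPMD iteration: each rank visits only the boxes it owns.
  for (DataIterator dit = grids.dataIterator(); dit.ok(); ++dit)
  {
    FArrayBox& fab = phi[dit()];
    fab.setVal(0.0);   // regular single-grid work on the local patch
  }

  // Fill ghost cells from neighboring boxes, communicating as needed.
  phi.exchange();
}
```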

7 Typical I/O requirements Loads are balanced to fill available memory on all processors. Typical output data size corresponding to a single time slice: 10% - 100% of the total memory image. Current problems scale to 100 - 1000 processors. Combustion and astrophysics simulations write one file per processor; other applications use the Chombo API for HDF5.

8 HDF5 I/O: Chombo API for HDF5 [Figure: HDF5 organizes a file by analogy to a disk, with “/” and groups as subdirectories and attributes and datasets as files.] Attribute: small metadata that multiple processes in an SPMD program can write out redundantly. Dataset: large data; each processor writes out only what it owns. [Figure: patch data owned by processors P0, P1, P2 laid out within a single dataset.] Parallel neutral: the processor layout can be changed when reading the output data back in. Dataset creation is expensive: create only one dataset for each LevelData. The data for each patch is written at offsets from the origin of that dataset.
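
The sketch below illustrates this pattern with the plain parallel HDF5 C API rather than the actual Chombo HDF5 routines: one dataset is created per LevelData, and each rank writes its patches into a hyperslab at a precomputed offset. The file and dataset names and the way the offset is obtained are assumptions for the example.

```cpp
#include <hdf5.h>
#include <mpi.h>
#include <vector>

// Illustrative use of parallel HDF5 mimicking the pattern on this slide;
// file/dataset names and the offset computation are hypothetical.
void writeLevel(MPI_Comm comm, const std::vector<double>& localData,
                hsize_t myOffset,     // precomputed start of this rank's patches
                hsize_t globalSize)   // total number of values in the LevelData
{
  // Open the file for parallel access.
  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
  hid_t file = H5Fcreate("level_0.hdf5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

  // One dataset for the whole LevelData: dataset creation is expensive,
  // so it is done once rather than once per patch.
  hid_t filespace = H5Screate_simple(1, &globalSize, NULL);
  hid_t dset = H5Dcreate2(file, "level_0_data", H5T_NATIVE_DOUBLE, filespace,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

  // Each rank selects the hyperslab at its own offset and writes only
  // the patch data it owns.
  hsize_t count = localData.size();
  H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &myOffset, NULL, &count, NULL);
  hid_t memspace = H5Screate_simple(1, &count, NULL);
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, H5P_DEFAULT,
           localData.data());

  H5Sclose(memspace);
  H5Sclose(filespace);
  H5Dclose(dset);
  H5Fclose(file);
  H5Pclose(fapl);
}
```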

9 Performance Analysis (Shan and Shalf, 2006) Observed performance of HDF5 applications in Chombo: no (weak) scaling. More detailed measurements indicate two causes: misalignment with disk block boundaries, and lack of aggregation.
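
Both causes correspond to tunable knobs in HDF5 and MPI-IO; the hedged sketch below shows the relevant property-list calls, with placeholder sizes rather than the settings Shan and Shalf studied or any fix adopted in Chombo.

```cpp
#include <hdf5.h>
#include <mpi.h>

// Illustrative HDF5 property-list settings touching the two causes above;
// the 64 KiB / 1 MiB figures are placeholders, not measured block sizes.
hid_t makeTunedAccessLists(MPI_Comm comm, hid_t* dxplOut)
{
  // Misalignment: ask HDF5 to place objects larger than 64 KiB on 1 MiB
  // boundaries, so large writes start at disk block boundaries.
  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
  H5Pset_alignment(fapl, 64 * 1024, 1024 * 1024);

  // Lack of aggregation: request collective transfers so MPI-IO can
  // aggregate many small per-patch writes into fewer large ones.
  hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
  H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

  *dxplOut = dxpl;   // pass dxpl to H5Dwrite calls, fapl to H5Fcreate/H5Fopen
  return fapl;
}
```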

10 Future Requirements Weak scaling to 10^4 processors. The need for finer time resolution will add another 10x in data. Other data types: sparse data, particles. One file per processor doesn’t scale. Interfaces to VACET, FastBit.

11 Potential for Collaboration with SDM Common AMR data API developed under SciDAC I. APDEC weak scaling benchmark for solvers could be extended to I/O. Minimum buy-in: high-level API, portability, sustained support.

