Parallel I/O in CMAQ
David Wong, C. E. Yang*, J. S. Fu*, K. Wong*, and Y. Gao**
*University of Tennessee, Knoxville, TN, USA
**now at: Pacific Northwest National Laboratory, Richland, WA, USA
CMAS Conference, October 5, 2015

Background
- IOAPI was developed over 20 years ago
- It was designed for serial operation (with OpenMP capability)
- CMAQ uses it to perform data input and output
- The IOAPI package provides a simple, convenient interface to data in netCDF format (see the sketch below)
- I/O has been reported as one of the performance bottlenecks in CMAQ
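For reference, the serial interface mentioned above looks roughly like the following: one IOAPI call moves an entire gridded variable for one time step. This is only an illustrative sketch, not CMAQ source code; the logical file name CONC_IN, the variable O3, and the grid dimensions are placeholder assumptions.

```fortran
program ioapi_serial_sketch
!  Sketch of the serial IOAPI (M3IO) interface used by CMAQ today:
!  a single READ3 call reads a whole gridded variable for one time step.
!  File name, variable name, and dimensions are placeholders.
   use m3utilio            ! IOAPI-3 interfaces: OPEN3, READ3, M3EXIT, ...
   implicit none

   integer, parameter :: ncols = 459, nrows = 299, nlays = 35
   real    :: o3( ncols, nrows, nlays )
   integer :: jdate, jtime

   jdate = 2015001         ! model date, YYYYDDD
   jtime = 0               ! model time, HHMMSS

   if ( .not. open3( 'CONC_IN', FSREAD3, 'sketch' ) ) then
      call m3exit( 'sketch', jdate, jtime, 'cannot open CONC_IN', 1 )
   end if

   ! read all layers of one variable at one time step (serial, whole grid)
   if ( .not. read3( 'CONC_IN', 'O3', ALLAYS3, jdate, jtime, o3 ) ) then
      call m3exit( 'sketch', jdate, jtime, 'read of O3 failed', 1 )
   end if
end program ioapi_serial_sketch
```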

PARIO
- Developed in 1998
- Built on top of IOAPI as a "pseudo" parallel implementation
- Single read and single write

CMAQ I/O Scheme (figure)

CMAQ I/O Scheme, cont'd (figure)
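The two figures above depict the current scheme: one processor performs the actual file read or write, and the data are distributed to, or collected from, the other processors with message passing. Below is a minimal MPI sketch of that single-read-then-scatter idea; it is not the PARIO code, and the flattened field size and equal-block decomposition are assumptions made for illustration.

```fortran
program single_reader_sketch
!  Sketch of the "single read" pattern: rank 0 reads the whole field
!  (e.g., with the serial IOAPI READ3 call) and scatters equal-sized
!  blocks to the other ranks.  Writing reverses the pattern: gather
!  onto rank 0, then a single WRITE3.  Not the actual PARIO code.
   use mpi
   implicit none
   integer, parameter :: nglobal = 459*299*35   ! CONUS-like field, flattened
   integer :: ierr, mype, nprocs, nlocal
   real, allocatable :: global(:), local(:)

   call MPI_Init( ierr )
   call MPI_Comm_rank( MPI_COMM_WORLD, mype, ierr )
   call MPI_Comm_size( MPI_COMM_WORLD, nprocs, ierr )

   nlocal = nglobal / nprocs        ! assumes nprocs evenly divides nglobal
   allocate( global(nglobal), local(nlocal) )

   if ( mype == 0 ) then
      ! only rank 0 touches the file; stand-in for the data it just read
      global = 0.0
   end if

   ! hand each rank its block (MPI_Scatterv would handle uneven blocks)
   call MPI_Scatter( global, nlocal, MPI_REAL, &
                     local,  nlocal, MPI_REAL, &
                     0, MPI_COMM_WORLD, ierr )

   call MPI_Finalize( ierr )
end program single_reader_sketch
```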

What is pnetCDF?
- Developed by Northwestern University and Argonne National Laboratory
- Built on top of MPI-2 (see the write sketch below)
- Based on the netCDF format
- Requires a parallel file system (e.g., Lustre, GPFS)
- Publicly available free software
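In the new scheme, every processor writes its own portion of a variable directly into the shared file through PnetCDF's collective interface. The program below is a minimal sketch under assumed conditions (a CONUS-sized grid, a toy one-dimensional row decomposition in which the rank count evenly divides the rows, and no error checking); it is not the CMAQ implementation.

```fortran
program pnetcdf_write_sketch
!  Sketch of a collective PnetCDF write: every MPI rank writes its own
!  subdomain of a (col, row, layer) field into one shared file.
!  Toy row decomposition; error checking omitted for brevity.
   use mpi
   implicit none
   include 'pnetcdf.inc'

   integer :: ierr, rank, nprocs, ncid, varid
   integer :: dimid(3)
   integer(kind=MPI_OFFSET_KIND) :: gncols, gnrows, gnlays
   integer(kind=MPI_OFFSET_KIND) :: start(3), count(3)
   real, allocatable :: buf(:,:,:)

   call MPI_Init( ierr )
   call MPI_Comm_rank( MPI_COMM_WORLD, rank, ierr )
   call MPI_Comm_size( MPI_COMM_WORLD, nprocs, ierr )

   gncols = 459; gnrows = 299; gnlays = 35      ! CONUS-like grid (assumed)

   ! all ranks create and define the shared file collectively
   ierr = nfmpi_create( MPI_COMM_WORLD, 'conc.nc', NF_CLOBBER, &
                        MPI_INFO_NULL, ncid )
   ierr = nfmpi_def_dim( ncid, 'COL', gncols, dimid(1) )
   ierr = nfmpi_def_dim( ncid, 'ROW', gnrows, dimid(2) )
   ierr = nfmpi_def_dim( ncid, 'LAY', gnlays, dimid(3) )
   ierr = nfmpi_def_var( ncid, 'O3', NF_FLOAT, 3, dimid, varid )
   ierr = nfmpi_enddef( ncid )

   ! toy 1-D decomposition: each rank owns a band of rows
   ! (assumes nprocs evenly divides gnrows)
   count = (/ gncols, gnrows / nprocs, gnlays /)
   start = (/ 1_MPI_OFFSET_KIND, rank * count(2) + 1, 1_MPI_OFFSET_KIND /)
   allocate( buf( count(1), count(2), count(3) ) )
   buf = real( rank )                           ! stand-in for model data

   ! collective write: each rank contributes its own hyperslab
   ierr = nfmpi_put_vara_real_all( ncid, varid, start, count, buf )
   ierr = nfmpi_close( ncid )
   call MPI_Finalize( ierr )
end program pnetcdf_write_sketch
```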

Parallel I/O Scheme (figure)

Amount of Data in a Processor (figure)

With Data Aggregation (figure: aggregation along a column, "cc", and along a row, "cr")

With Data Aggregation, cont'd (figure)
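The aggregation shown in the figures can be realized with MPI sub-communicators: the ranks that share a row (or column) of the processor grid first gather their subdomain data onto one aggregator rank, and only the aggregators take part in the collective PnetCDF write, so fewer, larger requests reach the file system. The sketch below shows row-wise aggregation only; the NPCOL value, the equal subdomain sizes, and the row-major rank layout are assumptions, not the CMAQ/PARIO implementation.

```fortran
program row_aggregation_sketch
!  Sketch of the row-aggregation ("cr") step: the NPCOL ranks in each
!  processor-grid row gather their data onto the rank-0 member of the
!  row, which would then issue the collective PnetCDF write for the row.
!  NPCOL, buffer sizes, and the row-major rank layout are assumptions.
   use mpi
   implicit none
   integer, parameter :: npcol  = 4        ! ranks per processor-grid row (assumed)
   integer, parameter :: nlocal = 1000     ! values per rank (placeholder)
   real    :: local(nlocal)
   real, allocatable :: rowbuf(:)
   integer :: ierr, mype, color, row_comm, row_rank

   call MPI_Init( ierr )
   call MPI_Comm_rank( MPI_COMM_WORLD, mype, ierr )
   local = real( mype )                    ! stand-in for the subdomain data

   ! one communicator per processor-grid row (row-major rank layout assumed)
   color = mype / npcol
   call MPI_Comm_split( MPI_COMM_WORLD, color, mype, row_comm, ierr )
   call MPI_Comm_rank( row_comm, row_rank, ierr )

   ! the row aggregator (row_rank 0) collects the whole row; blocks are
   ! equal-sized here, so MPI_Gather suffices (MPI_Gatherv otherwise)
   allocate( rowbuf(nlocal * npcol) )
   call MPI_Gather( local,  nlocal, MPI_REAL, &
                    rowbuf, nlocal, MPI_REAL, &
                    0, row_comm, ierr )

   ! ...row_rank 0 would now call nfmpi_put_vara_real_all with rowbuf...

   call MPI_Finalize( ierr )
end program row_aggregation_sketch
```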

Experiment Setup
- Three domain sizes were considered: CA (89 x 134), Eastern US (279 x 299), and US CONUS (459 x 299)
- 35 layers and 146 species
- Read in or write out three time steps of data using three techniques: regular (serial), pnetCDF, and pnetCDF with data aggregation along a dimension
- Tested on two DOE machines, Edison and Kraken, both with the Lustre file system
- Experimented with various stripe counts (2, 5, 11) and stripe sizes (1M, 2M, 4M, 8M, 16M); one way to set these from within the code is sketched below
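Stripe settings can be controlled either on the output directory (with the Lustre lfs utility) or from within the code by attaching MPI-IO hints to the MPI_Info object passed to PnetCDF. The snippet below is only an illustrative sketch: the hint names "striping_factor" and "striping_unit" are standard ROMIO hints, and the values mirror the stripe count of 11 and stripe size of 2M quoted later for the CMAQ test, but this is not the experiment code itself.

```fortran
program striping_hint_sketch
!  Sketch: pass Lustre striping hints to PnetCDF through an MPI_Info
!  object instead of relying on the directory's default striping.
!  Hint values below mirror a stripe count of 11 and a 2 MB stripe size.
   use mpi
   implicit none
   include 'pnetcdf.inc'

   integer :: ierr, info, ncid

   call MPI_Init( ierr )

   call MPI_Info_create( info, ierr )
   call MPI_Info_set( info, 'striping_factor', '11', ierr )      ! stripe count
   call MPI_Info_set( info, 'striping_unit', '2097152', ierr )   ! 2 MB, in bytes

   ierr = nfmpi_create( MPI_COMM_WORLD, 'conc.nc', NF_CLOBBER, info, ncid )
   ! ...define dimensions/variables and write collectively as before...
   ierr = nfmpi_enddef( ncid )
   ierr = nfmpi_close( ncid )

   call MPI_Info_free( info, ierr )
   call MPI_Finalize( ierr )
end program striping_hint_sketch
```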

Simulation Domains (figure)

Experiment Code (figure)

Performance Metric (figure)

CA Case on Kraken (figure)

CA Case on Edison (figure)

EUS Case on Kraken (figure)

EUS Case on Edison (figure)

CONUS Case on Kraken (figure)

CONUS Case on Edison (figure)

pnetCDF Failure?! (figure)

With GPFS (figure: NOAA machine, 16x32, CONUS domain; time in sec for different I/O methods)

Test on CMAQ
- Tested with 120, 180, and 360 CPUs on Kraken and Edison, running a one-day simulation on the EUS 4-km domain
- Wrote out the VIS, WET_DEP, CONC, DRY_DEP, A_CONC, and S_CGRID files only
- Stripe count of 11 and stripe size of 2M

Timing (figure)

Summary
- True parallel I/O using pnetCDF delivers excellent performance
- The data aggregation technique further enhances pnetCDF performance
- The new technique has been applied to the CMAQ output operation and has shown significant improvement
- Collaborating with Dr. Carlie Coats to implement this in the IOAPI_3 library
- The current implementation only works with gridded files of the same size as the model domain