CCSM Performance, Successes and Challenges
Tony Craig, NCAR
RIST Meeting, March 12-14, 2002, Boulder, Colorado, USA

OUTLINE
Overview of CCSM, platforms
Individual component performance
Coupling, load balance, coupled performance
Challenges in the environment
Summary

CCSM Overview
CCSM = Community Climate System Model (NCAR)
Designed to evaluate and understand earth's climate, both historical and future.
Multiple executables (5):
–Atmosphere (CAM), MPI/OpenMP
–Ocean (POP), MPI
–Land (CLM2), MPI/OpenMP
–Sea Ice (CSIM4), MPI
–Coupler (CPL5), OpenMP
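
The atmosphere and land components combine MPI across processes with OpenMP threading within a node. As an illustration of that hybrid pattern only, not of the CCSM source itself, here is a minimal C sketch; the loop and its workload are hypothetical stand-ins for component physics.

/* Hypothetical sketch of the hybrid MPI/OpenMP pattern used by components
 * such as CAM and CLM2: MPI ranks decompose the domain, OpenMP threads
 * share the work within each rank.  Build with e.g. mpicc -fopenmp hybrid.c */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;
    /* Request thread support because OpenMP threads run inside each MPI rank. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double local_sum = 0.0;
    /* Threads split this rank's share of a stand-in workload. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < 1000; i++) {
        local_sum += (double)(rank * 1000 + i);
    }

    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("ranks=%d threads=%d sum=%g\n", nranks, omp_get_max_threads(), global_sum);

    MPI_Finalize();
    return 0;
}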

CCSM Platforms
Currently support:
–IBM Power3, Power4
–SGI Origin
–CPQ (not quite)
Future?
–Linux
–Vector platforms

CCSM Performance Summary
T42 resolution atm and land, 26 vertical levels in atm (128x64x26)
1 degree resolution ocean and ice, 40 vertical levels in ocean (320x384x40)
On 100 processors of an IBM Winterhawk II system (4 processors/node, 375 MHz clock, TBMX network):
–Model runs about 4 simulated years / day
–Requires about a month to run a century
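
The last two bullets are consistent with each other; as a worked check (plain arithmetic, nothing model-specific):

\[ \frac{100\ \text{simulated years}}{4\ \text{simulated years per wallclock day}} = 25\ \text{wallclock days} \approx 1\ \text{month} \]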

Component Timings

Component Scaling

Coupling
Multiple executables, running concurrently on unique processor sets.
All communication is root-to-root, via the coupler.
Coupling frequency is 1 hour for the atm, land, and ice models, 1 day for the ocean.
1 send and 1 receive for each component during each coupling interval.
Scientific requirement that the land and ice models compute fluxes for the atmosphere "first".
(Diagram: Atm, Ocean, Land, and Ice components arranged on Processors vs. Time axes.)
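
The coupling schedule just described can be made concrete with a small, runnable C sketch. This is an illustration only, not CCSM code: couple() is a hypothetical stand-in for the one root-to-root send and receive each component performs with the coupler, and the loop walks one simulated day to show the hourly atm/land/ice exchanges versus the daily ocean exchange, with land and ice coupling before the atmosphere.

#include <stdio.h>

/* Stand-in for one root-to-root exchange (1 send + 1 receive) with the coupler. */
static void couple(const char *component, int hour)
{
    printf("hour %2d: %-5s <-> cpl (1 send, 1 recv)\n", hour, component);
}

int main(void)
{
    for (int hour = 0; hour < 24; hour++) {
        /* Land and ice compute fluxes for the atmosphere "first" ... */
        couple("land", hour);
        couple("ice", hour);
        /* ... then the atmosphere exchanges with the coupler. */
        couple("atm", hour);
        /* The ocean exchanges with the coupler only once per simulated day. */
        if (hour == 0)
            couple("ocean", hour);
    }
    return 0;
}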

Figure 2: CCSM Load Balancing
Processor allocation: 40 ocean, 32 atm, 16 ice, 12 land, 4 cpl; 104 processors total.
(Chart of component timings in seconds per simulated day not reproduced in the transcript.)

Other Coupling Options: single executable, sequential
Single executable, components running sequentially on all processors
–No idle time
–Components must scale well to a large number of processors
(Diagram: Atm, Ocean, Land, and Ice executed one after another across all processors, on Processors vs. Time axes.)
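
A sketch of this single-executable, sequential layout, again as an illustration rather than a CCSM interface: every processor steps through each component in turn, so nothing sits idle, but each component must scale to the full processor count. The *_run() routines below are hypothetical placeholders.

#include <stdio.h>

/* Hypothetical per-timestep entry points; in this layout each runs on all processors. */
static void cpl_run(int step)   { printf("step %d: cpl   on all procs\n", step); }
static void land_run(int step)  { printf("step %d: land  on all procs\n", step); }
static void ice_run(int step)   { printf("step %d: ice   on all procs\n", step); }
static void atm_run(int step)   { printf("step %d: atm   on all procs\n", step); }
static void ocean_run(int step) { printf("step %d: ocean on all procs\n", step); }

int main(void)
{
    for (int step = 0; step < 3; step++) {  /* a few timesteps for illustration */
        cpl_run(step);     /* exchange fields among components */
        land_run(step);
        ice_run(step);
        atm_run(step);
        ocean_run(step);
    }
    return 0;
}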

Other Coupling Options: combined sequential/concurrent on overlapping processors
Single or multiple executables, components running sequentially and/or concurrently on overlapping processor sets.
–Optimal compromise between scaling and load balance.
–Requires a sophisticated coupler and improvements to hardware and machine software capabilities.
(Diagram: Atm, Ocean, Land, and Ice on overlapping processor sets, on Processors vs. Time axes.)

Challenges in the Environment
Batch environment; startup and control of multiple executables
Tools
–Debuggers are challenged by multiple-executable, MPI/OpenMP parallel models
–Timing and profiling tools are inadequate
–Compilers and libraries can be slow, buggy, and constantly changing
Machines are not well balanced: chip speed vs. interconnect vs. memory access and cache.

Performance Summary
Successes
–Single component scaling
–MPI/OpenMP parallelization of components
–Load balancing multiple executables
Challenges
–Scaling up to 1000s of processors for our standard problem
–Load balancing multiple executables
–Optimizing relative to cache, memory, network, chip, and parallelism
–RISC vs. vector architectures
–Environment