NCAR Cyberinfrastructure for Earth System Modeling Don Middleton NCAR Scientific Computing Division APAN eScience Workshop, Honolulu January 28, 2004.

Presentation transcript:

NCAR Cyberinfrastructure for Earth System Modeling Don Middleton NCAR Scientific Computing Division APAN eScience Workshop, Honolulu January 28, 2004

NCAR Cyberinfrastructure for Earth System Modeling
- Supercomputers
- High-bandwidth networks
- Models
- Data centers and Grids
- Collaboratories
- Analysis and Visualization

NCAR “Atkins Report”
- “A new age has dawned…”
“The Panel’s overarching recommendation is that the National Science Foundation should establish and lead a large-scale, interagency, and internationally coordinated Advanced Cyberinfrastructure Program (ACP) to create, deploy, and apply cyberinfrastructure in ways that radically empower all scientific and engineering research and allied education. We estimate that sustained new NSF funding of $1 billion per year is needed to achieve critical mass and to leverage the coordinated co-investment from other federal agencies, universities, industry, and international sources necessary to empower a revolution. The cost of not acting quickly or at a subcritical level could be high, both in opportunities lost and in increased fragmentation and balkanization of the research.”
Atkins Report, Executive Summary

NCAR Characteristics of Infrastructure (from Kim Mish workshop presentation)
- Essential
  - So important that it becomes ubiquitous
- Reliable
  - Example: the built environment of the Roman Empire
- Expensive
  - Nothing succeeds like excess (e.g. Interstate system)
  - Inherently one-off (often, few economies of scale)
- Clear factorization between research and practice
  - Generally deploy what provably works

NCAR A Global Coupled Climate Model

NCAR Climate Model Data Production
- T42 CCSM (current, 280 km)
  - 7.5 GB/yr, 100 years -> 0.75 TB
- T85 CCSM (140 km)
  - 29 GB/yr, 100 years -> 2.9 TB
- T170 CCSM (70 km)
  - 110 GB/yr, 100 years -> 11 TB
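
The totals on this slide follow from multiplying the yearly output rate by the run length. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB); the names `RATES_GB_PER_YEAR` and `run_volume_tb` are illustrative and not from the talk:

```python
# Output per simulated year in GB, as quoted on the slide.
RATES_GB_PER_YEAR = {"T42": 7.5, "T85": 29.0, "T170": 110.0}


def run_volume_tb(rate_gb_per_year, years=100, gb_per_tb=1000.0):
    """Total output of one run in TB (assumes decimal units: 1 TB = 1000 GB)."""
    return rate_gb_per_year * years / gb_per_tb


# Volume of a 100-year run at each resolution.
volumes = {name: run_volume_tb(rate) for name, rate in RATES_GB_PER_YEAR.items()}

for name, tb in volumes.items():
    print(f"{name}: {tb:.2f} TB per 100-year run")
```

Each halving of the grid spacing roughly quadruples the output rate, which is why the T170 run is about 15 times larger than the T42 run.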

NCAR Capacity-related Improvements
Increased turnaround, model development, ensemble of runs
Increase by a factor of 10, linear data
- Current T42 CCSM
  - 7.5 GB/yr, 100 years -> 0.75 TB * 10 = 7.5 TB

NCAR CCM at T170 Resolution

NCAR Capability-related Improvements
Spatial Resolution: T42 -> T85 -> T170
Increase by factor of ~10-20, linear data
Temporal Resolution: Study diurnal cycle, 3 hour data
Increase by factor of ~4, linear data
CCM3 at T170 (70 km)

NCAR Capability-related Improvements
Quality: Improved boundary layer, clouds, convection, ocean physics, land model, river runoff, sea ice
Increase by another factor of 2-3, data flat
Scope: Atmospheric chemistry (sulfates, ozone…), biogeochemistry (carbon cycle, ecosystem dynamics), Middle Atmosphere Model…
Increase by another factor of 10+, linear data

NCAR Model Improvement Wishlist Grand Total: Increase compute by a Factor O( )
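
The grand-total figure did not survive in this transcript. As a rough cross-check only, the sketch below multiplies the individual compute factors quoted on the preceding slides. It assumes the factors combine multiplicatively and treats "10+" as exactly 10, so the result illustrates the order of magnitude and is not the number that appeared on the slide. The dictionary labels are illustrative.

```python
from math import prod

# (low, high) compute multipliers as quoted on the capacity and capability slides.
FACTORS = {
    "capacity (turnaround, ensembles)": (10, 10),
    "spatial resolution (T42 -> T170)": (10, 20),
    "temporal resolution (3-hour data)": (4, 4),
    "quality (improved physics)": (2, 3),
    "scope (chemistry, biogeochemistry)": (10, 10),  # slide says "10+"; 10 is assumed
}

# Assumption: the factors are independent and multiply.
low = prod(lo for lo, _ in FACTORS.values())
high = prod(hi for _, hi in FACTORS.values())

print(f"Combined compute increase: roughly {low:,} to {high:,} times")
```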

NCAR Advances at the Earth Simulator ESC Climate Model at T1279 (approx. 10km)

NCAR We Will Examine Practically Every Aspect of the Earth System from Space in This Decade
- Longer-term Missions (Observation of Key Earth System Interactions): Terra, Aura, Aqua, Landsat 7
- Exploratory (Explore Specific Earth System Processes and Parameters and Demonstrate Technologies): GRACE, PICASSO, Cloudsat, QuikScat, EO-1, ICEsat, Jason-1, SRTM, VCL, Triana
Courtesy of Tim Killeen, NCAR

NCAR The Earth System Grid
- U.S. DOE SciDAC funded R&D effort: a “Collaboratory Pilot Project”
- Build an “Earth System Grid” that enables management, discovery, distributed access, processing, & analysis of distributed terascale climate research data
- Build upon Globus Toolkit and DataGrid technologies and deploy
- Potential broad application to other areas

NCAR ESG Team
- ANL
  - Ian Foster (PI)
  - Veronika Nefedova
  - (John Bresnahan)
  - (Bill Allcock)
- LBNL
  - Arie Shoshani
  - Alex Sim
- ORNL
  - David Bernholdt
  - Kasidit Chanchio
  - Line Pouchard
- LLNL/PCMDI
  - Bob Drach
  - Dean Williams (PI)
- USC/ISI
  - Anne Chervenak
  - Carl Kesselman
  - (Laura Pearlman)
- NCAR
  - David Brown
  - Luca Cinquini
  - Peter Fox
  - Jose Garcia
  - Don Middleton (PI)
  - Gary Strand

NCAR ESG Scenario
- End 2002: 1.2 million files comprising ~75 TB of data at NCAR, ORNL, LANL, NERSC, and PCMDI
- End 2007: As much as 3 PB (3,000 TB) of data (!)
- Current practice is already broken; the future will be even worse if something isn’t done…
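
To put the two holdings figures in perspective, the sketch below derives the implied growth rate. It assumes steady compound growth between the end of 2002 and the end of 2007 and uses the "as much as 3 PB" upper bound, so it describes a worst case rather than a forecast made in the talk.

```python
import math

start_tb, end_tb = 75.0, 3000.0  # holdings at end of 2002 and (upper bound) end of 2007
years = 2007 - 2002

total_factor = end_tb / start_tb               # overall growth over the period
annual_factor = total_factor ** (1.0 / years)  # assumed constant compound growth
doubling_months = 12.0 * math.log(2.0) / math.log(annual_factor)

print(f"Total growth: {total_factor:.0f}x over {years} years")
print(f"Implied annual growth: {annual_factor:.2f}x")
print(f"Implied doubling time: {doubling_months:.1f} months")
```

Under these assumptions the archive would double a little faster than once a year.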

NCAR ESG: Challenges
- Enabling the simulation and data management team
- Enabling the core research community in analyzing and visualizing results
- Enabling broad multidisciplinary communities to access simulation results
We need integrated scientific work environments that enable smooth WORKFLOW for knowledge development: computation, collaboration & collaboratories, data management, access, distribution, analysis, and visualization.

NCAR ESG: Strategies
- Harness a federation of sites, web portals
  - Globus Toolkit -> The Earth System Grid -> The UltraDataGrid
- Move data a minimal amount, keep it close to computational point of origin when possible
  - Data access protocols, distributed analysis
- When we must move data, do it fast and with a minimum amount of human intervention
  - Storage Resource Management, fast networks
- Keep track of what we have, particularly what’s on deep storage
  - Metadata and Replica Catalogs
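
The "move data a minimal amount" and "replica catalog" strategies can be illustrated with a toy lookup. This is a hypothetical sketch, not the ESG or Globus replica catalog API: the `Replica` class, the `CATALOG` contents, the host names, and the `choose_replica` function are all invented for illustration. It maps a logical file name to its known copies and prefers a copy at the local site, then a disk copy over one on deep storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str  # hosting site, e.g. "NCAR"
    tier: str  # "disk" or "tape" (deep storage)
    url: str   # physical location (hypothetical hosts)


# Hypothetical catalog: logical file name -> known physical copies.
CATALOG = {
    "ccsm/run01/atm_0001.nc": [
        Replica("NCAR", "tape", "gsiftp://mss.example.org/ccsm/run01/atm_0001.nc"),
        Replica("ORNL", "disk", "gsiftp://disk.example.org/ccsm/run01/atm_0001.nc"),
    ],
}


def choose_replica(logical_name, local_site, catalog=CATALOG):
    """Prefer a copy at the local site, then a disk copy over deep storage."""
    replicas = catalog.get(logical_name, [])
    if not replicas:
        raise KeyError(f"no replica registered for {logical_name!r}")
    # False sorts before True, so local copies and disk copies win.
    return min(replicas, key=lambda r: (r.site != local_site, r.tier != "disk"))


print(choose_replica("ccsm/run01/atm_0001.nc", local_site="LLNL").url)
```

A real deployment would also weigh network bandwidth and staging time from tape; the two-level sort key here only captures the ordering of preferences stated on the slide.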

NCAR Storage/Data Management
Tools for reliable staging, transport, and replication
[Figure: a Client (Selection, Control, Monitoring) with an HRM, and two Servers, each with an HRM and a Tera/Peta-scale Archive]

NCAR OPeNDAP An Open Source Project for a Network Data Access Protocol (originally DODS, the Distributed Oceanographic Data System)

NCAR OPeNDAP-g
- Transparency
- Performance
- Security
- Authorization
- (Processing)
[Figure: three data access paths, with labels as follows]
- Typical Application: Application, netCDF lib, Data (local)
- OPeNDAP via http: Application, OPeNDAP Client, OPeNDAP Server, Data (remote)
- OPeNDAP via Grid: Distributed Application, ESG client, ESG + DODS, ESG Server, Big Data (remote), Distributed Data Access Services
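
For the "OPeNDAP via http" path, a client asks the server for a subset of a remote variable by appending a response suffix and a constraint expression to the dataset URL. The sketch below builds such a URL in the DAP2 style (`.dds`, `.das`, or `.dods` suffix, then `?var[start:stride:stop]` per dimension). The server address, the variable name `TS`, and the helper `dap_url` are hypothetical; nothing here reflects the Grid-enabled OPeNDAP-g transport itself.

```python
def dap_url(dataset_url, response="dods", variable=None, slices=()):
    """Build a DAP2-style request URL.

    response: "dds" (structure), "das" (attributes) or "dods" (binary data).
    slices:   one (start, stride, stop) tuple per dimension; stop is inclusive.
    """
    url = f"{dataset_url}.{response}"
    if variable is None:
        return url
    hyperslab = "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in slices)
    return f"{url}?{variable}{hyperslab}"


# Hypothetical dataset: first 12 time steps of a 64 x 128 surface field.
base = "http://data.example.org/dods/ccsm/atm.nc"
print(dap_url(base, "dds"))
print(dap_url(base, "dods", "TS", [(0, 1, 11), (0, 1, 63), (0, 1, 127)]))
```

Only the requested hyperslab crosses the network, which is the point of the "move data a minimal amount" strategy.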

NCAR ESG: NcML Core Schema
- For XML encoding of metadata (and data) of any generic netCDF file
- Objects: netCDF, dimension, variable, attribute
- Beta version reference implementation as Java Library ( )
[Figure: schema diagram with elements netCDF, nc:netCDFType, nc:dimension, nc:variable, nc:attribute, nc:values, nc:VariableType]
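
To show what an XML encoding of netCDF metadata along these lines might look like, the sketch below builds a small document from the four object types named on the slide. The namespace URI is a placeholder, and the XML attribute names (`uri`, `name`, `length`, `type`, `shape`, `value`) are assumptions for illustration; the actual NcML schema should be consulted for the real ones.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ncml"  # placeholder namespace, not the real NcML URI
ET.register_namespace("nc", NS)


def q(tag):
    """Qualify a tag name with the placeholder namespace."""
    return f"{{{NS}}}{tag}"


# One file, one dimension, one variable carrying one attribute.
root = ET.Element(q("netCDF"), {"uri": "atm_0001.nc"})
ET.SubElement(root, q("dimension"), {"name": "time", "length": "12"})
var = ET.SubElement(root, q("variable"), {"name": "TS", "type": "float", "shape": "time"})
ET.SubElement(var, q("attribute"), {"name": "units", "type": "string", "value": "K"})

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because the document describes only structure and attributes, it can be harvested into a metadata catalog without reading the (much larger) data arrays.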

NCAR [Figure: metadata class diagram; classes with attribute cardinalities, relationship labels, and legend listed below]
- Object: [1] id
- Activity: [0,1] name, [0,1] description, [0,1] rights, [0,n] date type=, [0,n] note, [0,n] participant role=, [0,n] reference uri=
- Investigation
- Project: [0,n] topic type=, [0,1] funding
- Ensemble
- Campaign
- Simulation: [0,n] simulationInput type=, [0,n] simulationHardware
- Observation
- Experiment
- Analysis
- Dataset: [0,1] type, [0,1] conventions, [0,n] date type=, [0,n] format type= uri=, [0,1] timeCoverage, [0,1] spaceCoverage
- Person: [0,1] firstName, [0,1] lastName, [0,1] contact
- Institution: [0,1] name, [0,1] type, [0,1] contact
- Service: [0,1] name, [0,1] description
- Relationship labels: isA, isPartOf, hasParent, hasChild, hasSibling, generatedBy, worksFor, participant role=, serviceId
- Legend: Class, AbstractClass, inheritance, association
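
A subset of this class model can be sketched with Python dataclasses. The attribute names and cardinalities come from the diagram labels ([0,1] becomes an optional field, [0,n] a list). The inheritance chain (Project and Simulation derive from Activity, which derives from Object) and the placement of the `generated_by` and `part_of` links are assumptions, since the flattened diagram does not show which boxes the arrows connect.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetadataObject:  # "Object" on the slide: [1] id
    id: str


@dataclass
class Activity(MetadataObject):  # assumed: Activity isA Object
    name: Optional[str] = None
    description: Optional[str] = None
    rights: Optional[str] = None
    notes: List[str] = field(default_factory=list)


@dataclass
class Project(Activity):  # assumed: Project isA Activity
    topics: List[str] = field(default_factory=list)
    funding: Optional[str] = None


@dataclass
class Simulation(Activity):  # assumed: Simulation isA Activity
    simulation_inputs: List[str] = field(default_factory=list)
    simulation_hardware: List[str] = field(default_factory=list)
    part_of: Optional[str] = None  # assumed target of isPartOf: a parent activity id


@dataclass
class Dataset(MetadataObject):
    type: Optional[str] = None
    conventions: Optional[str] = None
    time_coverage: Optional[str] = None
    space_coverage: Optional[str] = None
    generated_by: Optional[str] = None  # assumed target of generatedBy: an activity id


project = Project(id="proj-1", name="Example project", topics=["climate"])
run = Simulation(id="sim-1", name="Example run", part_of=project.id)
output = Dataset(id="ds-1", type="model output", generated_by=run.id)
print(output)
```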

NCAR ESG Current Topology
[Figure: deployment diagram; labels listed below]
- Sites: LBNL, ISI, LLNL, NCAR, ORNL, ANL
- Components: ESG web portal (Tomcat/Struts), OGSA-DAI, MySQL RDBMS, RLI, LRC, HRM, MSS, HPSS, DISK, gridFTP server, MyProxy, GRAM gatekeeper, LAS server, CAS
- Labeled interactions: cross-update, gridFTP, query, authenticate, submit, execute, visualize

NCAR Data -> Knowledge
- Mass Storage System (1.3 PB)
- Petascale Knowledge Repository
Establish new paradigms for managing and accessing scientific data based on semantic organization.

NCAR Collaborations & Relationships
- CCSM Data Management Group
- The Globus Project
- Other SciDAC Projects: Climate, Security & Policy for Group Collaboration, Scientific Data Management ISIC, & High-performance DataGrid Toolkit
- OPeNDAP/DODS (multi-agency)
- NSF National Science Digital Libraries Program (UCAR & Unidata THREDDS Project)
- U.K. e-Science and British Atmospheric Data Center
- NOAA NOMADS and CEOS-grid
- Earth Science Portal group (multi-agency, international)
- ESMF (emerging)

NCAR NCAR Command Language (NCL)

NCAR NCL: Core
- Approx. 500 built-in functions and procedures
  - File I/O & data model for Earth sciences
  - Unique grids, Climate-modeling routines
  - Spherical harmonics, Regridding and interpolation
  - Graphics (wind barbs, simple 3D plots)
- 36 NCL core visual representations
  - Contours, XY plots, vectors, streamlines, maps, histograms, text, markers, polygons
- Supported on Unix, Linux, Mac, and PC
10 years, 20 people involved with development, 50 person-years of effort, about 1.5 million lines of source, 500K lines of documentation

NCAR NCL as CI for a Community
- CAM & CCSM Processor: 100 functions, 200 examples, 20K lines of NCL code (CGD)
- WGNE Climate Diagnostics Processor: 10K lines of NCL code (CGD)
- Award-winning Aviation Weather Site (RAP)
- MM5 Analysis Package (RIP)
- Weather Research & Forecast Model: Initial community analysis software and RIP
- Community Data Portal (SCD)

NCAR NCL

Collaborative Environments and the AccessGrid
- Science Portals + AccessGrid: University of Michigan (Knoop, Hardin)
- Vegetation & Ecosystem Mapping Program (VEMAP)
- NCAR/SCD VETS/KEG
- Argonne National Labs

NCAR END