RAMADDA for Big Climate Data Don Murray NOAA/ESRL/PSD and CU-CIRES Boulder/Denver Big Data Meetup - June 18, 2014.

Presentation transcript:

RAMADDA for Big Climate Data Don Murray NOAA/ESRL/PSD and CU-CIRES Boulder/Denver Big Data Meetup - June 18, 2014

Outline
The Problem Space
The Data Space
The RAMADDA Solution
How should we deal with complex calculations?

The Problem Space
Climate Attribution
–What caused the 2013 Colorado flood?
–What is causing the California drought?
–Has global warming stopped?
What do the observations say?
Can climate models give us insight into the statistical nature of these events?

The Data Space
Observations
–National Climatic Data Center (NCDC) collects data from worldwide observing sites
  Temperature (30-40K stations), Precipitation (75K stations), 1901-present, 90K files
  Problem: Different stations have different recording periods and gaps in the record
Reanalyses
–Model reconstructions from observations
–Help fill in the gaps, but are not observations

The Data Space
Climate model simulations
–Climate models are used to test the impact of external forcing on the atmosphere (experiments)
  Greenhouse gases, sea surface temperature, Arctic sea ice
–Multiple runs using the same inputs with slight perturbations of the initial conditions
  Ensembles provide useful statistics (mean, variance)
–Multiple models using the same experiment
  Ensemble of ensembles
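To make "ensembles provide useful statistics" concrete, here is a minimal sketch of the per-time-step ensemble mean and variance. It assumes each member is an equal-length time series at one grid point; the function name `ensemble_stats` and the sample numbers are illustrative, not taken from the talk.

```python
from statistics import fmean, variance


def ensemble_stats(members):
    """Return (means, variances) across members for each time step.

    members: list of equal-length lists, one time series per ensemble member.
    The variance is the sample variance (n - 1 denominator).
    """
    n_steps = len(members[0])
    if any(len(m) != n_steps for m in members):
        raise ValueError("all members must cover the same time steps")
    means, variances = [], []
    # zip(*members) regroups the data so each tuple holds one time step
    # as seen by every member.
    for values in zip(*members):
        means.append(fmean(values))
        variances.append(variance(values))
    return means, variances


# Three hypothetical members, two time steps each.
means, variances = ensemble_stats([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

The same regrouping applies one level up for an "ensemble of ensembles": compute per-model statistics first, then combine across models.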

The Data Space
PSD Climate Model Output
–Experiments are run over a period of time (e.g. present, 1880-present)
–Global models at 0.75 to 1.25 degree resolution
  27 levels
  K points/parameter/level/time step/ensemble
  Problem: Different domains (-180 to 180, 0 to 360)
–Model’s internal calculations vary (5 mins to hours)
  Output data for each 6 hour time step (00, 06, 12, 18)
  Post processing produces daily and monthly averages
–Output format is netCDF (in an ideal world)
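The domain mismatch (-180 to 180 versus 0 to 360) is usually handled by normalizing longitudes before comparing grids. The sketch below assumes a single row of values on a regular longitude axis; the function names are illustrative.

```python
def to_0_360(lon):
    """Map any longitude in degrees onto the [0, 360) convention."""
    return lon % 360.0


def to_pm180(lon):
    """Map any longitude in degrees onto the [-180, 180) convention."""
    return ((lon + 180.0) % 360.0) - 180.0


def reorder_to_pm180(lons, values):
    """Convert a 0..360 longitude axis to -180..180 and re-sort the data.

    lons and values are parallel lists for one latitude row; the returned
    lists are ordered west to east in the new convention.
    """
    pairs = sorted(
        ((to_pm180(lon), value) for lon, value in zip(lons, values)),
        key=lambda pair: pair[0],
    )
    return [p[0] for p in pairs], [p[1] for p in pairs]


new_lons, new_values = reorder_to_pm180([0.0, 90.0, 180.0, 270.0], [1, 2, 3, 4])
```

Note that the values must be reordered together with the axis; relabeling the longitudes alone would leave the data columns in the wrong place.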

The Data Space
Ensemble size from 10 to 50 members
–Even larger in other cases
Multiple parameters calculated
–Temperature, precipitation, wind, humidity, etc.
–Problem: Each model has different variable names and units
Each experiment can take weeks to months to complete on a supercomputer.
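One way to cope with per-model variable names and units is a lookup table that maps each (model, variable) pair to a common name, a common unit, and a conversion. Everything in the table below (the model labels, the source variable names, the common names) is hypothetical and only illustrates the idea; the two conversions themselves are standard (degrees Celsius to kelvin, and kg m-2 s-1 of water to mm/day).

```python
# Hypothetical mapping: (model, source name) -> (common name, unit, converter).
COMMON_VARIABLES = {
    ("model_a", "temp_K"): ("air_temperature", "K", lambda v: v),
    ("model_b", "t2m_degC"): ("air_temperature", "K", lambda v: v + 273.15),
    # 1 kg of water per square metre is a 1 mm layer; 86400 seconds per day.
    ("model_b", "precip_flux"): ("precipitation", "mm/day", lambda v: v * 86400.0),
}


def harmonize(model, name, values):
    """Return (common_name, unit, converted_values) for one variable."""
    try:
        common_name, unit, convert = COMMON_VARIABLES[(model, name)]
    except KeyError:
        raise KeyError(f"no mapping for variable {name!r} of {model!r}") from None
    return common_name, unit, [convert(v) for v in values]


name, unit, data = harmonize("model_b", "t2m_degC", [0.0, 25.0])
```

Failing loudly on an unmapped variable is deliberate: silently passing through an unknown unit is how mixed-unit averages end up in a result.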

The Data Space
At NOAA/ESRL/PSD we run multiple models with multiple ensembles for multiple experiments
Need to provide web-based access and analysis capabilities

The Data Problem
1 model, 20 ensembles, 34 years: ~10 TB data, 14K files, multiple parameters/file
Post processing
–Separate by parameter
–Daily/monthly averages, merge files
–Convert to common names/units
End result for 1 model/experiment
–Monthly data: ~0.5 TB, 700 files
–Daily data: ~7.5 TB, 13.5K files
Times 2 models x 6 experiments
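The "times 2 models x 6 experiments" multiplier is easy to work through. The sketch below uses only the per-model/experiment figures quoted on the slide and assumes every model/experiment pair is about the same size, which the slide implies but does not state.

```python
def scaled_holdings(n_models=2, n_experiments=6):
    """Scale the per-model/experiment post-processed sizes to the full archive."""
    per_run = {
        "monthly": {"tb": 0.5, "files": 700},
        "daily": {"tb": 7.5, "files": 13_500},
    }
    n_runs = n_models * n_experiments
    totals = {
        kind: {"tb": size["tb"] * n_runs, "files": size["files"] * n_runs}
        for kind, size in per_run.items()
    }
    totals["all"] = {
        "tb": sum(t["tb"] for t in totals.values()),
        "files": sum(t["files"] for t in totals.values()),
    }
    return totals


totals = scaled_holdings()
# With 12 model/experiment pairs: about 6 TB monthly, 90 TB daily,
# roughly 96 TB and about 170K files of post-processed data overall.
```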

The RAMADDA Solution
NOAA’s Facility for Climate Assessments (FACTS)
–Web based access to climate model runs and reanalyses
–Provides on-line analysis
–Download raw data
PSD Climate Data Repository
–Access other data holdings
–Publishing platform for visualization bundles, images and climate assessments

The RAMADDA Solution
Ingest the metadata
–Use harvester for automatic metadata ingestion
–For some datasets, use Entry XML specification
Organize the data
–Use collections to partition the data (monthly vs. daily)
–Database searches make finding the data easy
Data Processing Framework
–Loosely based on Open Geospatial Consortium (OGC) Web Processing Service (WPS)
–Fairly simple calculations: areal/temporal subsetting/averaging
–Use community accepted tools for analysis and plotting (Climate Data Operators, NCAR Command Language)
  Other tools could be plugged in (e.g., R)
–Currently synchronous, looking at batch processing
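As a rough picture of what "areal/temporal subsetting/averaging" means, here is a pure-Python sketch of a latitude/longitude box average followed by a time average. It is not how RAMADDA or the Climate Data Operators implement these steps; it assumes a regular grid and uses cos(latitude) weights as a common approximation to grid-cell area. All names are illustrative.

```python
import math


def box_mean(lats, lons, field, lat_range, lon_range):
    """Area-weighted mean of field[i][j] (at lats[i], lons[j]) inside a box."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    weighted_sum = 0.0
    weight_total = 0.0
    for i, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        # Cells shrink toward the poles, so weight each row by cos(latitude).
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if lon_min <= lon <= lon_max:
                weighted_sum += weight * field[i][j]
                weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no grid points fall inside the requested box")
    return weighted_sum / weight_total


def time_mean(series):
    """Plain average of a sequence of per-time-step values."""
    return sum(series) / len(series)


lats, lons = [0.0, 60.0], [0.0, 1.0]
steps = [[[1.0, 1.0], [4.0, 4.0]], [[3.0, 3.0], [6.0, 6.0]]]
areal = [box_mean(lats, lons, f, (-90.0, 90.0), (0.0, 360.0)) for f in steps]
overall = time_mean(areal)
```

Doing the areal reduction first keeps only one number per time step in memory, which is why these "fairly simple" calculations can run synchronously.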

The RAMADDA Solution
Demo/Examples

Complex calculations
Question: How are extremes behaving during the hiatus?
–Look at 27 standard extreme indices (e.g., frost free days, number of days that max temp exceeds the 90th percentile, etc.)
Finding 99th percentile precipitation in the ensemble space requires reading all members for all times for all points.
5 models/> 100 ensembles/multiple experiments = Big Data
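The reason a percentile "requires reading all members" is that percentiles do not combine the way means do. The sketch below computes a nearest-rank percentile over the pooled values of every member at one grid point and contrasts it with averaging per-member percentiles. The nearest-rank definition is one of several in use, and the data are made up for illustration.

```python
import math


def nearest_rank_percentile(values, q):
    """Nearest-rank percentile (0 < q <= 100) of a non-empty sequence."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def pooled_percentile(members, q):
    """Percentile over all members and all times at a single grid point."""
    pooled = [value for member in members for value in member]
    return nearest_rank_percentile(pooled, q)


# Two hypothetical members with very different ranges.
members = [list(range(1, 51)), list(range(51, 101))]
pooled_p99 = pooled_percentile(members, 99)
averaged_p99 = sum(nearest_rank_percentile(m, 99) for m in members) / len(members)
# pooled_p99 and averaged_p99 disagree, so the shortcut is not valid.
```

Because the pooled list has to be sorted (or at least partially ordered), the full ensemble for a grid point must be available at once, and this repeats for every grid point.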

Complex calculations
Tools used now
–FORTRAN, R, Python
Data has to be looked at as a cohesive unit for statistical calculations, but may be in many files.
Problems
–Getting all the data into memory
–System reliability
Could standard Big Data processes be applied?
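One possible answer, sketched here as an assumption rather than anything the talk proposes, is to reduce each file to a small summary that can be merged later, in the style of a map/reduce job. The example keeps (count, mean, sum of squared deviations) per chunk and merges them with the standard pairwise update for mean and variance, so no step needs all the data in memory. The chunks stand in for values read from separate files.

```python
from functools import reduce


def summarize(values):
    """Map step: reduce one non-empty chunk to (count, mean, m2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)  # sum of squared deviations
    return n, mean, m2


def merge(a, b):
    """Reduce step: combine two summaries without revisiting the raw data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


chunks = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]]  # stand-ins for per-file data
count, mean, m2 = reduce(merge, (summarize(c) for c in chunks))
sample_variance = m2 / (count - 1)
```

This works for means and variances because their summaries merge exactly. Percentile-based extreme indices do not merge this way, so they would need a different strategy, such as partitioning by grid point instead of by file.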

Links
NOAA/ESRL/PSD Climate Data Repository –
Facility for Climate Assessments (FACTS) –
RAMADDA –