Friday 27 March 2009. PRODIGUER: a distribution node for CMIP5 GIEC/IPCC data. Sébastien Denvil, Pôle de Modélisation, IPSL.

1 PRODIGUER: a distribution node for CMIP5 GIEC/IPCC data. Sébastien Denvil, Pôle de Modélisation, IPSL

2 Context: countdown to the GIEC/IPCC report
- End of 2009 → Fall 2010: Climate simulations
- End of 2010 → ?: Data distribution
- End of 2010 → Early 2012: Scientific publications
- Early 2013: Publication of the GIEC/IPCC AR5 (Assessment Report #5)
- October 2013: Nobel Prize

3 Context: national and European projects
- PRODIGUER: project submitted in September 2008 to the GIS Climat
- In the wake of IS-ENES (FP7), Virtual Earth System Modeling Resource Centre and metadata standard, and of Metafor (FP7), metadata standard for climate modeling
- Implementation of these tools at the national level and integration into the international effort
- Must be done in close collaboration with national computing centres

4 ESG/CMIP5 timeline
2008: Design and implement core functionality:
- Browse and search
- Registration
- Single sign-on / security
- Publication
- Distributed metadata
- Server-side processing
Early 2009: Testbed
- By early 2009 it is expected to include at least seven centres in the US, Europe and Japan:
  - Program for Climate Model Diagnosis and Intercomparison - PCMDI (U.S.)
  - National Center for Atmospheric Research - NCAR (U.S.)
  - Geophysical Fluid Dynamics Laboratory - GFDL (U.S.)
  - Oak Ridge National Laboratory - ORNL (U.S.)
  - British Atmospheric Data Centre - BADC (U.K.)
  - Max Planck Institute for Meteorology - MPI (Germany)
  - The University of Tokyo Center for Climate System Research (Japan)
2009: Deal with system integration issues and develop the production system.
- By summer 2009, the hardware and software requirements will be provided to centres that want to be nodes.
2010: Modelling centres publish data
2011-2012: Research and journal article submissions
2013: IPCC report

5 AR5 open issues
- What set of runs is to be done and, derived from that, what data volumes can we expect?
- Expected participants: where will the data be hosted? Who is going to step up and host the data nodes, and provide the expected level of support in terms of manpower and hardware capability? This includes the minimum software and hardware requirements for a data-holding site (e.g. FTP access and ESG authentication and authorization) and a help desk with skilled staff.
- The AR5 archive is to be globally distributed, with support for WG1, WG2, and WG3. Will there be a need for a central (or core) archive, and what will it look like?
- Replication of holdings: disaster protection, a desire to have a replica of the core data archive on every continent, etc.
- Number of users and level of access: scientists, policy makers, economists, health officials, etc.

8 Orders of magnitude
Climate models, centennial runs. Resolutions used:
- Atmosphere 2.5° (280 km): 144 x 143 x 39
- Ocean 2° (220 km): 180 x 149 x 31
Output volume per simulation:
- Atm 2.5° - Ocean 2°: 20 GB/y, 300 years → 5.85 TB
- Atm 1.0° - Ocean 2°: 60 GB/y, 300 years → 17.5 TB
- Atm 0.5° - Ocean 0.5°: 400 GB/y, 30 years → 11.75 TB
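The per-simulation volumes on this slide can be reproduced with a short back-of-envelope calculation. The sketch below is illustrative only: it assumes 1 TB = 1024 GB (the slides do not state the convention), and the function names are hypothetical. Under that assumption the results come out within about 1% of the figures quoted on the slide.

```python
# Back-of-envelope check of the figures quoted on the slide.
# Assumption (not stated in the slides): 1 TB = 1024 GB.
GB_PER_TB = 1024

# (label, output rate in GB per simulated year, simulated years), from the slide
CONFIGS = [
    ("Atm 2.5° / Ocean 2°", 20, 300),
    ("Atm 1.0° / Ocean 2°", 60, 300),
    ("Atm 0.5° / Ocean 0.5°", 400, 30),
]


def run_volume_tb(gb_per_year, years):
    """Total output of one simulation, in TB."""
    return gb_per_year * years / GB_PER_TB


def grid_points(nx, ny, nz):
    """Number of cells in an nx x ny x nz model grid."""
    return nx * ny * nz


if __name__ == "__main__":
    for label, rate, years in CONFIGS:
        print(f"{label}: {run_volume_tb(rate, years):.2f} TB")
    # Grid sizes quoted on the slide for the 2.5° atmosphere and 2° ocean
    print("Atmosphere cells:", grid_points(144, 143, 39))
    print("Ocean cells:", grid_points(180, 149, 31))
```

Note that the 0.5° configuration produces a volume comparable to the coarser ones only because it is run for 30 years rather than 300.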

9 Global data amount
- Raw data amount, low bound → 565 TB
- Raw data amount, high bound → 1000 TB
- CMIP5 distribution (25-50%) → 140-280 TB (low bound), 250-500 TB (high bound)
- Global storage (raw + distributed) → 700-1500 TB
LMDz 0.5° (50 km)
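The storage ranges above follow from simple arithmetic on the raw-data bounds. The sketch below is one plausible reading, not the author's stated method: it assumes the distributed share is 25-50% of the raw volume, and that global storage is the raw volume plus the distributed copy, pairing the low raw bound with the 25% share and the high raw bound with the 50% share. The names are hypothetical.

```python
# Raw-data bounds and distributed fractions, as quoted on the slide.
RAW_LOW_TB, RAW_HIGH_TB = 565, 1000
DIST_MIN, DIST_MAX = 0.25, 0.50


def distributed_range(raw_tb):
    """Volume published through CMIP5 for a given raw volume, in TB (min, max)."""
    return raw_tb * DIST_MIN, raw_tb * DIST_MAX


def global_storage_range():
    """Raw + distributed storage, in TB.

    Assumption: the low end pairs the low raw bound with the 25% share,
    the high end pairs the high raw bound with the 50% share.
    """
    low = RAW_LOW_TB + distributed_range(RAW_LOW_TB)[0]
    high = RAW_HIGH_TB + distributed_range(RAW_HIGH_TB)[1]
    return low, high


if __name__ == "__main__":
    print("Distributed (low raw bound): ", distributed_range(RAW_LOW_TB))
    print("Distributed (high raw bound):", distributed_range(RAW_HIGH_TB))
    print("Global storage:", global_storage_range())
```

Under these assumptions the results round to the 140-280 TB, 250-500 TB, and 700-1500 TB ranges shown on the slide.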

10 Data management to date
- Mainly centralised, stored on a SAN
- OPeNDAP access at the supercomputing centre
- Basic data retrieval system
- Access to raw data
- Security, authentication, and data access restrictions: not an issue
- No on-demand post-processing
- No metadata integration
- No support for high-level database queries

11 Data management with Prodiguer
- Move the data as little as possible; keep them close to the supercomputing centres if possible → data access protocol, strong links with computing centres
- When data need to be moved, do it quickly and with a minimum of human intervention → management of storage resources, fast network
- Keep track of what we have, particularly what is on deep storage → metadata and data catalogues
- Exploit a federation of sites → grid middleware → data grid?

