SCD Research Data Archives; Availability Through the CDP About 500 distinct datasets, 12 TB Diverse in type, size, and format Serving 900 different investigators.

1 SCD Research Data Archives; Availability Through the CDP
About 500 distinct datasets, 12 TB
Diverse in type, size, and format
Serving 900 different investigators per year
–Atmospheric reanalyses
–Ocean analyses
–Atmospheric and ocean observations
–Climate model output
–Land topography, ocean bathymetry
–River flow data
–Weather center operational analyses
–Programmatic collections, GCIP, TOGA/COARE, etc.
–Gridded products: slp, precip., climate indices, etc.
–Land surface characteristics, soils, etc.

2 Enhanced Service through the CDP
What data is best for the CDP?
–Datasets that are needed by the largest group of scientists.
–Datasets that are typically large (tens of Gigabytes) and from which spatial, temporal, and parameter subsets are normally preferred.
–Other relevant datasets that are often required to support research using the datasets defined above.
Global Atmospheric Reanalyses

3 CDP project, NCEP Reanalysis-2
About Reanalysis-2
–Proper full name: NCEP/DOE AMIP-II Reanalysis
–Experimental follow-on to the popular NCEP/NCAR Global Atmospheric Reanalysis
–For the CDP we have chosen one popular product, the “Pressure stack”: global 2.5°, 7 variables on 17 pressure levels, 4x daily, and a few surface-only grids. There are other products, e.g. surface flux fields, climatologies
–Using a one-year sample for the CDP study: 1460 files, 2.2 Gbytes
–We have data for 1979-1999, continuing. Total pressure stack data is 45 Gbytes, and growing
Data provided by M. Kanamitsu, NCEP
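The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope check: it assumes a 144 x 73 point grid for the 2.5° global product and one file per analysis time, neither of which is stated on the slide (the 1460 file count is consistent with 4 analyses per day over a 365-day year). The surface-only grids are ignored.

```python
# Back-of-envelope check of the Reanalysis-2 pressure-stack volumes.
# Assumptions (not stated on the slide): a 144 x 73 point grid for the
# 2.5 degree global product, and one file per analysis time.

ANALYSES_PER_DAY = 4
DAYS_PER_YEAR = 365            # non-leap sample year
VARIABLES = 7
PRESSURE_LEVELS = 17
NLON, NLAT = 144, 73           # assumed grid, both poles included

files_per_year = ANALYSES_PER_DAY * DAYS_PER_YEAR
grids_per_file = VARIABLES * PRESSURE_LEVELS       # surface-only grids ignored
values_per_year = files_per_year * grids_per_file * NLON * NLAT

sample_bytes = 2.2e9           # one-year sample size quoted on the slide
years = 1999 - 1979 + 1        # 1979 through 1999 inclusive

print(files_per_year)                              # 1460
print(sample_bytes / files_per_year / 1e6)         # average Mbytes per file
print(sample_bytes / values_per_year)              # average stored bytes per grid value
print(round(years * sample_bytes / 1e9, 1))        # 46.2 (Gbytes for 1979-1999)
```

Under these assumptions the 21-year estimate lands close to the 45 Gbytes quoted on the slide, and the average file comes out at roughly 1.5 Mbytes.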

4 Successes and outlook
It works, we can do it!
–Access based on LAS, NCL (NCAR Command Language), and a local file system.
–The important key was NCL
  NCL can read many file formats (netCDF, GRIB, HDF)
  The native format produced at the weather centers (NCEP and ECMWF) is GRIB, a WMO standard.

5 Outlook
NCL can do much more!
–It is a powerful analysis tool
  50+ computational math functions
  10+ routines for scalar and vector regridding
  Many atmospheric-model-specific functions, Spherepack, etc.
–We control the development of NCL, so important functionality can be added
–Through NCL we could offer more analysis capability as part of the CDP

6 Outlook
Challenges
–How can we sensibly scale this system up to handle 100 Gigabyte datasets and multiple users?
  A certainty: users will request large subsets, and some will be orthogonal to whatever file structure is chosen
  Result: long computational run times and large output data files
  The requester may not know this in advance
  This type of unexpected result leads to unsatisfactory service
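To illustrate what "orthogonal to the file structure" means, the sketch below compares two requests against a store that keeps one file per analysis time. That layout, the 144 x 73 grid, and the `Request` type are assumptions made for illustration; the slides do not describe the actual layout beyond the file count.

```python
from dataclasses import dataclass

NLON, NLAT = 144, 73        # assumed 2.5 degree global grid
VARIABLES, LEVELS = 7, 17   # pressure-stack product from the slides


@dataclass
class Request:
    """Hypothetical subset request (illustrative only)."""
    time_steps: int    # number of analysis times requested
    variables: int     # out of 7
    levels: int        # out of 17
    grid_points: int   # horizontal points requested, out of NLON * NLAT


def files_opened(req: Request) -> int:
    # Assumed layout: one file per analysis time, so each time step costs one open.
    return req.time_steps


def fraction_of_each_file_used(req: Request) -> float:
    values_per_file = VARIABLES * LEVELS * NLON * NLAT
    return (req.variables * req.levels * req.grid_points) / values_per_file


# A one-year time series at a single point, level, and variable:
point_series = Request(time_steps=1460, variables=1, levels=1, grid_points=1)
# The full pressure stack for a single analysis time:
snapshot = Request(time_steps=1, variables=7, levels=17, grid_points=NLON * NLAT)

for name, req in [("point series", point_series), ("snapshot", snapshot)]:
    print(name, files_opened(req), fraction_of_each_file_used(req))
```

With this layout the point time series opens all 1460 files but uses less than one millionth of each, while the snapshot opens a single file and uses all of it. A layout organized by time series would reverse the two cases, which is why some requests will always cut across whatever structure is chosen.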

7 Outlook
–Enhancements to avoid unexpected results
  Construct algorithms to estimate the run time and output data volume.
  For large output files or long-running requests
  –offer delayed service through standard FTP procedures, e.g. write the data to an FTP server and notify the user when it is ready.
  Some requests will be too large for convenient FTP transfer.
  –In this case the requester should be referred to the SCD/DSS staff for assistance.
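The estimate-then-route idea on this slide could look roughly like the sketch below. Everything numeric here is a placeholder assumption: the 4 bytes per output value, the per-file processing time, and the three thresholds are invented for illustration and would need to be measured on the real system. The function names are hypothetical as well.

```python
# Hypothetical thresholds and costs; none of these numbers come from the slides.
DIRECT_LIMIT_BYTES = 100e6     # largest output returned immediately
FTP_LIMIT_BYTES = 2e9          # largest output considered convenient for FTP
DIRECT_LIMIT_SECONDS = 60      # longest run the user is asked to wait for


def estimate_request(time_steps, variables, levels, grid_points,
                     bytes_per_value=4, seconds_per_file=0.05):
    """Estimate output size and run time before doing any work.

    Assumes one file per time step and a fixed cost per file opened.
    """
    output_bytes = time_steps * variables * levels * grid_points * bytes_per_value
    run_seconds = time_steps * seconds_per_file
    return output_bytes, run_seconds


def choose_delivery(output_bytes, run_seconds):
    """Map the estimate onto the three service levels described on the slide."""
    if output_bytes > FTP_LIMIT_BYTES:
        return "refer to SCD/DSS staff"
    if output_bytes > DIRECT_LIMIT_BYTES or run_seconds > DIRECT_LIMIT_SECONDS:
        return "delayed FTP delivery"
    return "immediate delivery"


# One analysis time, full pressure stack on an assumed 144 x 73 grid.
print(choose_delivery(*estimate_request(1, 7, 17, 144 * 73)))
# One year at a single point: tiny output, but every file must be opened.
print(choose_delivery(*estimate_request(1460, 1, 1, 1)))
# One full year of the pressure stack.
print(choose_delivery(*estimate_request(1460, 7, 17, 144 * 73)))
```

The point of the design is that the estimate is computed before the request runs, so the requester can be told up front which service level applies instead of discovering it from a long wait or an unexpectedly large file.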

8 Outlook
–Need to enhance the interface to ensure complete metadata access
  A wealth of critical metadata
  –Model descriptions
  –Input data sources
  –Publications
  –Associated studies and derived datasets
  –Many related URLs
  Clear links throughout the CDP so users can find the metadata and get assistance, e.g. SCD/DSS information server.
–Need mechanisms to get user feedback

9 Outlook
–May need restriction and authentication procedures for some datasets
  Redistribution of some data is restricted, e.g. ECMWF analyses.
  With simple registration we are able to provide these data to UCAR members in North America.
  All others are excluded.
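The access rule stated on this slide is simple enough to express as a single check. The sketch below is illustrative only: the dataset label, the function name, and the boolean inputs are hypothetical, and how registration or membership would actually be verified is not described in the slides.

```python
# Hypothetical set of datasets with redistribution restrictions.
RESTRICTED_DATASETS = {"ECMWF analyses"}


def may_access(dataset: str, registered: bool,
               ucar_member: bool, in_north_america: bool) -> bool:
    """Return True if the requester may receive the dataset.

    Unrestricted datasets are open to everyone. Restricted ones require
    registration plus UCAR membership in North America, per the slide.
    """
    if dataset not in RESTRICTED_DATASETS:
        return True
    return registered and ucar_member and in_north_america


print(may_access("NCEP Reanalysis-2", False, False, False))   # True
print(may_access("ECMWF analyses", True, True, True))         # True
print(may_access("ECMWF analyses", True, False, True))        # False
```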

10 Wrap-up
We have encouraging results so far and will continue the development
Measure of success: user satisfaction!
Public availability at the CDP will be announced on the SCD URL: scd.ucar.edu
Reanalysis-2 is available now from the MSS or through the SCD/DSS, see dss.ucar.edu/datasets/ds091.0
Details about the model runs are at: wesley.wwb.noaa.gov/reanalysis2

