The Research Data Archive at NCAR. Doug Schuster and Steve Worley, NCAR.


1 The Research Data Archive at NCAR
Doug Schuster and Steve Worley, NCAR

2 Topic Outline
- Introduction/History
- Core Data Categories/Featured Datasets
- Archive Management/Tools
- New Supporting IT Infrastructure
- Future Possibilities
1/25/2011, AMS 2011

3 Introduction/History
- Data Support Section (founded 1965)
- Paper -> Punch Cards -> Tapes -> CDs/DVDs -> Hard Drives -> Network-Based Storage and Transfer
- KB of observations -> Terabytes of model-generated data (total archive volume over 600 TB)
- Weeks or months for a user to get data -> Users want data access now (over 7000 registered users)
- Pay for data -> Free and open access to all datasets that aren't subject to source restrictions

4 Introduction/History
How do we evolve to support the growing needs of data users and generators?
- Stay aware of current research uses
- Strengthen datasets supporting core research data categories
- Update archive management tools
- Rebuild/augment IT infrastructure
- Educate supporting staff

5 Core Data Categories
- Content to support atmospheric and geosciences research
- Some research examples:
  - Climate
  - Oceanographic
  - Hydrologic
  - Weather Prediction
  - Renewable Energy (Wind/Solar)

6 Core Data Categories
- Operational and Reanalysis model outputs
- Meteorological and Oceanographic Observations
- Remote Sensing Observations
- Topography/Bathymetry, Vegetation, Land Use

7 Featured Datasets: Platform Observations

Dataset Title | Coverage | Update Frequency
NCEP GDAS observations (PREPBUFR and NetCDF) | Global, 1999 - Present | Daily
RDA Upper Air Database | Global, 1920 - Present | Monthly
NCDC TD3200 U.S. Cooperative Summary of Day | U.S., 1890 - Present | Monthly
Unidata IDD GTS based observations (NetCDF) | Global, 2002 - Present | Daily
NCEP operational observations (ON-29 Format) | Global, 1975 - 2007 | Fixed
International Comprehensive Ocean-Atmosphere Data Set (ICOADS) | Global, 1662 - Present | Monthly

[Timeline: Global Platform Observations, 1662 to 2011]

8 Featured Datasets: Analysis and Forecast Model Data

Dataset Title | Coverage | Update Frequency
Thorpex Interactive Grand Global Ensemble (TIGGE) | Global, 2006 - Present | Hourly
Unidata IDD (GFS 0.5 deg, RUC 20 km, NAM 12 km) | Global and Regional, 2002 - Present | Daily
NCEP ETA/NAM (40 km) | North America, 1995 - Present | Monthly
ECMWF Operational Deterministic (1.25 x 1.25 deg) | Global, 1985 - Present | Bi-Yearly
NCEP GDAS Final Analysis (1 x 1 deg) | Global, 1999 - Present | Daily
NCEP OI Global SST (1 x 1 deg) | Global, 1981 - Present | Weekly
NOAA OI Global SST (0.25 x 0.25 deg) | Global, 1981 - Present | Monthly
Hadley Centre Global Sea Ice and SST | Global, 1850 - Present | Monthly

[Timeline: Analysis and Forecast Model Data, 1850 to 2011]

9 Featured Datasets: High Resolution Re-Analysis

Dataset Title | Coverage | Update Frequency
ERA-40 (T159) | Global, 1957 - 2002 | Static Set
ERA-Interim (N128 Gaussian) | Global, 1989 - Present | Yearly
JRA-25 (1.125 deg Gaussian) | Global, 1979 - Present | Yearly
NCEP/DOE (T62) | Global, 1979 - Present | Static Set
NCEP/NCAR (T62) | Global, 1948 - Present | Quarterly
NARR (32 x 32 km) | North America, 1979 - Present | Quarterly
CFSR (0.5 x 0.5 deg) | Global, 1979 - Present | Monthly
NOAA-CIRES 20th Century | Global, 1870 - 2008 | Static Set

[Timeline: High Resolution Re-Analysis, 1870 to 2011]

10 Archive Management
How can we support an archive that continuously grows in volume and complexity with a fixed number of supporting staff?

11 Archive Management
- Common Data Management Tools
- Functionality Requirements:
  - Scalable
  - Integrated: one call does all
  - Automatable

12 Archive Management
- Common Data Management Tools
- Task Completion Requirements:
  1. Data Acquisition
     - Get data (daily or irregularly)
  2. Data Archival
     - Archive to disk and tape
  3. Metadata Collection
     - Collect metadata
     - Update metadata databases
  4. Metadata Publishing
     - Update web server pages
     - Update internal metadata access points
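The four tasks above, combined with the "one call does all" requirement from the previous slide, can be sketched as a single entry point that chains the steps. This is only an illustrative sketch, not the RDA tooling: every name here (`ArchiveRun`, `acquire`, `archive`, `collect_metadata`, `publish`, `archive_dataset`) is hypothetical, "acquisition" is reduced to listing a local directory, the tape copy is not modeled, and the format is guessed from the file extension.

```python
import shutil
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ArchiveRun:
    """What one integrated archive call did (hypothetical record type)."""
    dataset: str
    archived: list = field(default_factory=list)
    metadata: list = field(default_factory=list)
    published: bool = False


def acquire(incoming: Path) -> list:
    # Step 1, data acquisition: here simply the files found in a local directory.
    return sorted(p for p in incoming.iterdir() if p.is_file())


def archive(files: list, disk: Path) -> list:
    # Step 2, data archival: copy to the disk area (a tape copy is not modeled).
    disk.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(f, disk / f.name)) for f in files]


def collect_metadata(files: list, dataset: str) -> list:
    # Step 3, metadata collection: name, dataset, location, format per file.
    # Assumption: the format is taken from the file extension.
    return [
        {"name": f.name, "dataset": dataset, "location": str(f),
         "format": f.suffix.lstrip(".") or "unknown"}
        for f in files
    ]


def publish(metadata: list, index_file: Path) -> bool:
    # Step 4, metadata publishing: write a plain-text file list.
    index_file.write_text("\n".join(m["name"] for m in metadata))
    return True


def archive_dataset(dataset: str, incoming: Path, disk: Path) -> ArchiveRun:
    # "One call does all": run the four steps in order.
    run = ArchiveRun(dataset)
    run.archived = archive(acquire(incoming), disk)
    run.metadata = collect_metadata(run.archived, dataset)
    run.published = publish(run.metadata, disk / "index.txt")
    return run


def demo() -> ArchiveRun:
    # Exercise the pipeline on two placeholder files in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        incoming = Path(tmp) / "incoming"
        incoming.mkdir()
        (incoming / "a.grb2").write_bytes(b"demo")
        (incoming / "b.nc").write_bytes(b"demo")
        return archive_dataset("example_dataset", incoming, Path(tmp) / "disk")
```

Because each step is a plain function, the same call can be scheduled (the "automatable" requirement) without changing the steps themselves.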

13 Integrated Archival Tools
[Diagram: incoming data types and the RDA/CISL servers]
- Model Generated Data: GRIB, NetCDF
- Obs Data: BUFR, ASCII, etc.
- Topography: Vector Image, Binary, etc.
- Remote Sensing Data: Binary
- RDA/CISL Servers

14 Integrated Archival Tools
[Diagram: archiving a model-generated data file with dsarch]
- Incoming data types: Model Generated Data (GRIB, NetCDF); Obs Data (BUFR, ASCII, etc.); Topography (Vector Image, Binary, etc.); Remote Sensing Data (Binary)
- Model Generated Data Files (GRIB-2)
- dsarch
- Storage: DISK, HPSS
- RDA Database: file attribute metadata (Name, Dataset, Location, Format)

15 Integrated Archival Tools
[Diagram: gathering metadata on the RDA/CISL servers]
- Model Generated File, GRIB-2 Format:
  - Temperature (Center, Date, Time, Level, Location)
  - Humidity (Center, Date, Time, Level, Location)
  - Vorticity (Center, Date, Time, Level, Location)
  - Visibility (Center, Date, Time, Level, Location)
  - Precip Rate (Center, Date, Time, Level, Location)
- Gather Metadata
- RDA DB:
  - File attribute metadata: Name, Dataset, Location, Format
  - File content metadata: T(C,D,T,L,L), RH(C,D,T,L,L), Vort(C,D,T,L,L), Vis(C,D,T,L,L), PcpR(C,D,T,L,L)
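The two kinds of metadata on this slide can be pictured as two record types: one per file, and one per parameter occurrence inside a file. The sketch below is an assumption about how such records might be shaped, not the RDA database schema. `FileAttributes`, `ContentRecord`, and `summarize` are hypothetical names, and the records are built by hand rather than decoded from a real GRIB-2 file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAttributes:
    """File attribute metadata: Name, Dataset, Location, Format."""
    name: str
    dataset: str
    location: str
    format: str


@dataclass(frozen=True)
class ContentRecord:
    """File content metadata: parameter (Center, Date, Time, Level, Location)."""
    parameter: str
    center: str
    date: str      # assumed ISO form, YYYY-MM-DD
    time: str      # assumed HH:MM
    level: str
    location: str


def summarize(records):
    """Per parameter: earliest and latest date/time seen, and the set of levels."""
    summary = defaultdict(lambda: {"start": None, "end": None, "levels": set()})
    for r in records:
        stamp = f"{r.date}T{r.time}"  # ISO strings sort chronologically
        entry = summary[r.parameter]
        entry["start"] = stamp if entry["start"] is None else min(entry["start"], stamp)
        entry["end"] = stamp if entry["end"] is None else max(entry["end"], stamp)
        entry["levels"].add(r.level)
    return dict(summary)


# Hand-made example records (all values are placeholders).
example_file = FileAttributes("example.grb2", "example_dataset", "disk", "GRIB-2")
example_records = [
    ContentRecord("Temperature", "NCEP", "2011-01-01", "00:00", "500 hPa", "global"),
    ContentRecord("Temperature", "NCEP", "2011-01-01", "06:00", "850 hPa", "global"),
    ContentRecord("Humidity", "NCEP", "2011-01-01", "00:00", "850 hPa", "global"),
]
example_summary = summarize(example_records)
```

A per-parameter summary like this is the kind of compact content metadata that a search tool can query without opening the data files.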

16 Integrated Archival Tools
[Diagram: services on the RDA/CISL servers and the RDA DB]
- RDA DB:
  - File attribute metadata: Name, Dataset, Location, Format
  - File content metadata: T(C,D,T,L,L), RH(C,D,T,L,L), Vort(C,D,T,L,L), Vis(C,D,T,L,L), PcpR(C,D,T,L,L)
- RDA Web Server:
  - Dynamic file lists
  - Data search tools
  - Detailed content metadata
  - Data subsetting interfaces
- CISL Computational Node:
  - Detailed metadata for files on disk
  - Data subsetting
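A dynamic file list of the kind named above can be produced by joining file attribute metadata with file content metadata. The following is a minimal sketch using an in-memory SQLite database. The table names, column names, and sample rows are all assumptions made for illustration and do not describe the actual RDA database.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    # Hypothetical two-table layout: one row per file, one row per parameter record.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE files (
            name TEXT PRIMARY KEY, dataset TEXT, location TEXT, format TEXT
        );
        CREATE TABLE content (
            file_name TEXT REFERENCES files(name),
            parameter TEXT, valid_date TEXT, level TEXT
        );
    """)
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?, ?)",
        [
            ("model_20110101.grb2", "example_dataset", "disk", "GRIB-2"),
            ("model_20110102.grb2", "example_dataset", "tape", "GRIB-2"),
        ],
    )
    conn.executemany(
        "INSERT INTO content VALUES (?, ?, ?, ?)",
        [
            ("model_20110101.grb2", "Temperature", "2011-01-01", "500 hPa"),
            ("model_20110101.grb2", "Humidity", "2011-01-01", "850 hPa"),
            ("model_20110102.grb2", "Temperature", "2011-01-02", "500 hPa"),
        ],
    )
    return conn


def find_files(conn: sqlite3.Connection, parameter: str, start: str, end: str) -> list:
    # File list for one parameter within an inclusive date range (ISO dates).
    rows = conn.execute(
        """
        SELECT DISTINCT f.name, f.location
        FROM files AS f
        JOIN content AS c ON c.file_name = f.name
        WHERE c.parameter = ? AND c.valid_date BETWEEN ? AND ?
        ORDER BY f.name
        """,
        (parameter, start, end),
    )
    return rows.fetchall()
```

For example, `find_files(build_demo_db(), "Temperature", "2011-01-01", "2011-01-02")` lists both sample files along with whether each sits on disk or tape in this made-up data.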

17 New Supporting IT/Infrastructure
- Online Disk Upgrades
  - Larger disk (450 TB)
  - Common disk interfaces (web server and compute nodes)
- Tape Archive Upgrades
  - High Performance Storage System (HPSS)
- Computing Power Upgrades
  - Additional and more powerful servers

18 New Supporting IT/Infrastructure
Complete User Community Pros:
- Fast access to online data.
- Access to all RDA metadata.
- Access to RDA data processing services.
Complete User Community Cons:
- Small fraction of RDA online.
- Slow access to offline data.
- Data processing requests take a long time to finish.
NCAR User Community Pros:
- Access to full RDA.
- Fast computing.
NCAR User Community Cons:
- No access to online data.
- Forced to use MSS as a file server: access is too slow.
- No direct access to RDA metadata.

19 New Supporting IT/Infrastructure
Complete User Community Improvements:
- Faster access to full RDA.
- Expanded data processing services available.
- Faster turnaround on data processing requests.
NCAR User Community Improvements:
- Faster access to full RDA.
- Direct access to all RDA metadata.

20 Future Possibilities
- Leverage New IT Infrastructure
  - Server-side parameter and spatial subsetting across multiple datasets
    - Model or in-situ observations
  - Data provided in multiple output formats
  - Web-services-based requests (REST, etc.)
- Addition of large and diverse datasets to the RDA.
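To make the first three possibilities concrete, here is a rough sketch of a REST-style request that selects parameters and a bounding box, with the result rendered as JSON or CSV. The URL layout, the `param` and `bbox` query keys, and the functions `parse_request`, `subset`, and `render` are all hypothetical and are not an RDA interface. The data points are invented placeholders.

```python
import csv
import io
import json
from urllib.parse import parse_qs, urlparse

FIELDS = ["param", "lat", "lon", "value"]


def parse_request(url: str):
    # Assumed query form: ?param=A&param=B&bbox=west,south,east,north
    query = parse_qs(urlparse(url).query)
    params = query.get("param", [])
    west, south, east, north = (float(v) for v in query["bbox"][0].split(","))
    return params, (west, south, east, north)


def subset(points: list, params: list, bbox: tuple) -> list:
    # Keep points whose parameter was requested and that fall inside the box.
    west, south, east, north = bbox
    return [
        p for p in points
        if p["param"] in params
        and west <= p["lon"] <= east
        and south <= p["lat"] <= north
    ]


def render(rows: list, fmt: str) -> str:
    # Two output formats as a stand-in for "multiple output formats".
    if fmt == "json":
        return json.dumps(rows)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


# Invented sample points; the same records could come from model or in-situ data.
POINTS = [
    {"param": "Temperature", "lat": 40.0, "lon": -105.0, "value": 271.3},
    {"param": "Temperature", "lat": 10.0, "lon": 20.0, "value": 299.8},
    {"param": "Humidity", "lat": 40.0, "lon": -105.0, "value": 45.0},
]

request_url = "https://example.org/subset?param=Temperature&bbox=-110,35,-100,45"
requested_params, requested_bbox = parse_request(request_url)
selected = subset(POINTS, requested_params, requested_bbox)
```

Doing the selection on the server means only the matching records travel over the network, which is the point of server-side subsetting for a large archive.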

21 http://dss.ucar.edu

