The Research Data Archive at NCAR Doug Schuster and Steve Worley NCAR.

Topic Outline
- Introduction/History
- Core Data Categories/Featured Datasets
- Archive Management/Tools
- New Supporting IT Infrastructure
- Future Possibilities
1/25/2011 AMS

Introduction/History
- Data Support Section (founded 1965)
- Paper -> Punch Cards -> Tapes -> CDs/DVDs -> Hard Drives -> Network-Based Storage and Transfer
- KB of observations -> Terabytes of model-generated data (total archive volume over 600 TB)
- Weeks or months for a user to get data -> Users want data access now (over 7000 registered users)
- Pay for data -> Free and open access to all datasets that aren't subject to source restrictions

Introduction/History
- How do we evolve to support the growing needs of data users and generators?
  - Stay aware of current research uses
  - Strengthen datasets supporting core research data categories
  - Update archive management tools
  - Rebuild/augment IT infrastructure
  - Educate supporting staff

Core Data Categories
- Content to support atmospheric and geosciences research
- Some research examples:
  - Climate
  - Oceanographic
  - Hydrologic
  - Weather Prediction
  - Renewable Energy (Wind/Solar)

Core Data Categories
- Operational and Reanalysis model outputs
- Meteorological and Oceanographic Observations
- Remote Sensing Observations
- Topography/Bathymetry, Vegetation, Land Use

Featured Datasets
Global Platform Observations

| Dataset Title | Coverage | Update Frequency |
|---|---|---|
| NCEP GDAS observations (PREPBUFR and NetCDF) | Global 1999 – Present | Daily |
| RDA Upper Air Database | Global 1920 – Present | Monthly |
| NCDC TD3200 U.S. Cooperative Summary of Day | U.S – Present | Monthly |
| Unidata IDD GTS based observations (NetCDF) | Global 2002 – Present | Daily |
| NCEP operational observations (ON-29 Format) | Global 1975 – 2007 | Fixed |
| International Comprehensive Ocean-Atmosphere Data Set (ICOADS) | Global 1662 – Present | Monthly |

Featured Datasets
Analysis and Forecast Model Data

| Dataset Title | Coverage | Update Frequency |
|---|---|---|
| Thorpex Interactive Grand Global Ensemble (TIGGE) | Global Present | Hourly |
| Unidata IDD (GFS 0.5deg, RUC 20km, NAM 12km) | Global and Regional Present | Daily |
| NCEP ETA/NAM (40km) | North America Present | Monthly |
| ECMWF Operational Deterministic (1.25 x 1.25 Deg) | Global Present | Bi-Yearly |
| NCEP GDAS Final Analysis (1x1 Deg) | Global Present | Daily |
| NCEP OI Global SST (1x1 Deg) | Global Present | Weekly |
| NOAA OI Global SST (0.25 x 0.25 Deg) | Global Present | Monthly |
| Hadley Centre Global Sea Ice and SST | Global Present | Monthly |

Featured Datasets
High Resolution Re-Analysis

| Dataset Title | Coverage | Update Frequency |
|---|---|---|
| ERA-40 (T159) | Global | Static Set |
| ERA-Interim (N128 Gaussian) | Global Present | Yearly |
| JRA-25 (1.125 Deg Gaussian) | Global 1979 – Present | Yearly |
| NCEP/DOE (T62) | Global Present | Static Set |
| NCEP/NCAR (T62) | Global Present | Quarterly |
| NARR (32 x 32 km) | North America Present | Quarterly |
| CFSR (0.5 x 0.5 Deg) | Global Present | Monthly |
| NOAA-CIRES 20th Century | Global 1870 – 2008 | Static Set |

Archive Management
How can we support an archive that continuously grows in volume and complexity with a fixed number of supporting staff?

Archive Management
- Common Data Management Tools
  - Functionality Requirements
    - Scalable
    - Integrated (one call does all)
    - Automatable

Archive Management
- Common Data Management Tools
  - Task Completion Requirements
    1. Data Acquisition
       - Get data (daily or irregularly)
    2. Data Archival
       - Archive to disk and tape
    3. Metadata Collection
       - Collect metadata
       - Update metadata databases
    4. Metadata Publishing
       - Update web server pages
       - Update internal metadata access points
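The four task completion requirements above, combined with the "one call does all" goal, can be pictured as a single entry point that runs every step in order. The following is only a minimal sketch under stated assumptions: `ArchiveJob`, `run_archive_job`, and the step functions are hypothetical names invented for illustration, and each step merely records that it ran instead of doing real transfers or database updates. It is not the RDA's actual tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ArchiveJob:
    """Hypothetical state carried through one archive run for a single file."""
    dataset: str
    filename: str
    completed: List[str] = field(default_factory=list)


# Each step is a stand-in: a real tool would fetch, copy, scan, or publish here.
def acquire_data(job: ArchiveJob) -> None:
    job.completed.append("acquisition")


def archive_data(job: ArchiveJob) -> None:
    job.completed.append("archival")


def collect_metadata(job: ArchiveJob) -> None:
    job.completed.append("metadata collection")


def publish_metadata(job: ArchiveJob) -> None:
    job.completed.append("metadata publishing")


# The order mirrors the numbered requirements on the slide.
STEPS: List[Callable[[ArchiveJob], None]] = [
    acquire_data,
    archive_data,
    collect_metadata,
    publish_metadata,
]


def run_archive_job(dataset: str, filename: str) -> ArchiveJob:
    """Single entry point: one call runs every step in sequence."""
    job = ArchiveJob(dataset, filename)
    for step in STEPS:
        step(job)
    return job
```

Because the steps live in a plain list, a scheduler could call `run_archive_job` for each new file, which is one way to read the "automatable" requirement.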

Integrated Archival Tools
[Diagram: data sources feeding the RDA/CISL Servers]
- Model Generated Data: GRIB, NetCDF
- Obs Data: BUFR, ASCII, etc.
- Topography: Vector Image, Binary, etc.
- Remote Sensing Data: Binary

Integrated Archival Tools
[Diagram: a model-generated data file handled by dsarch]
- Data sources: Model Generated Data (GRIB, NetCDF); Obs Data (BUFR, ASCII, etc.); Topography (Vector Image, Binary, etc.); Remote Sensing Data (Binary)
- Example input: Model Generated Data Files, GRIB-2
- Diagram labels: dsarch, DISK, HPSS, RDA Database
- File attribute metadata: Name, Dataset, Location, Format
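To make the four file attribute fields (Name, Dataset, Location, Format) concrete, here is a small sketch that stores them in a relational table. It uses Python's standard `sqlite3` module with an in-memory database purely for illustration; the table name `files`, the function names, and the example values are assumptions, and nothing here reflects the actual schema or database engine behind the RDA Database.

```python
import sqlite3


def create_file_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical table holding one row of attributes per archived file."""
    conn.execute(
        "CREATE TABLE files ("
        "name TEXT PRIMARY KEY, "
        "dataset TEXT NOT NULL, "
        "location TEXT NOT NULL, "
        "format TEXT NOT NULL)"
    )


def register_file(conn: sqlite3.Connection, name: str, dataset: str,
                  location: str, fmt: str) -> None:
    """Record the four attributes named on the slide for a single file."""
    conn.execute(
        "INSERT INTO files (name, dataset, location, format) VALUES (?, ?, ?, ?)",
        (name, dataset, location, fmt),
    )


def files_in_dataset(conn: sqlite3.Connection, dataset: str) -> list:
    """Return (name, location, format) tuples for a dataset, sorted by file name."""
    cursor = conn.execute(
        "SELECT name, location, format FROM files WHERE dataset = ? ORDER BY name",
        (dataset,),
    )
    return cursor.fetchall()
```

A query such as `files_in_dataset(conn, "example_dataset")` is the kind of lookup that could back a dynamic file list, since it needs only attribute metadata and never opens the data files.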

Integrated Archival Tools
[Diagram: gathering metadata from a model-generated file (GRIB-2 format) on the RDA/CISL Servers into the RDA DB]
- Parameters found in the file, each described by (Center, Date, Time, Level, Location):
  - Temperature
  - Humidity
  - Vorticity
  - Visibility
  - Precip Rate
- File attribute metadata: Name, Dataset, Location, Format
- File content metadata: T(C,D,T,L,L), RH(C,D,T,L,L), Vort(C,D,T,L,L), Vis(C,D,T,L,L), PcpR(C,D,T,L,L)
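The file content metadata can be read as one record per parameter, keyed by center, date, time, level, and location. The sketch below models that idea and builds a parameter-to-files index so a search can answer "which files contain temperature?" without opening any data file. The class name `ContentRecord`, the helper functions, and all example values are hypothetical; how the RDA actually scans GRIB-2 files and stores the results is not described in the slides.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class ContentRecord:
    """One parameter occurrence inside a file: P(Center, Date, Time, Level, Location)."""
    parameter: str
    center: str
    date: str
    time: str
    level: str
    location: str
    filename: str


def build_parameter_index(records: Iterable[ContentRecord]) -> Dict[str, Set[str]]:
    """Map each parameter name to the set of files that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for record in records:
        index[record.parameter].add(record.filename)
    return index


def files_with_parameter(index: Dict[str, Set[str]], parameter: str) -> List[str]:
    """Return a sorted list of files containing the parameter (empty if unknown)."""
    return sorted(index.get(parameter, set()))
```

Using `index.get` rather than `index[parameter]` keeps lookups from adding empty entries to the `defaultdict`.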

Integrated Archival Tools
[Diagram: services on the RDA/CISL Servers that draw on the RDA DB]
- RDA Web Server
  - Dynamic file lists
  - Data search tools
  - Detailed content metadata
  - Data subsetting interfaces
- CISL Computational Node
  - Detailed metadata for files on disk
  - Data subsetting
- RDA DB
  - File attribute metadata: Name, Dataset, Location, Format
  - File content metadata: T(C,D,T,L,L), RH(C,D,T,L,L), Vort(C,D,T,L,L), Vis(C,D,T,L,L), PcpR(C,D,T,L,L)

New Supporting IT/Infrastructure
- Online Disk Upgrades
  - Larger disk (450 TB)
  - Common disk interfaces (web server and compute nodes)
- Tape Archive Upgrades
  - High Performance Storage System (HPSS)
- Computing Power Upgrades
  - Additional and more powerful servers

New Supporting IT/Infrastructure
- Complete User Community, Pros:
  - Fast access to online data
  - Access to all RDA metadata
  - Access to RDA data processing services
- Complete User Community, Cons:
  - Small fraction of RDA online
  - Slow access to offline data
  - Data processing requests take a long time to finish
- NCAR User Community, Pros:
  - Access to full RDA
  - Fast computing
- NCAR User Community, Cons:
  - No access to online data
  - Forced to use MSS as a file server: access is too slow
  - No direct access to RDA metadata

New Supporting IT/Infrastructure
- Complete User Community, Improvements:
  - Faster access to full RDA
  - Expanded data processing services available
  - Faster turnaround on data processing requests
- NCAR User Community, Improvements:
  - Faster access to full RDA
  - Direct access to all RDA metadata

Future Possibilities
- Leverage New IT Infrastructure
  - Server-side parameter and spatial subsetting across multiple datasets
    - Model or in-situ observations
  - Data provided in multiple output formats
  - Web-services-based requests (REST, etc.)
- Addition of large and diverse datasets to the RDA
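As a rough illustration of what parameter and spatial subsetting means for gridded data, the sketch below selects named parameters from a set of fields and trims each one to a latitude/longitude bounding box. It is an assumption-laden toy: the function names are hypothetical, grids are plain nested lists on a regular latitude/longitude mesh, and boxes that wrap across the dateline are not handled. A server-side service would operate on GRIB or NetCDF files, which the slides do not detail.

```python
from typing import Dict, List, Sequence, Tuple

Grid = List[List[float]]


def subset_grid(lats: Sequence[float], lons: Sequence[float], values: Grid,
                south: float, north: float, west: float, east: float
                ) -> Tuple[List[float], List[float], Grid]:
    """Keep only grid points inside the bounding box (bounds are inclusive)."""
    rows = [i for i, lat in enumerate(lats) if south <= lat <= north]
    cols = [j for j, lon in enumerate(lons) if west <= lon <= east]
    sub_values = [[values[i][j] for j in cols] for i in rows]
    return [lats[i] for i in rows], [lons[j] for j in cols], sub_values


def subset_fields(fields: Dict[str, Grid], parameters: Sequence[str],
                  lats: Sequence[float], lons: Sequence[float],
                  south: float, north: float, west: float, east: float
                  ) -> Dict[str, Grid]:
    """Parameter subsetting followed by spatial subsetting of each kept field."""
    result: Dict[str, Grid] = {}
    for name in parameters:
        if name in fields:  # silently skip parameters the input does not contain
            result[name] = subset_grid(lats, lons, fields[name],
                                       south, north, west, east)[2]
    return result
```

Doing this selection on the server means only the requested parameters and region travel over the network, which is the practical appeal of the subsetting services listed above.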
