The Research Data Archive at NCAR: A System Designed to Handle Diverse Datasets Bob Dattore and Steven Worley National Center for Atmospheric Research.

1 The Research Data Archive at NCAR: A System Designed to Handle Diverse Datasets Bob Dattore and Steven Worley National Center for Atmospheric Research Boulder, CO, USA

2 Outline
o Introduction
o RDA - Then
o RDA - Now
o User Services

3 Introduction
o Purpose - support climate & weather research at NCAR; services are extended worldwide as resources permit
o Observations, derived products; focus on historical atmosphere/ocean data
o Metrics
  - Established in the 1960s
  - 600+ datasets, 4M files, 400 TB
  - 6000 users annually

4 Introduction
o Changing data landscape
  - Then: small datasets, single country/experiment, specialized formats
  - Now: global coverage, high spatial/vertical resolutions, standard formats
o Result and challenge:
  - Lots of diversity
  - How can we provide uniform discovery and access?

5 Then

6
o Bottom line
  - Increasing data diversity and evolving technology made it difficult to develop good user services
  - Discovery limited to READMEs and directory names
  - Data access via personal communications
o Major limiting factor: insufficient metadata
  - No metadata standard or dictionaries
  - Collection not uniform across all datasets
  - Rigidly structured flat ASCII files
  - Archiving separate from metadata collection

7 Now

8 Now
o Developed local standard for discovery based on DIF [1] & THREDDS [2]; applied across all datasets
o Adopted GCMD [3] controlled vocabularies
  - Local enhancements; e.g. data formats
o Harvest two types of file metadata
  - External - file size, compression, ...
  - Internal - variables, levels, date range, ...
o Storage using XML

[1] Directory Interchange Format, NASA/GCMD [3]
[2] Thematic Realtime Environmental Distributed Data Services
[3] Global Change Master Directory
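As an illustration of what harvesting "external" file metadata could look like, here is a minimal Python sketch. It is not the RDA's actual tooling: the function name `external_metadata`, the returned field names, and the small table of magic-number prefixes are all assumptions made for this example. It records a file's size and guesses its compression from the leading bytes.

```python
import os

# Leading-byte signatures for a few common compression formats.
# This table is an assumption for the sketch, not an RDA list.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
}


def external_metadata(path):
    """Return size and detected compression for one file (hypothetical fields)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(8)  # enough bytes to cover the longest signature above

    compression = "none"
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            compression = name
            break

    return {
        "name": os.path.basename(path),
        "size_bytes": size,
        "compression": compression,
    }
```

Internal metadata (variables, levels, date range) depends on the data format and would need a format-specific reader, so it is left out of this sketch.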

9 Metadata Collection

10 Metadata Collection
o Tools that automatically capture file metadata
  - Integrated with archiving activities
o Web-based GUI - guided entry of dataset discovery metadata
  - Required fields, constrained entries

11 Relational Databases

12
o Fast access
o Discovery metadata
  - Single database, ~255K rows
o External file metadata
  - Single database, ~45M rows
  - Maintains dataset/data file relationships
o Internal file metadata
  - Four databases structured to handle diversity of data, ~920M rows
  - Supports accurate data discovery by establishing detailed parameter relationships
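The relationships described on this slide can be sketched with a toy relational schema. The following Python example uses an in-memory SQLite database. The table names, column names, and sample rows are all hypothetical and do not reflect the RDA's real schema, which the slides say spans several databases. The point is only to show how linking datasets, data files, and per-file parameters lets a search return the specific files holding a variable on a given date.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dataset (
            dataset_id TEXT PRIMARY KEY,
            title      TEXT NOT NULL
        );
        -- External file metadata: one row per file, tied to its dataset.
        CREATE TABLE data_file (
            file_id     INTEGER PRIMARY KEY,
            dataset_id  TEXT NOT NULL REFERENCES dataset(dataset_id),
            name        TEXT NOT NULL,
            size_bytes  INTEGER NOT NULL,
            compression TEXT
        );
        -- Internal file metadata: what each file contains.
        CREATE TABLE file_parameter (
            file_id    INTEGER NOT NULL REFERENCES data_file(file_id),
            variable   TEXT NOT NULL,
            level      TEXT,
            start_date TEXT NOT NULL,  -- ISO 8601, so string order = date order
            end_date   TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dataset VALUES (?, ?)",
        [("demo-a", "Example upper-air analyses"),
         ("demo-b", "Example ocean observations")],
    )
    conn.executemany(
        "INSERT INTO data_file VALUES (?, ?, ?, ?, ?)",
        [(1, "demo-a", "a_1995.dat", 1000, "gzip"),
         (2, "demo-a", "a_1996.dat", 1200, "gzip"),
         (3, "demo-b", "b_1995.dat", 800, None)],
    )
    conn.executemany(
        "INSERT INTO file_parameter VALUES (?, ?, ?, ?, ?)",
        [(1, "temperature", "500 hPa", "1995-01-01", "1995-12-31"),
         (2, "temperature", "500 hPa", "1996-01-01", "1996-12-31"),
         (3, "sea surface temperature", None, "1995-01-01", "1995-12-31")],
    )
    return conn


def files_with(conn, variable, date):
    """List (dataset_id, file name) pairs whose files cover `variable` on `date`."""
    return conn.execute(
        """
        SELECT d.dataset_id, f.name
        FROM file_parameter AS p
        JOIN data_file AS f ON f.file_id = p.file_id
        JOIN dataset   AS d ON d.dataset_id = f.dataset_id
        WHERE p.variable = ? AND p.start_date <= ? AND p.end_date >= ?
        ORDER BY f.name
        """,
        (variable, date, date),
    ).fetchall()
```

Calling `files_with(build_demo_db(), "temperature", "1995-06-15")` returns only the one matching file from the first demo dataset, which is the kind of file-level discovery that dataset-level metadata alone cannot support.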

13 User Services

14 User Services
o Data Discovery
  - Google-like dataset search
  - "Look For Data" interface - user-defined dataset catalogs
  - Auto-generated dataset pages - always up-to-date
  - Collections - all reanalyses, upper air obs, surface obs

15
o Data file access
  - Multiple methods; can vary by dataset; dynamic table shows available options
  - "Create Your Own List" for data file lists
      * Show specific files from terabyte-sized collections
      * Download aids
      * Detailed file content views

16 Metadata Sharing
o OAI-PMH
  - UCAR Community Data Portal (THREDDS)
  - Global Change Master Directory (DIF)
  - also Dublin Core, native
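For readers unfamiliar with OAI-PMH, a harvester pulls records by issuing HTTP requests with a `verb` and, for `ListRecords`, a `metadataPrefix`. The Python sketch below only builds such a request URL and makes no network call. The base URL is a placeholder, not the RDA's endpoint, and the helper name `list_records_url` is invented for this example. `oai_dc` is the Dublin Core prefix that the protocol requires every repository to support. The prefix a repository uses for DIF or its native format is repository-specific and would have to be looked up with a `ListMetadataFormats` request.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (helper name is hypothetical)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # optional: only records changed since this date
    if set_spec:
        params["set"] = set_spec    # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, for illustration only.
url = list_records_url("https://example.org/oai", "oai_dc", from_date="2009-01-01")
```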

17 Thank You!
  - Web: http://dss.ucar.edu
  - Email: dssweb@ucar.edu
  - Questions/comments?

