Published by Clement Fitzgerald. Modified over 8 years ago.
1
The Research Data Archive at NCAR: A System Designed to Handle Diverse Datasets
Bob Dattore and Steven Worley
National Center for Atmospheric Research, Boulder, CO, USA
2
Outline
o Introduction
o RDA - Then
o RDA - Now
o User Services
3
Introduction
o Purpose - support climate & weather research at NCAR; services are extended worldwide as resources permit
o Observations, derived products; focus on historical atmosphere/ocean data
o Metrics
    - Established in the 1960s
    - 600+ datasets, 4M files, 400 TB
    - 6000 users annually
4
Introduction
o Changing data landscape
    - Then – small datasets, single country/experiment, specialized formats
    - Now – global coverage, high spatial/vertical resolutions, standard formats
o Result and challenge:
    - Lots of diversity
    - How can we provide uniform discovery and access?
5
Then
6
o Bottom line
    - Increasing data diversity, evolving technology; difficult to develop good user services
    - Discovery limited to READMEs and directory names
    - Data access via personal communications
o Major limiting factor – insufficient metadata
    - No metadata standard or dictionaries
    - Collection not uniform across all datasets
    - Rigidly structured flat ASCII files
    - Archiving separate from metadata collection
7
Now
8
Now
o Developed local standard for discovery based on DIF [1] & THREDDS [2]; applied across all datasets
o Adopted GCMD [3] controlled vocabularies
    - Local enhancements; e.g. data formats
o Harvest two types of file metadata
    - External – file size, compression, …
    - Internal - variables, levels, date range, …
o Storage using XML

[1] Directory Interchange Format, NASA/GCMD [3]
[2] Thematic Real-time Environmental Distributed Data Services
[3] Global Change Master Directory
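The "external" harvesting step can be illustrated with a minimal sketch. This is not the RDA's actual tooling: the function names, the XML element names, and the choice to detect only gzip compression are all assumptions made for illustration. It gathers metadata that can be read without parsing the data format and serializes it as XML.

```python
import os
import xml.etree.ElementTree as ET

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def external_metadata(path):
    """Collect file metadata that needs no knowledge of the data format.

    Hypothetical helper: only gzip is detected; anything else is "none".
    """
    with open(path, "rb") as f:
        magic = f.read(2)
    return {
        "name": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "compression": "gzip" if magic == GZIP_MAGIC else "none",
    }


def to_xml(meta):
    """Serialize the metadata dict to an XML string (element names are assumed)."""
    elem = ET.Element("file", name=meta["name"])
    ET.SubElement(elem, "size", units="bytes").text = str(meta["size_bytes"])
    ET.SubElement(elem, "compression").text = meta["compression"]
    return ET.tostring(elem, encoding="unicode")
```

Internal metadata (variables, levels, date range) would need a format-specific reader and is left out of this sketch.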
9
Metadata Collection
10
Metadata Collection
o Tools that automatically capture file metadata
    - Integrated with archiving activities
o Web-based GUI - guided entry of dataset discovery metadata
    - Required fields, constrained entries
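A rough sketch of what "required fields, constrained entries" could mean in practice follows. The field names and the list of allowed formats are hypothetical stand-ins, not the RDA's actual schema or the GCMD vocabulary.

```python
# Hypothetical required fields for a dataset discovery record.
REQUIRED_FIELDS = ("title", "summary", "data_format")

# Hypothetical stand-in for a controlled vocabulary of data formats.
ALLOWED_FORMATS = {"NetCDF", "GRIB", "ASCII"}


def validate_entry(entry):
    """Return a list of problems; an empty list means the entry is acceptable."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if value is None or not str(value).strip():
            errors.append(f"missing required field: {field}")
    fmt = entry.get("data_format")
    if fmt and fmt not in ALLOWED_FORMATS:
        errors.append(f"data_format not in controlled vocabulary: {fmt}")
    return errors
```

A web form built on a check like this would refuse to save a record until the returned list is empty.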
11
Relational Databases
12
o Fast access
o Discovery metadata
    - Single database, ~255K rows
o External file metadata
    - Single database, ~45M rows
    - Maintains dataset/data file relationships
o Internal file metadata
    - Four databases structured to handle diversity of data, ~920M rows
    - Supports accurate data discovery by establishing detailed parameter relationships
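The dataset/file and file/parameter relationships can be sketched with a toy schema. Everything here is assumed for illustration: the table and column names, the dataset identifier, and the sample rows are hypothetical, and a single in-memory SQLite database stands in for the separate databases described on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE datasets (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    dataset_id TEXT REFERENCES datasets(id),  -- dataset/data file relationship
    name TEXT,
    size_bytes INTEGER                        -- external file metadata
);
CREATE TABLE file_params (                    -- internal file metadata
    file_id INTEGER REFERENCES files(id),
    parameter TEXT,
    level TEXT,
    start_date TEXT,                          -- ISO 8601 dates compare correctly as text
    end_date TEXT
);
""")
conn.execute("INSERT INTO datasets VALUES (?, ?)", ("ds000.0", "Example reanalysis"))
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?)",
    [(1, "ds000.0", "a.grb", 1000), (2, "ds000.0", "b.grb", 2000)],
)
conn.executemany(
    "INSERT INTO file_params VALUES (?, ?, ?, ?, ?)",
    [
        (1, "temperature", "500 hPa", "1990-01-01", "1990-12-31"),
        (2, "temperature", "500 hPa", "1991-01-01", "1991-12-31"),
        (2, "geopotential height", "500 hPa", "1991-01-01", "1991-12-31"),
    ],
)


def find_files(conn, parameter, start, end):
    """Names of files holding `parameter` with data overlapping [start, end]."""
    rows = conn.execute(
        """
        SELECT DISTINCT f.name
        FROM files f JOIN file_params p ON p.file_id = f.id
        WHERE p.parameter = ? AND p.start_date <= ? AND p.end_date >= ?
        ORDER BY f.name
        """,
        (parameter, end, start),
    ).fetchall()
    return [r[0] for r in rows]
```

Calling `find_files(conn, "temperature", "1990-06-01", "1990-06-30")` selects only the file whose internal metadata covers that month, which is the kind of query that lets a user pick specific files out of a large collection.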
13
User Services
14
User Services
o Data Discovery
    - Google-like dataset search
    - “Look For Data” interface – user-defined dataset catalogs
    - Auto-generated dataset pages – always up-to-date
    - Collections – all reanalyses, upper air obs, surface obs
15
o Data file access
    - Multiple methods; can vary by dataset; dynamic table shows available options
    - “Create Your Own List” for data file lists
    - Show specific files from terabyte-sized collections
    - Download aids
    - Detailed file content views
16
Metadata Sharing
o OAI-PMH
    - UCAR Community Data Portal (THREDDS)
    - Global Change Master Directory (DIF)
    - also Dublin Core, native
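For readers unfamiliar with OAI-PMH, a harvester retrieves metadata by sending HTTP requests that carry a `verb` parameter. The sketch below only builds such request URLs. The base URL is a placeholder, not the RDA's endpoint, and `oai_request_url` is a hypothetical helper. The six verbs and the `oai_dc` (Dublin Core) metadata prefix come from the OAI-PMH 2.0 specification; prefixes for other formats such as DIF are repository-specific.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url, verb, **args):
    """Build an OAI-PMH request URL; extra keyword arguments become query parameters."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    return base_url + "?" + urlencode({"verb": verb, **args})


# Placeholder endpoint, used for illustration only.
url = oai_request_url("https://example.org/oai", "ListRecords", metadataPrefix="oai_dc")
```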
17
Thank You!
Web: http://dss.ucar.edu
Email: dssweb@ucar.edu
Questions/comments?