Research Data Stewardship and Access
Steven Worley, NSF/NCAR/SCD (CISL/SCD)
Cyberinfrastructure meeting with Priscilla Nelson, 29 March 2004


1 Research Data Stewardship and Access
Steven Worley, CISL/SCD
Cyberinfrastructure meeting with Priscilla Nelson and NSF colleagues

2 How is cyberinfrastructure used in this domain?
Harvest data to build RDA content
 – World-wide
Create standard metadata
 – Enable discovery and metadata sharing
Provide data access
 – Internally to NCAR/UCAR
 – Externally to global research community

3 Definition of the RDA
500-plus distinct archived datasets
Continual growth for about 40 years
Each has metadata displayed on a web page
All data on the MSS (primary + backups)
 – 548K files
 – 100.5 TB
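As a rough worked example of the archive figures above, the mean file size can be estimated from the two totals. This is only a sketch: it assumes decimal units (1 TB = 10**12 bytes) and that the 548K file count and the 100.5 TB total cover the same set of files, neither of which the slide states. The names `TOTAL_BYTES`, `FILE_COUNT`, and `mean_file_size_mb` are made up for illustration.

```python
# Totals quoted on the slide. Decimal units are an assumption.
TOTAL_BYTES = 100.5 * 10**12   # 100.5 TB
FILE_COUNT = 548_000           # 548K files


def mean_file_size_mb(total_bytes: float, file_count: int) -> float:
    """Return the mean file size in megabytes (10**6 bytes)."""
    return total_bytes / file_count / 10**6


mean_mb = mean_file_size_mb(TOTAL_BYTES, FILE_COUNT)
print(f"Mean file size: about {mean_mb:.0f} MB")
```

Under these assumptions the mean comes out at roughly 183 MB per file.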

4 Harvest data to build RDA content

5 Harvest data to build RDA content
Current network methods
 – Manual web download
 – Automatic scripted FTP
 – Subscription upload
Commodity internet
Limitations
 – Slow for large volumes
 – Success/failure checks are the responsibility of staff
Future
 – Exploit higher-bandwidth networks
 – Higher-bandwidth tools, ESG… etc
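The success/failure checks listed under Limitations could in principle be scripted. The sketch below is not the RDA's actual procedure: it assumes the data provider publishes an expected byte count and SHA-256 checksum for each file, which the slides do not say. The helper names `sha256_of` and `verify_transfer` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(path: Path, expected_size: int, expected_sha256: str) -> list[str]:
    """Return a list of problems found; an empty list means the file passed.

    expected_size and expected_sha256 are assumed to come from a
    provider-supplied manifest (a hypothetical input).
    """
    if not path.is_file():
        return [f"{path}: file missing"]
    problems = []
    actual_size = path.stat().st_size
    if actual_size != expected_size:
        problems.append(f"{path}: size {actual_size}, expected {expected_size}")
    if sha256_of(path) != expected_sha256.lower():
        problems.append(f"{path}: checksum mismatch")
    return problems
```

A harvest script could call `verify_transfer` after each download and retry or flag any file that returns a non-empty list, so staff would only need to review the failures.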

6 Create standard metadata
Legacy metadata
 – Hardcopy and images
 – Digitally online since about 1980
 – Locally standardized format
Currently
 – Legacy metadata remains available
     Used to derive web pages
 – Transformed to standards used in CDP
 – Incorporated into THREDDS catalogues
     Enable searches across UCAR
Future
 – More detailed metadata for accurate discovery (e.g. file-level metadata)
 – Continue to be exported through CDP and data server systems

7 Provide data access (delivery)
Internally: to NCAR computing systems
Currently, from the NCAR MSS
 – Supercomputer
 – Data analysis systems
 – Divisional computer systems
MSS is a tape-based archive system not designed to be a scalable file server
Future
 – SANs between computer systems and MSS
 – Enable rapid file service and unburden the archive system

8 Internal (MSS) access metrics
Files read for 2004 25K

9 Provide data access (delivery)
Externally: to the internet
Caveat: some NCAR user
Currently, traditional data server
 – Web and FTP downloads
     Most popular data only (166K files, 10.7 TB)
 – Subsetting
     By request and delayed mode processing
Future
 – More traditional services
 – Key datasets available through portals (CDP/ESG)

10 Provide data access (delivery)
Data server (Web and FTP) metrics
Jan.–Feb. 2005 only
 – New system to accurately track users
 – Old system provided “fuzzy” metrics

                January 2005   February 2005
Unique users    517            523
Amount (TB)     1.2            1.8
No. of files    6151           12403

11 Future
Fact
 – Dataset size and complexity are growing; need to handle more data
How?
 – Use advanced networks to harvest rapidly
 – More complete metadata, in a standard
     Improved data discovery and access
     Improved (more efficient) data management
 – Provide critical collections through portals
     Interoperable access through servers (e.g. GDS, etc.)
 – Distributed archives
     Share metadata with other portals (global discovery)

12 Key Case: ERA-40
35 TB collection, 30 distinct product lines
Added about 10 products (computed in SCD)
 – Support Climate Modeling
Metrics for 2004
 – Web & FTP = MSS in Data Amount
 – Over 20 TB delivered
 – 13K files from non-file server MSS

13 Conclusions
We are using basic cyberinfrastructure now
We will use new, proven components in our operations
With cyberinfrastructure we plan to:
 – improve data acquisition, discovery, and access
 – improve our management efficiency
In the process we will:
 – seamlessly integrate new and traditional systems
 – not lose track of critical legacy data and metadata

14 Questions/Discussion

