29 March 2004 Steven Worley, NSF/NCAR/SCD 1 Research Data Stewardship and Access Steven Worley, CISL/SCD Cyberinfrastructure meeting with Priscilla Nelson.

Presentation transcript:

29 March 2004 Steven Worley, NSF/NCAR/SCD 1 Research Data Stewardship and Access Steven Worley, CISL/SCD Cyberinfrastructure meeting with Priscilla Nelson and NSF colleagues

Slide 2: How is cyberinfrastructure used in this domain?
Harvest data to build RDA content
–World-wide
Create standard metadata
–Enable discovery and metadata sharing
Provide data access
–Internally to NCAR/UCAR
–Externally to global research community

Slide 3: Definition of the RDA
500 plus distinct archived datasets
Continual growth for about 40 years
Each has metadata displayed on a web page
All data on the MSS (primary + backups)
–548K files
–100.5 TB

Slide 4: Harvest data to build RDA content

Slide 5: Harvest data to build RDA content
Current network methods
–Manual web download
–Automatic scripted FTP
–Subscription upload
→ Commodity internet
Limitations
–Slow for large volumes
–Success/failure checks are responsibility of staff
Future
–Exploit larger bandwidth networks
–Larger bandwidth tools, ESG… etc
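The slide on harvesting notes that scripted FTP transfers leave success/failure checks to staff. As an illustration only, and not a description of the scripts actually used at NCAR, a transfer script could compare the size the server reports with the size received. The function name `fetch_with_check` is hypothetical; the `ftp` argument is assumed to behave like a Python `ftplib.FTP` connection, which provides `size` and `retrbinary`.

```python
from pathlib import Path


def fetch_with_check(ftp, remote_name, local_path):
    """Download one file and report whether the transfer looks complete.

    `ftp` is assumed to be an ftplib.FTP connection (or any object with
    the same `size` and `retrbinary` methods). Returns a tuple
    (ok, expected_bytes, received_bytes).
    """
    local_path = Path(local_path)
    # SIZE is not supported by every server; ftplib returns None in that case.
    expected = ftp.size(remote_name)
    with local_path.open("wb") as out:
        ftp.retrbinary(f"RETR {remote_name}", out.write)
    received = local_path.stat().st_size
    # Treat the transfer as verified only when the sizes are known and equal.
    ok = expected is not None and received == expected
    return ok, expected, received
```

With a real connection this would be called as `fetch_with_check(ftp, "some_file", "some_file")` after `ftp.login()`; a size match is only a weak check, and a checksum published by the data provider would be stronger where one exists.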

Slide 6: Create standard metadata
Legacy metadata
–Hardcopy and images
–Digitally online since about 1980
–Locally standardized format
Currently
–Legacy metadata remains available
  Used to derive web pages
–Transformed to standards used in CDP
–Incorporated into THREDDS catalogues
  Enable searches across UCAR
Future
–More detailed metadata for accurate discovery (e.g. file level metadata)
–Continue to be exported through CDP and data server systems

Slide 7: Provide data access (delivery)
Internally – to NCAR computing systems
Currently, from the NCAR MSS
–Supercomputer
–Data analysis systems
–Divisional computer systems
→ MSS is a tape-based archive system not designed to be a scalable file server
Future
–SANs between computer systems and MSS
–Enable rapid file service and unburden the archive system

Slide 8: Internal (MSS) access metrics
[Chart: files read, in thousands (K); year and values not preserved in the transcript]

Slide 9: Provide data access (delivery)
Externally – to the internet
Caveat: some NCAR users
Currently, traditional data server
–Web and FTP downloads
  Most popular data only (166 K files, 10.7 TB)
–Subsetting
  By request and delayed mode processing
Future
–More traditional services
–Key datasets available through portals (CDP/ESG)
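To put the data server figures in context, the short calculation below compares the online holdings (166 K files, 10.7 TB) with the full RDA on the MSS (548K files, 100.5 TB, as stated on the RDA definition slide). It assumes decimal units (1 TB = 10^12 bytes), which the slides do not specify. Because the MSS totals include backups, the shares relative to the primary copy alone would be larger than shown here.

```python
# Figures quoted on the slides; decimal units are an assumption.
RDA_FILES = 548_000      # all RDA files on the MSS (primary + backups)
RDA_TB = 100.5
ONLINE_FILES = 166_000   # "most popular data" on the Web/FTP data server
ONLINE_TB = 10.7


def mean_file_mb(total_tb, n_files):
    """Average file size in MB, taking 1 TB = 10**12 bytes and 1 MB = 10**6 bytes."""
    return total_tb * 1e12 / n_files / 1e6


share_of_files = ONLINE_FILES / RDA_FILES
share_of_volume = ONLINE_TB / RDA_TB

print(f"Online share of files:  {share_of_files:.1%}")
print(f"Online share of volume: {share_of_volume:.1%}")
print(f"Mean file size, MSS:    {mean_file_mb(RDA_TB, RDA_FILES):.0f} MB")
print(f"Mean file size, online: {mean_file_mb(ONLINE_TB, ONLINE_FILES):.1f} MB")
```

On these numbers the data server holds roughly 30% of the files but only about 11% of the volume, so the files served online are on average much smaller than those held only on the MSS.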

Slide 10: Provide data access (delivery)
Data server (Web and FTP) metrics
Jan. – Feb Only
–New system to accurately track users
–Old system provided “fuzzy” metrics
[Table: columns January 2005, February 2005; rows Unique Users, Amount (TB), No. Files; values not preserved in the transcript]

Slide 11: Future
Fact
–Dataset size and complexity are growing – need to handle more data
How?
–Use advanced networks to harvest rapidly
–More complete metadata, in a standard
  Improved data discovery and access
  Improved (more efficient) data management
–Provide critical collections through portals
  Interoperable access through servers (e.g. GDS)
–Distributed archives
  Share metadata with other portals (global discovery)

Slide 12: Key Case – ERA
TB collection, 30 distinct product lines
Added about 10 products (computed in SCD)
–Support Climate Modeling
Metrics for 2004
–Web & FTP = MSS in Data Amount
–Over 20 TB delivered
–13K files from non-file server MSS

Slide 13: Conclusions
Are using basic cyberinfrastructure now
Will use new proven components in our operations
With cyberinfrastructure we plan to:
–improve data acquisition, discovery, and access
–improve our management efficiency
In the process we will:
–seamlessly integrate new and traditional systems
–not lose track of critical legacy data and metadata

Slide 14: Questions/Discussion