M. Stockhause et al. Martina Stockhause, Michael Lautenschlager, Frank Toussaint Deutsches Klimarechenzentrum (DKRZ) World Data Centre for Climate (WDCC)

Presentation transcript:

DKRZ: Long-term archiving requirements
Martina Stockhause, Michael Lautenschlager, Frank Toussaint
Deutsches Klimarechenzentrum (DKRZ), World Data Centre for Climate (WDCC)
DKRZ LTA, ESGF Conference 2014

Long-Term Archival at DKRZ (1)
The purpose of long-term archival (LTA) and the IPCC DDC is to provide stable data for long-term interdisciplinary (re-)use:
- permanent and persistent data access
- stable and complete data
- well-documented, high-quality data (for acceptance)
- citable data entities (for credit)

Long-Term Archival at DKRZ (2)
World Data Center for Climate (WDCC) at DKRZ:
- Long-term archive for climate data, especially Earth System Model data
- IPCC Data Distribution Centre (IPCC-DDC) for climate model output
- DOI data publisher since 2004 (first DOI in the DataCite catalog: doi: /WDCC/EH4_OPYC_SRES_A2)

CMIP5 Experience
[Diagram: DataCite DOI publication process at WDC Climate at DKRZ. Components: ESGF index node, ESGF data nodes, CMIP5 data and metadata, temporary storage, QC repository, CIM repository, long-term archive (CERA2 metadata). Steps: (1) transfer of data, external metadata (MD), and use MD by WDCC; (2) technical quality assurance and citation information; (3) long-term archival; (4) DataCite DOI publication.]
The LTA manager operates in a diverse and heterogeneous technical environment that is still under development. Questions to be solved:
- Who is the repository contact?
- How to identify? Is a mapping needed for DRS_ids?
- How to access?
- Who is the data creator?

Requirements for LTA: Identification (1)
- Reliable identification of data and metadata objects by PID and DRS_id (and of persons by ORCID):
  - Use of a controlled vocabulary (CV) for DRS components, e.g. institute, model, experiment
  - Consistent ESGF database over time: persistence of metadata and strict versioning
  - Links provided between data and external metadata
- Verification of data and metadata objects by MD5 checksums
See also: Data Citation Principles
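The MD5 verification named in this slide can be illustrated with a minimal Python sketch. It assumes that a checksum was recorded for each file at publication time; the function names md5_of_file and verify_checksum are hypothetical and are not part of any ESGF or WDCC tool.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_md5):
    """Compare a file's MD5 digest with the checksum recorded at publication.

    The source of expected_md5 (e.g. a metadata record) is an assumption here.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

A mismatch would indicate that the archived object differs from the one that was originally published, for instance after an incomplete transfer or an unversioned replacement.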

Requirements for LTA: QC (2)
Quality control information and data citations are published in ESGF together with the data:
- Quality control:
  - When? Quality control is to be performed as early as possible and in as much detail as affordable, e.g. check at least DRS conformance prior to ESGF publication
  - How? Improve the operability of the QC2 tool
- Citation of data:
  - Collection and ESGF publication of author lists, titles, etc. together with the data
  - If operable: assign a PID to a citation entity
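A DRS conformance check before ESGF publication could look roughly like the following Python sketch. The dotted id layout, the component order in DRS_COMPONENTS, and the sample vocabulary entries are simplifying assumptions made for illustration; a real check would load the project's official controlled vocabulary and follow its full DRS definition. The name check_drs_id is hypothetical.

```python
# Hypothetical controlled vocabulary with a few sample values (illustration only).
CONTROLLED_VOCABULARY = {
    "institute": {"MPI-M", "IPSL"},
    "model": {"MPI-ESM-LR", "IPSL-CM5A-LR"},
    "experiment": {"historical", "rcp45"},
}

# Assumed component order of a simplified, dotted DRS_id (not the full DRS).
DRS_COMPONENTS = ("activity", "product", "institute", "model", "experiment")


def check_drs_id(drs_id):
    """Return a list of conformance problems; an empty list means the id passed."""
    parts = drs_id.split(".")
    if len(parts) != len(DRS_COMPONENTS):
        return [f"expected {len(DRS_COMPONENTS)} components, found {len(parts)}"]
    problems = []
    for name, value in zip(DRS_COMPONENTS, parts):
        allowed = CONTROLLED_VOCABULARY.get(name)
        if not value:
            problems.append(f"{name}: empty component")
        elif allowed is not None and value not in allowed:
            # Only components that have a vocabulary entry are validated.
            problems.append(f"{name}: '{value}' is not in the controlled vocabulary")
    return problems
```

Running such a check on the data node before publication follows the "as early as possible" principle above: a misspelled institute or model name is rejected before it reaches the ESGF index rather than being discovered during long-term archival.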

Requirements for LTA: Organizational (3)
- An operable infrastructure with defined, stable interfaces is required (to ESGF and external repositories)
- Introduce a Data Management Plan defining the data workflow, including quality procedures and ESGF data node manager commitments, e.g. on versioning
- Definition and implementation of a core data subset to prioritize data replication and LTA, e.g. using the ESGF product facet
- Improved interaction with data creators (in CMIP5 they were approached multiple times: by the project manager, CIM, and Quality/Citation)

Requirements for LTA (4)
[Diagram: DataCite DOI publication process at WDC Climate at DKRZ. Components: ESGF index node, ESGF data nodes, MIP data and metadata, temporary storage, QC repository, CIM repository, long-term archive (CERA2 metadata), and other related repositories (e.g. user annotations, version change information, data citation). Steps: (1) get IDs for data and MD access, transfer of data + MD by WDCC; (2) technical quality assurance; (3) long-term archival; (4) DataCite DOI publication.]
The LTA manager is able to collect data and metadata by asking the ESGF index for access information.

- Stockhause (2014): Long-term archiving workflow in CMIP5 – a first review, IS-ENES Workshop on workflow solutions, Hamburg, Germany (PDF).
- Stockhause et al. (2012): Quality assessment concept of the World Data Center for Climate and its application to CMIP5 data, Geosci. Model Dev., 5, 1023–1032, doi: /gmd

Requirements for LTA (4)
[Diagram: the same DataCite DOI publication process (ESGF index node, ESGF data nodes, MIP data and metadata, temporary storage, QC repository, CIM repository, long-term archive with CERA2 metadata, other related repositories), with the data manager's tasks labelled Identification, Access, and Validation.]