European Space Weather Week 3 Brussels, November 13-17, 2006 Atmospheric Data Management - A Challenge - Anne De Rudder and Sue Latham Rutherford Appleton.


1 European Space Weather Week 3 Brussels, November 13-17, 2006 Atmospheric Data Management - A Challenge - Anne De Rudder and Sue Latham Rutherford Appleton Laboratory, UK

2 In 2 or 3 decades, the universe of data has gone from… …to ?

3

4 The BADC http://badc.nerc.ac.uk/
One of the NERC designated Data Centres and a component of the NCAS
Documented long-term data archive (currently about 130 catalogued datasets)
About 8,000 registered users worldwide, among whom 3,000 have applied for access to specific datasets and 2,000 have downloaded data in the past year
Data management in support of NERC research programmes, grants and facilities and, occasionally, of some international research projects
Data are distributed via the web
Assistance to users regarding atmospheric data issues (trajectories, online help desk, visualisation facilities, software, links, …)

5 Contents
Data policies – their purpose and implementation
Model versus observation
Metadata
Citation and publication
Data access networks (grids)
Speaking the same language
A few traps to beware of

6 Data policies
Aims
o Ensuring the swift exchange of knowledge within a research project.
o Ensuring that the newly acquired knowledge, or at least the material on which it relies, is kept for possible future reference, improvement and use, and is made available to the community.
o Ensuring that the data is documented in a way that will allow long-term access to it and understanding of it.
o Ensuring that researchers’ rights are not infringed.
Data management plans
o To implement the principles outlined in the data policy
o To plan how and when data will be generated, shared and stored within a project
o DMPs also include arrangements for the provision of supporting third-party data (e.g. met data from the UK Met Office, provision of NRT data or forecasts to support field campaigns)

7 Data policies
To provide a long-term archive to the community:
o Regular backups on at least two storage media and in two places
o Advertisement of the dataset (dataset catalogue, dataset “publication”)
To ease the exchange of knowledge within the project:
o Submission schedule and deadlines taking into account the synergy between the different groups taking part in the project
o Common format (often seen as a devilish obstacle in our Excel times…)
o Provision of a workspace (e.g. BSCW) to be used as
  o a discussion forum
  o a way to work on common documents
  o a way to validate and format preliminary data

8 Data policies
To ensure that this long-term archive can be read, interpreted and used:
o Use of a worldwide metadata standard (CF Convention)
o Use of formats that allow the metadata to be attached to the data inseparably
o Documentation (metadata) should be as
  o specific
  o accurate
  o explicit
  o complete
  as possible

9 Metadata
To associate with a dataset the key terms that will allow its discovery.
To give all the information needed to read, understand and interpret the data.
Metadata standards
o Integrate a terminology, recommendations on the metadata content and some format considerations
o The Climate and Forecast (CF) Metadata Convention was developed for NetCDF but is largely applicable to information provided with any atmospheric data regardless of its format.
Providing (good) metadata and conforming to metadata standards is a habit that still needs to be acquired…
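As a rough illustration of what "providing good metadata" can mean in practice, the sketch below checks a variable's attributes against a short checklist. The attribute names `standard_name`, `long_name` and `units` are taken from the CF Convention; the checklist itself and the function name `missing_attributes` are assumptions made for this example, not the official CF conformance rules.

```python
# Hypothetical minimal checklist of CF-style variable attributes.
# This is an illustrative assumption, not the CF conformance specification.
CHECKED_ATTRIBUTES = ("standard_name", "long_name", "units")


def missing_attributes(attrs: dict) -> list:
    """Return the checked attribute names that are absent or empty."""
    missing = []
    for name in CHECKED_ATTRIBUTES:
        value = attrs.get(name)
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return missing


# A well-documented variable: explicit name, description and units.
temperature_attrs = {
    "standard_name": "air_temperature",
    "long_name": "Air temperature",
    "units": "K",
}

# A poorly documented variable: a reader cannot tell what "temp" is or its units.
incomplete_attrs = {"long_name": "temp"}

if __name__ == "__main__":
    print(missing_attributes(temperature_attrs))
    print(missing_attributes(incomplete_attrs))
```

A check of this kind could be run at submission time, before a file enters the archive, so that gaps are reported to the data provider while the information is still easy to recover.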

10 Data policies
Protecting researchers’ work and rights: Temporary restriction of access
In order to allow the researchers to be the first ones to analyse and publish their data, while at the same time ensuring some synergy between the different groups participating in the project:
During the project duration or for a certain period of time after the end of the project, access is restricted to the project participants…
o With exceptions for close collaborators or participants in associated projects
o This retention period ranges from 1 to … 10 years!
o Password-protected system
o Modalities of application and of access granting vary (e.g. consultation of PI, list of authorised users, etc.)
… after which the data is released to the public domain.

11 Data policies
Access to restricted data – Authorised Users
Project participants
o Immediate availability
o On application
External Collaborators (during retention period)
o Must apply for access
o Applications channelled through Project PI(s)
Public
o Discovery metadata immediately visible
o Free access to the data after the retention period (sometimes, Conditions of Use continue to apply)
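The access rules on this slide can be sketched as a small decision function. This is a simplified reading, not the BADC implementation: the names `Role`, `AccessRequest` and `can_download` are hypothetical, the retention period is represented by a single `release_date`, and the sketch assumes that project participants must still apply before downloading.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Role(Enum):
    PARTICIPANT = "project participant"
    COLLABORATOR = "external collaborator"
    PUBLIC = "public"


@dataclass
class AccessRequest:
    role: Role
    has_applied: bool = False      # the user has submitted an application
    approved_by_pi: bool = False   # the application was approved via the project PI(s)


def can_download(request: AccessRequest, today: date, release_date: date) -> bool:
    """Return True if the data itself (not only discovery metadata) may be downloaded."""
    if today >= release_date:
        # Retention period is over: data is open to everyone.
        return True
    if request.role is Role.PARTICIPANT:
        # Assumption: participants get immediate access once they have applied.
        return request.has_applied
    if request.role is Role.COLLABORATOR:
        # Collaborators must apply and be approved through the PI(s).
        return request.has_applied and request.approved_by_pi
    # Public users only see discovery metadata during the retention period.
    return False
```

Conditions of Use that continue to apply after release are not modelled here; they would be an agreement attached to the download rather than a yes/no decision.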

12 Data policies
Protecting researchers’ work and rights: Conditions of use and publication
Applying during the project and sometimes after it has ended
Sometimes included in the data files, as a stamp
Committing the user to respect rules such as
o Restricting the use of the data to the research topic stated at the time of application
o Not disclosing the data to other parties
o Contacting the data provider
o Acknowledging the data provider
o Offering co-authorship to the data provider

13 Data policies
Research facility
National programme
International project
Intercontinental initiative

14 Model versus observation
Nobody believes a modelling paper except the author. Everybody believes an observational paper… except the author. (Quoted by David Stevenson, University of Edinburgh, at a UTLS Ozone Science Meeting)
Is there such a clear difference between the two things?
o Is processed or derived data observation or modelling?
o Is a programme “model data”?
For the purpose of data management, Model data =
o any output of model computation (e.g. simulations),
o datasets resulting from some kind of data assimilation technique,
o compilation of observations from different sources (synthesized datasets)
… which have in common that they are likely to be superseded by newer versions more often, or more quickly, than observations are. They are also usually the end-product of a project, while observations are a starting point for further analyses and studies.

15 Model versus observation
BADC Guidelines for the Archival of Simulated Data
Codes archived only as metadata to support model output
Datasets peer-reviewed at regular intervals (a few years)
Criteria to select model runs to be archived for the long term
o Likely future existence of a community of potential users.
o Historical, legal or scientific importance likely to persist.
o The results will be used in an intercomparison exercise.
o Integration of observation data in a way that adds value to the observations.
o The results have been the basis of a publication.
o The results have confirmed or led to some outstanding discovery.

16 Citation and publication
Some projects gather together the worlds of librarians and data scientists, e.g. CLADDIER
To investigate how datasets can be (better)
o versioned
o catalogued
o peer-reviewed
o referenced in papers
o published

17 Citation and publication

18 E-grids
Networks linking several organisations with similar or complementary competences in such a way as to ensure their interoperability.
E.g. network of data repositories, models and computers allowing the user to search and use these resources simultaneously and transparently.
Issues:
o Transfer of information (balance between redundant storage and speed of transfer)
o Authentication (security and access)
o Format conversion
o Vocabulary (metadata standards)

19 E-grids

20 E-grids
The NERC Data Grid (NDG) Project
Infrastructure system to enable the discovery and retrieval of data held at distributed data centres via one single portal
Partners: BADC, BODC, PCMDI (LLNL)
Security issues tackled through “role mapping”, i.e. definition of equivalent authorisations (sparing the user the need to register with each organisation)
A discovery metadatabase already exists, based on MOLES = Metadata Objects for Links in Environmental Science
Further, we intend to make the connection between data held in managed archives and data held by individual research groups seamless, so that the same tools can be used to compare and manipulate data from both sources.
What will be completely new will be the ability to compare and contrast data from an extensive range of (US, European, UK, NERC) datasets from within one specific context.
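The idea of “role mapping” can be pictured as a lookup table of equivalent authorisations between organisations. The sketch below is only an illustration of that idea under stated assumptions: the role names, the table `ROLE_MAP` and the function `map_role` are hypothetical and do not describe the actual NDG security design.

```python
from typing import Optional

# Hypothetical table of equivalent authorisations.
# Key: (home organisation, role held there, target organisation)
# Value: the role regarded as equivalent at the target organisation.
ROLE_MAP = {
    ("BADC", "project_member", "BODC"): "registered_researcher",
    ("BODC", "registered_researcher", "BADC"): "project_member",
}


def map_role(home_org: str, home_role: str, target_org: str) -> Optional[str]:
    """Translate a role held at home_org into its equivalent at target_org.

    Returns None when no equivalence has been agreed, in which case the
    user would have to register with the target organisation directly.
    """
    if home_org == target_org:
        return home_role
    return ROLE_MAP.get((home_org, home_role, target_org))
```

With such a table, a portal only needs to authenticate the user once at the home organisation and can then ask each data centre whether the mapped role is sufficient for the requested dataset.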

21 E-grids

22 Speaking the same language
Standard terminologies
o Sets of terms of reference with, sometimes, unique identifiers (key values), definitions and version numbers
o System of relationships between terms (synonyms, inclusion, related terms)
o Underpin catalogues and search engines
o Ex.: GCMD, CF, SeaDataNet
MOLES (Metadata Objects for Links in Environmental Science):
o The metadata scheme underpinning the NDG discovery tool (based on a set of XML records) and the next BADC catalogue (relational metadatabase)
o Developed in-house
o Integrates tentative mappings between GCMD, CF, SeaDataNet
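A terminology of the kind described above (terms with identifiers, definitions, versions and relationships) can be sketched as a small data structure with a search that follows synonym and inclusion links. Everything in this example is an assumption for illustration: the `Term` class, the keys `ex:1` and `ex:2` and the two sample terms are made up and are not entries from GCMD, CF, SeaDataNet or MOLES.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    key: str                                      # unique identifier (key value)
    label: str
    definition: str = ""
    version: int = 1
    synonyms: set = field(default_factory=set)
    broader: set = field(default_factory=set)     # keys of terms that include this one
    related: set = field(default_factory=set)     # keys of loosely related terms


# Hypothetical mini-vocabulary, for illustration only.
VOCAB = {
    "ex:1": Term("ex:1", "trace gas"),
    "ex:2": Term("ex:2", "ozone", synonyms={"O3"}, broader={"ex:1"}),
}


def find_keys(word: str) -> set:
    """Keys of terms whose label or a synonym equals `word` (case-insensitive)."""
    w = word.lower()
    return {
        t.key
        for t in VOCAB.values()
        if w == t.label.lower() or w in {s.lower() for s in t.synonyms}
    }


def expand(word: str) -> set:
    """Matching keys plus the keys of all narrower terms, followed transitively."""
    found = find_keys(word)
    changed = True
    while changed:
        changed = False
        for t in VOCAB.values():
            if t.key not in found and t.broader & found:
                found.add(t.key)
                changed = True
    return found
```

A catalogue search built on `expand` would return ozone datasets for a query on the broader term, which is the sense in which such relationships underpin search engines; a mapping between two vocabularies could be stored in the same way, as pairs of keys.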

23 Lessons learnt and traps to avoid
Envisage the data policy at an early stage of a project proposal, taking into account already running projects that may become associated or involved.
Design and develop an open standard terminology with direct input from the researchers and carefully thought-out relationships between terms.
Do not try to build a terminology that covers everything, but focus on the vocabulary needed in your community.
Resist the temptation to replace tools (software, applications, conceptual tools) every time a shiny new one is launched on the market.

