VO Sandpit, November 2009 Metadata for Data Discovery: The NERC Data Catalogue Service Steve Donegan.

Introduction
– NERC, Science and Data Centres
– NERC Discovery Metadata: The Data Catalogue Service
– NERC Data Services
– Case study: generating Metadata and doing something useful with it!

NERC
– Main UK body for funding research, training and knowledge exchange in environmental sciences
– Annual budget £388m (2011)
– Covers atmosphere, earth, terrestrial and aquatic sciences
– Research ships and aircraft, satellite technology

What sort of data do we deal with?
A variety of environmental measurements, along with the results of model simulations.

NERC Designated Data Centres
The UK’s Natural Environment Research Council (NERC) funds eight data centres, which between them are responsible for the long-term management of NERC's environmental data holdings.

The role of the data centres
NERC funds research projects, which produce data. It is essential that these data are properly managed to ensure their long-term availability.
NERC’s network of data centres provides support and guidance in data management to those funded by NERC, is responsible for the long-term curation of data, and provides access to NERC's data holdings.
The NERC Data Policy details NERC's commitment to support the long-term management of data and also outlines the roles and responsibilities of all those involved in the collection and management of data.
We are also involved in externally funded projects in informatics, e-Science and domain-specific areas.

Changing and conflicting user demands
There is a tension between the requirements of different users.
– Scientists / NERC:
  – Want raw data in its original format
  – Require long-term stewardship of data
  – Want as much contextual detail as possible
– Government Agencies / Knowledge Exchange:
  – Use environmental information to drive policy making
  – Prefer real-time data delivery
  – Require derived products that address specific questions
  – Need to synthesise data from many different sources in order to reach a decision
  – Quality control is critical!

Legislation and technical changes
– Open standards for geospatial data and services promise a new level of interoperability between data providers
– The EU INSPIRE directive requires us to provide data discovery, view and download services
– INSPIRE is an Infrastructure for Spatial Information within Europe for the purposes of Community environmental policies and policies or activities which may have an impact on the environment
– As NERC data is within the UK public domain and many of its data holdings have a geospatial component, by law NERC must produce metadata that is compliant with the EU INSPIRE directive
– Data interoperability and data sharing are prime objectives for INSPIRE, and these are underpinned by a specification for metadata used for Data Discovery within INSPIRE
– INSPIRE discovery metadata is based on the ISO19115/19119 Application Profile (metadata for geographic information), with a definition of the core metadata elements from this profile that are required for INSPIRE compliance

Discovery Metadata to satisfy all requirements
– NERC requires research/data centres to be able to generate, on demand, consistent discovery metadata describing NERC’s data assets
– Compliance with this standard helps to ensure that NERC’s data assets are consistently discoverable, and aids in the generation and operation of services that utilise these assets across the NERC disciplines
– NERC metadata must also accommodate and comply with international standards and directives
– Metadata providers must have the capability to produce metadata conforming to this standard, under

Discovery Metadata to satisfy all requirements
NERC Data Management Advisory Group (DMAG)
– The ISO standards 19115 and 19119 define metadata schemas adequate for describing data resources held by NERC
– For communication purposes, the ISO19115/19119 metadata can be serialised and encoded as XML using the ISO standard 19139
– NERC produced a profile of ISO19115
But before official adoption... NERC SIS Group review
– MEDIN Discovery Metadata Standard (MEDIN: Marine Environmental Data and Information Network)
– Some NERC DDCs are MEDIN partners
– MEDIN is largely conformant with INSPIRE and Gemini2, but with specialisms for the marine community (e.g. SeaDataNet keywords)
– Decided to base the NERC “standard” on the MEDIN Discovery Standard, but with exceptions/additions for NERC-specific areas (e.g. how do you define vertical extent for butterfly counts?)
– MEDIN community has published Schematron and metadata tools to support the standard
– Datasets, Series and Services!
– Adopt straight UK Gemini?

The NERC Data Catalogue Service
– The NERC DCS aims to provide a searchable interface to “published” discovery records from NERC DDCs
– Provides the ability to conduct a simple text, geographic and/or temporal search
– Advanced search option allows structuring of complex queries: search for the term “ozone” but NOT if associated with the term “depletion”
– Results are returned with basic information rendered from the discovery metadata: links back to the DDC, further information, download service etc.
– Currently datasets and series
– Uses the NERC Vocabulary Service for added content/dissemination
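
The combination of text, geographic and temporal search, including the "ozone but NOT depletion" query, can be illustrated with a minimal Python sketch. It is not the DCS implementation: the `DiscoveryRecord` fields, the `search` function and the sample records are all hypothetical, and the bounding-box test ignores boxes that cross the date line.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north) in degrees


@dataclass
class DiscoveryRecord:
    """Hypothetical, flattened view of one discovery record."""
    identifier: str
    text: str   # title, abstract and keywords joined together
    bbox: BBox
    start: date
    end: date


def _boxes_overlap(a: BBox, b: BBox) -> bool:
    # Simple overlap test; does not handle boxes crossing the date line.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records: List[DiscoveryRecord], include: str,
           exclude: Optional[str] = None,
           bbox: Optional[BBox] = None,
           period: Optional[Tuple[date, date]] = None) -> List[str]:
    """Return identifiers of records matching every supplied criterion."""
    hits = []
    for rec in records:
        text = rec.text.lower()
        if include.lower() not in text:
            continue  # required term missing
        if exclude and exclude.lower() in text:
            continue  # the NOT clause
        if bbox and not _boxes_overlap(rec.bbox, bbox):
            continue  # outside the area of interest
        if period and not (rec.start <= period[1] and period[0] <= rec.end):
            continue  # no temporal overlap
        hits.append(rec.identifier)
    return hits


# Made-up sample records, for illustration only.
RECORDS = [
    DiscoveryRecord("r1", "Total column ozone from satellite",
                    (-180, -90, 180, 90), date(1996, 1, 1), date(2003, 12, 31)),
    DiscoveryRecord("r2", "Antarctic ozone depletion model runs",
                    (-180, -90, 180, -60), date(1980, 1, 1), date(2000, 12, 31)),
    DiscoveryRecord("r3", "North Sea temperature profiles",
                    (-4, 51, 9, 61), date(1990, 1, 1), date(2009, 1, 1)),
]

print(search(RECORDS, "ozone", exclude="depletion"))
```

A production catalogue would normally push these filters into the database query rather than loop over records in memory; the loop is used here only to make each criterion visible.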

Data Services: Services need discovering too!

Developing a Portal
– NERC needed to replace the previous NERC Discovery Service, which was limited by its metadata content (GCMD DIF, interoperability issues, services etc.)
– Developed as part of the NERC Data Grid (NDG) activity; consisted of a portal connected to a metadata catalogue, all located at NEODC/CEDA
– NERC SIS recommended not only adopting the MEDIN Discovery Standard but also using the existing MEDIN Discovery Portal and web service
– MEDIN portal uses the Discovery Web Service (DWS) developed by NEODC/CEDA to search a metadata catalogue derived from discovery metadata harvested from data providers
– Based on the previous-generation NERC Discovery Portal but adapted for ISO19139 rather than GCMD DIF
– More powerful “targeted” keyword and text searches
– Distributed architecture: DWS runs on the catalogue at NEODC/CEDA whilst the portal is located at Geodata in Southampton
– NERC Data Catalogue Service adapted for NERC-style MEDIN records but with added targeted text search etc.
– DWS/Catalogue runs at NEODC/CEDA and the DCS portal at BODC

Developing a Portal (NERC model)
[Diagram labels: Metadata Providers; OAI-PMH, OGC CSW, WAF; Data Providers Web Service (DPWS); Metadata Catalogue (PostgreSQL); Discovery Web Service (DWS)]

Harvesting the metadata
OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
– Providers and Harvesters
– A harvester takes full XML metadata and returns a copy to the local environment
– Any format; however, Dublin Core must be provided to be OAI-PMH compliant
– Support for deleted records, detection of changed records, regular harvesting
– Works via HTTP
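
The harvesting cycle can be sketched in Python as follows. This is a rough illustration rather than the harvester used for the NERC catalogue: it builds an OAI-PMH `ListRecords` request and separates live from deleted records in a response. The endpoint `https://example.org/oai` and the sample response are made up, and no network call is made; the verb, argument names, namespace and `status="deleted"` header attribute come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH provider."""
    if resumption_token:
        # resumptionToken is an exclusive argument: send it on its own.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (live identifiers, deleted identifiers, next token or None)."""
    root = ET.fromstring(xml_text)
    live, deleted = [], []
    for record in root.iterfind(".//oai:record", OAI_NS):
        header = record.find("oai:header", OAI_NS)
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    token = root.findtext(".//oai:resumptionToken", default="",
                          namespaces=OAI_NS)
    return live, deleted, token.strip() or None


# Made-up response, trimmed to the parts the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:ds1</identifier>
        <datestamp>2009-11-01</datestamp>
      </header>
      <metadata/>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:ds2</identifier>
        <datestamp>2009-11-02</datestamp>
      </header>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai", from_date="2009-11-01"))
print(parse_list_records(SAMPLE_RESPONSE))
```

A regular harvest would repeat the request with the returned resumption token until none is returned, then remove the deleted identifiers from the local catalogue.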

Developing a Portal (MEDIN model)
[Diagram labels: Metadata Providers; OAI-PMH, OGC CSW, WAF; Data Providers Web Service (DPWS); Metadata Catalogue (PostgreSQL); Discovery Web Service (DWS); OGC CSW (GeoNetwork)]

Portal future developments
– UK Location is implementing the UK’s response to INSPIRE
– All in-scope records must be published to the UK Location Portal via OGC CSW (Catalog Service for the Web) or via WAF (Web Accessible Folder)
– CSW: The Catalog Service defines common interfaces to discover, browse, and query metadata about data, services, and other potential resources. Open-source solution: GeoNetwork
– MEDIN solution for compliance is to run a parallel CSW to the MEDIN DWS with identical content
– NERC solution is for all DDCs to replace OAI-PMH with local GeoNetwork instances, with one “core” CSW that supports a Discovery portal using a federated search. The core CSW is also the publishing point to UK Location
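
To give a feel for the CSW query interface, here is a minimal Python sketch that builds a CSW 2.0.2 `GetRecords` request as a key-value-pair URL. The endpoint `https://example.org/csw` and the function name are hypothetical, and a real server may expect different `typeNames` or an explicit `outputSchema`, so treat the parameter choices as assumptions to check against the target catalogue.

```python
from urllib.parse import urlencode


def get_records_url(endpoint, any_text, start=1, max_records=10,
                    output_schema=None):
    """Build a CSW 2.0.2 GetRecords URL with a free-text CQL constraint."""
    # Double any single quotes so the term stays inside the CQL string literal.
    term = any_text.replace("'", "''")
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"AnyText LIKE '%{term}%'",
        "startPosition": start,
        "maxRecords": max_records,
    }
    if output_schema:
        # e.g. "http://www.isotc211.org/2005/gmd" to ask for ISO records
        params["outputSchema"] = output_schema
    return endpoint + "?" + urlencode(params)


print(get_records_url("https://example.org/csw", "ozone"))
```

Paging through a result set would repeat the call with an increasing `start` value.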

Developing a Portal (future NERC model)
[Diagram labels: four Metadata Providers, each with an OGC CSW; core OGC CSW (GeoNetwork); Federated Searches]
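
One way to picture a federated search across several provider catalogues is the following Python sketch. It is only an assumption about how such a search might be organised, not the NERC design: each catalogue is represented by a hypothetical callable that takes a search term and returns a list of hits, results are merged by identifier, and catalogues that fail are reported rather than aborting the whole search.

```python
from typing import Callable, Dict, List, Tuple

# A catalogue is modelled as a function: search term -> list of hit dicts.
Catalogue = Callable[[str], List[dict]]


def federated_search(catalogues: Dict[str, Catalogue],
                     term: str) -> Tuple[List[dict], List[str]]:
    """Query every catalogue and merge hits, de-duplicated by 'id'."""
    merged: Dict[str, dict] = {}
    failed: List[str] = []
    for name, query in catalogues.items():
        try:
            hits = query(term)
        except Exception:
            failed.append(name)  # one unavailable provider should not stop the rest
            continue
        for hit in hits:
            # Keep the first copy seen and remember which catalogue supplied it.
            merged.setdefault(hit["id"], {**hit, "source": name})
    return list(merged.values()), failed


# Stub catalogues standing in for remote CSW endpoints.
def catalogue_a(term):
    return [{"id": "ds1", "title": "Ozone profiles"}]


def catalogue_b(term):
    return [{"id": "ds1", "title": "Ozone profiles"},
            {"id": "ds2", "title": "Ozone sondes"}]


def catalogue_down(term):
    raise ConnectionError("catalogue unavailable")


results, failures = federated_search(
    {"a": catalogue_a, "b": catalogue_b, "c": catalogue_down}, "ozone")
print(results, failures)
```

In practice the calls would be made concurrently with timeouts, since the slowest provider otherwise sets the response time of the portal.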

CEDA Case Study
CEDA: Centre for Environmental Data Archival
– NERC Earth Observation Data Centre (NEODC)
– British Atmospheric Data Centre (BADC)
– UK Solar System Data Centre (UKSSDC)
Located at STFC Rutherford Appleton Laboratory, Oxfordshire
Actively participates in NERC e-infrastructure projects: NERC Data Grid, INSPIRE LMO, OGC, ISIC, and much more.
Data centres publish data to the NERC DCS; CEDA also runs the harvesting, catalogue and DWS operations supporting the portal (at BODC)
But how does a data centre generate metadata and get it published?

CEDA Metadata Catalogue
All of CEDA's data holdings are catalogued in a database according to a data model (MOLES2/3). This model describes various aspects of the data:
– What is it? (e.g. instrument, format, model, service)
– Where and when is it? (e.g. spatial coverage, date range/times)
– Who owns it/where did it come from? (e.g. who created the dataset? Restrictions on usage: UK only?)
– What can it do? (e.g. is it available in a visualisation service? Any legal aspects?)
– Any associated resources? (e.g. keyword or parameter names, links to the original data provider site, documentation, web service endpoints)
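
The questions above map naturally onto fields of a catalogue entry. The Python sketch below is a loose illustration of that idea and is not the MOLES schema: the class name, field names and the completeness check are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue entry grouped by the questions it answers."""
    identifier: str
    # What is it?
    kind: str = ""            # e.g. "instrument", "model", "service"
    data_format: str = ""
    # Where and when is it?
    bbox: Optional[Tuple[float, float, float, float]] = None  # west, south, east, north
    start: Optional[date] = None
    end: Optional[date] = None
    # Who owns it / where did it come from?
    creator: str = ""
    usage_restrictions: str = ""
    # What can it do?
    service_endpoints: List[str] = field(default_factory=list)
    # Any associated resources?
    keywords: List[str] = field(default_factory=list)
    documentation_links: List[str] = field(default_factory=list)

    def missing_for_discovery(self) -> List[str]:
        """Names of the (assumed) minimum fields that are still empty."""
        required = {
            "kind": self.kind,
            "bbox": self.bbox,
            "start": self.start,
            "creator": self.creator,
            "keywords": self.keywords,
        }
        return [name for name, value in required.items() if not value]


entry = CatalogueEntry("example-dataset", kind="instrument", creator="Example Group")
print(entry.missing_for_discovery())
```

A check like `missing_for_discovery` could run before a record is published, so that incomplete entries are caught inside the data centre rather than by the portal.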

CEDA Metadata Catalogue
Information in the data catalogue is created by a combination of manual entry by Data Scientists and information taken from the data itself during ingestion and placement on the CEDA archive.
Metadata in the catalogue is used for a variety of purposes:
– Provide a resource to generate metadata for external consumption, e.g. to aid data “discovery” or to allow data/CEDA services to be used in external resources (e.g. WFS, WMS etc.)
– Provide an accurate, up-to-date description of each dataset and any related issues as a resource for the community
– Reference: allow citation of dataset (DOI)
– Dataset management
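
Generating discovery metadata from catalogue content might look something like the Python sketch below, which writes a small ISO 19139-style XML fragment. It is not CEDA's generator, and the output is deliberately incomplete: a schema-valid record needs further mandatory elements (contact, date stamp, citation date and so on). The function name and sample values are hypothetical; the element names and namespaces are those of the `gmd` and `gco` schemas.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def _char_string(parent, tag, text):
    """Add <gmd:tag><gco:CharacterString>text</gco:CharacterString></gmd:tag>."""
    element = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(element, f"{{{GCO}}}CharacterString").text = text
    return element


def discovery_xml(identifier, title, abstract):
    """Serialise a few catalogue fields as a partial gmd:MD_Metadata record."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    _char_string(root, "fileIdentifier", identifier)
    ident_info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_ident = ET.SubElement(ident_info, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_ident, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    _char_string(ci_citation, "title", title)
    _char_string(data_ident, "abstract", abstract)
    return ET.tostring(root, encoding="unicode")


xml_text = discovery_xml("example-id-001", "Example ozone dataset",
                         "Made-up abstract for illustration.")
print(xml_text)
```

The same catalogue fields could feed other output formats (for example a DataCite record) by swapping the serialisation function.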

[Diagram labels: Archive; Catalogue (MOLES); CEDA Info; Data Scientist; Automatic Update; 3rd Party Data providers; Data Suppliers; XML Generation: Discovery XML, DataCite XML, CSML/WMS/WFS Service metadata; Publicly Visible: OAI-PMH, OGC CSW, Web Accessible Folder (TBC); External Users]
NERC Catalogue Service, DataCite, UK Location Portal, Go-Geo, MEDIN Portal, INSPIRE…: all use metadata from the CEDA metadata publishing layer