ECMWF WMO Metadata Workshop – Beijing Sep 2005 Experience with the WMO core metadata in the SIMDAT/VGISC project Baudouin Raoult ECMWF.

Presentation transcript:

ECMWF WMO Metadata Workshop – Beijing Sep 2005 Experience with the WMO core metadata in the SIMDAT/VGISC project Baudouin Raoult ECMWF

The SIMDAT/VGISC project
SIMDAT
  - EU-funded Grid project
  - 7 technologies: Grid infrastructure, Virtual Organisation, Ontologies, Analysis Services, Workflows, Distributed data access, Knowledge Services
  - 4 activities: Automotive, Aerospace, Pharmacy and Meteorology
Meteorology activity: build a Virtual GISC (V-GISC)
  - DWD
  - UKMO
  - Météo-France
  - EUMETSAT
  - ECMWF

V-GISC infrastructure

V-GISC conceptual view
Through the Distributed Portal, users search for and retrieve data and subscribe to services, subject to authentication and authorization
The Virtual Database Service provides a single view of the partners' databases

VGISC Distributed Architecture

VGISC Node Functional Design

Why do we need metadata (in this project)?
Create a catalogue (discovery metadata)
  - Searchable (keyword, geographical location, time range)
  - Browsable (directory hierarchy)
Implement the V-GISC (service metadata)
  - Describe where the data resides (physical location)
  - Describe how to request the data
  - Describe the data format (useful for offering a list of transformations, e.g. sub-sampling of gridded data, plots or format conversions)
  - Describe associated data policies

Study of the WMO core
Starting point
  - XML files available on the WMO web site
  - XML files from an earlier DWD prototype
  - Trying to describe the ECMWF archive (GRIB fields)

XML root element
Namespaces are a nightmare to use (especially with XPath when there is a default namespace)
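
The default-namespace difficulty mentioned on this slide can be reproduced in a few lines. The sketch below uses Python's standard `xml.etree.ElementTree`; the namespace URI and element names are hypothetical stand-ins, not taken from the WMO core files.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, for illustration only.
NS_URI = "http://example.org/metadata"
DOC = f"""<Metadata xmlns="{NS_URI}">
  <identificationInfo>
    <title>Synop Heathrow</title>
  </identificationInfo>
</Metadata>"""

root = ET.fromstring(DOC)

# An unqualified path finds nothing: every element lives in the default namespace.
unqualified = root.find("identificationInfo/title")

# The same path works once each step is bound to a prefix of our choosing.
ns = {"m": NS_URI}
qualified = root.find("m:identificationInfo/m:title", ns)

print(unqualified)
print(qualified.text)
```

The document itself never uses a prefix, yet every step of the query has to carry one, which is what makes paths written for one root declaration fail silently against another.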

XML keywords
Russian Federation Moscow region Temperature Clouds Meteorology Observation Pressure Rainfall Snow Snowfall Weather Wind Phenomenon
Or…
EARTH SCIENCE > Cryosphere > Sea Ice
EARTH SCIENCE > Atmosphere
EARTH SCIENCE > Oceans
EARTH SCIENCE > Solid Earth
ocean, atmosphere, ice, land
Or…
METAR aviation hourly weather observation temperature dew point precipitation amount visibility cloud amount type height weather runway colour state

XML geographical extent
Or…
CCCC2
Or…

XML temporal extent
monthly daily
Or…
T00:00: T06:00:00
Or…
creationDate

Repetition of XML elements (means extension)
mb
Global
monthly daily

Repetition of XML elements (means redefinition)
Global Grid 2.5 degree latitude and 2.5 degree longitude steps, 6 sectors, one sector per GRIB bulletin
Sector S
Global Grid 2.5 degree latitude and 2.5 degree longitude steps, 6 sectors, one sector per GRIB bulletin
Sector T

Findings
A flexible format that leads to a lack of consistency
  - Different ways to encode geographical extents, keywords and temporal extents
Missing information (for the V-GISC)
  - To create a directory
  - To locate the data
  - To create retrieval requests
  - To describe available transformations
  - To implement data policies

Findings (cont.)
Seems to be designed for human consumption
  - Free text in XML elements
Not scalable
  - Some documents may change frequently (hourly?)
  - Some documents are orders of magnitude larger than the data itself
  - Cannot represent very large archives with small granularity

SIMDAT/VGISC problem
Each site has its own practices
  - We have to be ready for variability in the XML
  - We will have to handle XML from other WMO programmes
We need to handle tens of thousands of documents
  - A lot of repeated information
  - We need fast search
We need to automatically
  - Index the keywords, the geographical extent and the temporal extent
  - Create a browsable directory (similar to NCAR's Community Data Portal)
  - Locate and retrieve the data
  - Implement the data policy

Solution: split XML documents into fragments
WMO core metadata is structured
Some parts are shared amongst many documents
  - All metadata share the Core part
  - All UKMO metadata share the Owner part
  - All synops (should) share the same description
  - All observations at Heathrow have the same location
  - The date part is variable but is very small
[Diagram: WMO (Core), UKMO (Owner), Synop (Data type), Heathrow (Station, geographical extent), Date (temporal extent)]

XML fragments are hierarchically linked
[Diagram: fragments WMO, UKMO, Synop and Heathrow linked in a hierarchy]

Fragments: advantages
Factorizing commonalities into static fragments
  - Reduces the size of XML documents
  - Indexing done once
Avoid redundancy of information
  - Faster searches
Frequently updated documents are small
  - Manageable
  - Scalable
Complete XML document can be rebuilt
  - For exchange outside the V-GISC
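
As a rough illustration of the fragment idea, the sketch below stores each fragment once with a link to its parent and rebuilds a full record by walking the chain. It is a simplification under stated assumptions: fragment contents are plain dictionaries rather than XML, the `Fragment` class, `compose` function, URNs and date value are all hypothetical, and letting a child override its parent is an assumed merge rule.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fragment:
    urn: str
    parent: Optional[str]          # URN of the parent fragment, None at the root
    content: dict = field(default_factory=dict)


def compose(urn: str, store: dict) -> dict:
    """Rebuild a full record by merging fragments from the root down to `urn`."""
    chain = []
    current = urn
    while current is not None:
        frag = store[current]
        chain.append(frag)
        current = frag.parent
    full = {}
    for frag in reversed(chain):   # root first, so children override parents
        full.update(frag.content)
    return full


# Hypothetical fragments mirroring the WMO > UKMO > Synop > Heathrow > Date chain.
store = {f.urn: f for f in [
    Fragment("wmo", None, {"standard": "WMO core"}),
    Fragment("ukmo", "wmo", {"owner": "UKMO"}),
    Fragment("synop", "ukmo", {"data_type": "Synop"}),
    Fragment("heathrow", "synop", {"station": "Heathrow"}),
    Fragment("heathrow.date", "heathrow", {"date": "2005-09-27"}),
]}

record = compose("heathrow.date", store)
print(record)
```

Only the small date fragment changes from one observation to the next; the four shared fragments are stored and indexed once.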

Indexing of XML fragments
[Diagram: fragments WMO, UKMO, Synop, Heathrow; indexes Keywords, Geographical Extent, Temporal Extent]

Prototype implementation
XML fragments are stored as "text"
  - Fragment table
  - Hierarchy table
Indexed at insertion time
  - Keywords table
  - Locations table
  - Periods table
  - Directory table
Implemented with MySQL
  - With OpenGIS extension
  - With text search extension
Indexes are "inherited"
  - OO approach
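
The table layout above can be approximated in a few lines. This sketch swaps in SQLite (via Python's standard `sqlite3`) for MySQL so that it runs without a server, and it covers only the fragment, hierarchy and keywords tables. Column names, sample rows and the recursive query are assumptions; the slides do not show the actual schema or how inheritance was implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fragment  (id TEXT PRIMARY KEY, xml TEXT);   -- fragment stored as text
CREATE TABLE hierarchy (child TEXT, parent TEXT);         -- one row per link
CREATE TABLE keywords  (fragment_id TEXT, keyword TEXT);  -- filled at insertion time
""")
conn.executemany("INSERT INTO fragment VALUES (?, ?)", [
    ("wmo", "<core/>"), ("ukmo", "<owner/>"),
    ("synop", "<datatype/>"), ("heathrow", "<station/>"),
])
conn.executemany("INSERT INTO hierarchy VALUES (?, ?)", [
    ("ukmo", "wmo"), ("synop", "ukmo"), ("heathrow", "synop"),
])
conn.executemany("INSERT INTO keywords VALUES (?, ?)", [
    ("synop", "temperature"), ("heathrow", "london"),
])


def search(keyword):
    """Return fragments indexed under `keyword`, directly or via an ancestor."""
    rows = conn.execute("""
        WITH RECURSIVE hit(id) AS (
            SELECT fragment_id FROM keywords WHERE keyword = ?
            UNION
            SELECT h.child FROM hierarchy h JOIN hit ON h.parent = hit.id
        )
        SELECT id FROM hit ORDER BY id
    """, (keyword,)).fetchall()
    return [r[0] for r in rows]


print(search("temperature"))
print(search("london"))
```

The keyword is indexed once on the shared fragment, and descendants are found through the hierarchy table rather than by duplicating index rows.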

Object Oriented Approach - Behaviours
[Diagram: fragments WMO, UKMO, Synop, Heathrow; behaviours Index as geography, Index as keyword, Index as period]

Fragment properties - Behaviours
Only the owner of the data knows how to:
  - Describe the data (indexing information)
  - Request the data (create internal request)
  - Extract a subset of the data (define an interface to extract a subset)
Ancillary metadata can be associated with each fragment to describe how to index, request and sub-select the data
Behaviours are inherited
  - Object-oriented approach
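
One way to picture inherited behaviours is a lookup that climbs the fragment hierarchy until it finds a definition. The sketch below is an illustration only: the `Fragment` class, the behaviour names and the strings they return are hypothetical, and the prototype's actual mechanism is not shown in the slides.

```python
class Fragment:
    """A fragment with optional behaviours and a link to its parent."""

    def __init__(self, name, parent=None, behaviours=None):
        self.name = name
        self.parent = parent
        self.behaviours = behaviours or {}

    def behaviour(self, kind):
        """Return the nearest definition of `kind`, searching up the hierarchy."""
        node = self
        while node is not None:
            if kind in node.behaviours:
                return node.behaviours[kind]
            node = node.parent
        return None


# Hypothetical behaviours: each is a function of the fragment it is applied to.
wmo = Fragment("WMO", behaviours={
    "index": lambda f: f"index {f.name} as keyword",
})
ukmo = Fragment("UKMO", wmo, {
    "request": lambda f: f"request {f.name} from UKMO archive",
})
synop = Fragment("Synop", ukmo)
heathrow = Fragment("Heathrow", synop, {
    "index": lambda f: f"index {f.name} as geography",
})

print(synop.behaviour("index")(synop))
print(heathrow.behaviour("index")(heathrow))
print(heathrow.behaviour("request")(heathrow))
```

A fragment that defines nothing still knows how to be indexed and requested, because the data owner's definition higher in the chain applies to it.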

Behaviours example: indexing
//identificationInfo/descriptiveKeywords
//identificationInfo/dataExtent/geographicElement/boundingBox
//identificationInfo/dataExtent/geographicElement/polygon
//identificationInfo/referenceDate/date
//identificationInfo/dataExtent/temporalElement
//identificationInfo/referenceDate/period
//identificationInfo/topicCategory
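
The paths above can drive indexing directly. The sketch below applies them with Python's standard `xml.etree.ElementTree`, which needs a leading `.` before `//`. The grouping of paths into keyword, geography and period indexes is an assumption (the slide's own mapping did not survive in the transcript), and the sample document is hypothetical and namespace-free for brevity.

```python
import xml.etree.ElementTree as ET

# Paths are those listed on the slide; assigning them to these three
# indexes is an assumption made for this sketch.
INDEX_RULES = {
    "keyword": [
        ".//identificationInfo/descriptiveKeywords",
        ".//identificationInfo/topicCategory",
    ],
    "geography": [
        ".//identificationInfo/dataExtent/geographicElement/boundingBox",
        ".//identificationInfo/dataExtent/geographicElement/polygon",
    ],
    "period": [
        ".//identificationInfo/referenceDate/date",
        ".//identificationInfo/dataExtent/temporalElement",
        ".//identificationInfo/referenceDate/period",
    ],
}

# Hypothetical fragment content.
DOC = """<metadata>
  <identificationInfo>
    <descriptiveKeywords>temperature</descriptiveKeywords>
    <topicCategory>climatologyMeteorologyAtmosphere</topicCategory>
    <referenceDate><date>2005-09-27</date></referenceDate>
  </identificationInfo>
</metadata>"""


def extract(root):
    """Collect the text found under each indexing path, grouped by index."""
    found = {}
    for index, paths in INDEX_RULES.items():
        values = []
        for path in paths:
            for element in root.findall(path):
                text = "".join(element.itertext()).strip()
                if text:
                    values.append(text)
        found[index] = values
    return found


entries = extract(ET.fromstring(DOC))
print(entries)
```

Keeping the paths in data rather than in code means a site with different encoding habits can supply its own rules without changing the indexer.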

extension
An element from the “…” namespace is embedded in all the fragments
It contains all the information needed to implement the V-GISC that is not defined by the WMO core because it is not relevant outside the scope of the V-GISC
  - Internal unique ID
  - Hierarchy relationship
  - Physical location (which V-GISC node holds the data)
  - Information used to create data requests
  - Information used to create web pages
It is removed when the full XML document is recomposed for use outside the V-GISC
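
Removing a namespaced extension before export is straightforward once the extension lives in its own namespace. The sketch below shows the idea with Python's standard `xml.etree.ElementTree`; the namespace URI, the `internal`, `id` and `node` element names and the sample values are hypothetical, since the real names are not preserved in this transcript.

```python
import xml.etree.ElementTree as ET

EXT_NS = "http://example.org/vgisc-extension"   # hypothetical namespace URI

# Hypothetical fragment carrying an internal extension element.
FRAGMENT = f"""<metadata xmlns:x="{EXT_NS}">
  <x:internal>
    <x:id>urn:example.synop</x:id>
    <x:node>node-a</x:node>
  </x:internal>
  <identificationInfo>
    <title>Synop</title>
  </identificationInfo>
</metadata>"""


def strip_extension(root, namespace):
    """Remove, in place, every element that belongs to `namespace`."""
    prefix = "{" + namespace + "}"
    for parent in list(root.iter()):       # snapshot, since the tree is mutated
        for child in list(parent):
            if child.tag.startswith(prefix):
                parent.remove(child)
    return root


root = strip_extension(ET.fromstring(FRAGMENT), EXT_NS)
remaining = [el.tag for el in root.iter()]
print(remaining)
```

Because the test is on the namespace rather than on individual element names, new internal fields can be added to the extension without touching the export step.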

Fragment example
urn:akrotiri.synop.land.second.record
urn:akrotiri
urn:int.wmo.synop.land.second.record
ecmwf.obs

Variables and requests
Some datasets have too many items
  - Impossible to describe every one of them
  - But describing the whole dataset is simple
Some datasets are very homogeneous
  - E.g. same parameters for a long period of time
  - This can be described in a compact form (… and …)
  - But we still need to specify that individual dates can be requested by the user

Variables and requests (cont.)
Associate two elements with an XML fragment:
  - … holds the information needed to generate a valid request to the data repository
  - … holds information on how to create a web interface that lets the user select items from the dataset
Web portal
  - We use the WMO core for discovery
  - We use the … element to present selection dialogues to the user

Fragment example: ECMWF Reanalysis
urn:int.ecmwf.era40.sfc
urn:int.wmo.core
ecmwf.mars
e4 sfc marser t msl
ECMWF 40 Years reanalysis ERA40 ERA-40 in GRIB
NWP Outputs > ECMWF > 40 years reanalysis
…

Directory structure
Problem: create a browsable hierarchy of topics, like the "Google directory" (see NCAR's Community Data Portal)
Not to be confused with the internal "fragment hierarchy", which is not exposed to the end user
Currently using the … element
  NWP Outputs > ECMWF > 40 years reanalysis
The same product can appear in several locations of the directory
  Observations > By Type > Profile > Temp Land
  Observations > By Region > Asia > China
Usage should be recommended by WMO
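
Topic paths of the form shown above are enough to build a browsable tree. The sketch below splits each path on `>` and files the product under every path it declares. The function name, the `_products` key and the product labels are hypothetical; only the path strings come from the slide.

```python
def build_directory(entries):
    """Build a nested dict of topics; each node lists the products filed there."""
    tree = {}
    for product, paths in entries.items():
        for path in paths:
            node = tree
            for topic in (part.strip() for part in path.split(">")):
                node = node.setdefault(topic, {})
            node.setdefault("_products", []).append(product)
    return tree


# Product labels are made up; a product may declare several directory paths.
entries = {
    "ERA-40 surface fields": ["NWP Outputs > ECMWF > 40 years reanalysis"],
    "Temp land, China": [
        "Observations > By Type > Profile > Temp Land",
        "Observations > By Region > Asia > China",
    ],
}

directory = build_directory(entries)
print(sorted(directory))
print(sorted(directory["Observations"]))
```

The approach only works if sites agree on the topic names, which is why the slide asks for a WMO recommendation on usage.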

Conclusion
The approach taken in the V-GISC should help us support the large variety of XML documents
Nevertheless, the standard is too flexible
  - A lot of programming is required to support all possible variations
The WMO must provide "best practices" guidelines
  - How to encode a point in time, how to encode ranges, …
A topic hierarchy must be defined to create the directory
WMO core metadata need only contain sufficient information for discovery
  - The rest can be implemented as a series of local extensions, as long as they are not exported or exchanged