Page 1 A Proposal for a Reference Implementation of the WMO Core Metadata Profile based on ISO-19115/19139, 3rd May 2006, by Jeremy Tandy, UK Met Office; Jürgen Seib, Deutscher Wetterdienst; Michael Burek, National Center for Atmospheric Research
Page 2 A revised WMO Core Metadata Profile. WMO Core Metadata Profile … ISO 19115 compliant? (Note also that the standard itself has undergone some revision: ISO 19115:2003/Cor. 1:2006.) Motivation: version 0.2 is NOT compliant with ISO 19115; the XML schemas of version 0.2 are bespoke; a UML model for the WMO extensions does NOT exist.
Page 3 ISO 19115 compliance. WMO Core Profile v0.2 breaks the extension rules defined in ISO 19115 Annex C; the WMO encoding does not match ISO 19115; changes to ISO 19115 (extensions) are not in a WMO-governed namespace. Examples: cardinality restricted; optional attributes removed; … and much worse … (DQ_DataQuality). Further issues were noted in the WMO IPET-MI report prepared by Clemens Portele.
Page 4 A revised WMO Core Metadata Profile Characteristics of WMO metadata documents simple human readable (optionally) self-contained; i.e. the metadata record *should* be able to be expressed as a single document multilingual XML-validity should imply semantic correctness ISO 19115 compliant community profile
Page 5 ISO 19115 - Extensions How do we create an ISO 19115 compliant metadata profile? Guidance: ISO 19115 Annex C: Metadata extensions and profiles; ISO 19115 Annex F: Metadata extension methodology. In order to ensure interoperability beyond the community where the extensions are implemented: a) you must document the extension via the extension metadata described in ISO 19115, and b) add an isoType attribute to the class indicating which ISO 19115 class was sub-classed
Page 6 ISO 19115 Clause C.2: Types of extensions adding a new metadata section creating a new metadata code list to replace the domain of an existing metadata element that has free text listed as its domain value creating new metadata code list elements (expanding a code list) adding a new metadata element adding a new metadata entity imposing a more stringent obligation on an existing metadata element imposing a more restrictive domain on an existing metadata element
Page 7 ISO 19115 Clause C.4: Rules for creating an extension the name, definition or data type of an existing element can not be changed a new element may include extended and existing metadata elements as components metadata elements can be more stringent domains can be more restrictive the use of domain values can be restricted code lists of type «CodeList» can be expanded an extension shall not permit anything not allowed by the standard
Page 8 Rules for creating a profile check registered profiles adhere to the rules for defining an extension A profile shall include: the core metadata all mandatory metadata elements all conditional metadata elements, if the dataset meets the condition use UML diagrams to describe a profile use your own namespace publish the profile
Page 9 WMO namespace. namespace: a collection of names, identified by a URI reference; needed for extensions. XML schemas of WMO extensions should reside in the http://www.wmo.int/metadata/2006 namespace: xmlns:wmo="http://www.wmo.int/metadata/2006"
Page 10 So where do we go from WMO Core v0.2? 1. From the extension rules it's clear that WMO Core v0.2 has problems. 2. Furthermore, even *before* we extend ISO 19115 we need to identify an XML encoding of the content model. ISO 19115 is an abstract specification … it specifies what information is required but not how to encode it. We developed our own XML encoding of ISO 19115 – as did many others. Interoperability is impaired by having numerous inconsistent encodings of ISO 19115 …
Page 11 The ISO standard 19139 XML schema implementation for ISO 19115 provides a common specification for describing, validating and exchanging metadata defines implementation guidelines for general-purpose metadata includes XML schema implementations of other ISO 191xx series (including ISO 19136 / GML) according to ISO 19118 Geographic Information – Encoding fulfils 99% of the needs for a WMO metadata standard
Page 12 WMO metadata profile – key elements Whilst ISO 19139 resolves a large proportion (99%?) of metadata-related concerns from the WMO community … there are still a number of outstanding elements worthy of discussion: 1. Time 2. Internationalisation / multilingual support 3. Codelists and keywords 4. Service metadata 5. Catalogues – incl. Feature Catalogues
Page 13 The future? Whilst it is recommended that we adopt ISO 19139 as the default encoding for WMO Core metadata … We must ensure that WMO Core is fit for purpose within the WMO community – where the existing ISO standard information models and encodings are not appropriate we should *adapt* them to our needs (e.g. ISO 19111 and parametric spatial referencing systems) The ISO standards are still in flux; so long as we *engage* with ISO/TC 211 we can adapt the standards to suit WMO This profile is a *big* step forward, but it should not be considered a final version … it will continue to evolve
Page 14 ISO 19118 Geographic Information - Encoding ISO 19118 Geographic information – Encoding specifies rules for encoding geographic information … i.e. converting from the UML model to XML schema ISO 19139 Clause 8 presents a good summary in relation to encoding the ISO 19115 information model
Page 15 ISO 19139 – Encoding summary (1) Each UML class is encoded into 3 XML constructs: 1. XML Class Type (XCT), describing the content (attributes) of the UML class; 2. XML Class Global Element (XCGE), ensuring the class has global visibility within the XML schema for import etc.; 3. XML Class Property Type (XCPT): containment of a class is managed through the XML Class Property Type of its data type, enabling both by-value and by-reference implementations of content. Note: when an XCT is of xs:simpleType, by-reference is not permitted.
Page 16 ISO 19139 – Encoding summary (2) Naming conventions: UML class » Class1; XCT » Class1_Type; XCGE » Class1; XCPT » Class1_PropertyType. Special case encodings: abstract classes; inheritance and sub-classes (note: to achieve interoperability only extension sub-classing, i.e. adding attributes, is permitted – restriction and multiple inheritance are not allowed); enumerations; codelists; unions
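The naming conventions on this slide are mechanical, so they can be sketched in a few lines of Python. This is only an illustration of the pattern stated above (Class1, Class1_Type, Class1_PropertyType); the function name `iso19139_names` is made up for this example, and the special cases listed on the slide (codelists, enumerations, unions) are not handled.

```python
def iso19139_names(uml_class):
    """Derive the three XML construct names for a plain UML class name.

    Follows the convention quoted on the slide; special-case encodings
    (codelists, enumerations, unions) are deliberately out of scope.
    """
    return {
        "XCT": f"{uml_class}_Type",           # XML Class Type
        "XCGE": uml_class,                    # XML Class Global Element
        "XCPT": f"{uml_class}_PropertyType",  # XML Class Property Type
    }


names = iso19139_names("CI_ResponsibleParty")
print(names["XCT"], names["XCGE"], names["XCPT"])
```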
Page 17 Polymorphism and substitution groups (1) Polymorphism: the ability to assume different forms Example: [CI_ResponsibleParty] » individualName [CharacterString] could be specialized such that name is compartmentalized into first and last names Polymorphism provides communities with a mechanism to better refine metadata to meet organizational need individualName is extended within a community namespace … whilst still utilizing ISO 19139 schemas (gmd namespace), and still providing usable & understandable instance documents In OO, the specialized class is a type of the general class; e.g. a dog is a type of animal
Page 18 Polymorphism and substitution groups (2) The "is a" relationship implies semantic consistency & substitutability. ISO 19118 *only* permits simple (extension-only) sub-classing / specialization. ISO 19139 allows polymorphism primarily through the _PropertyType encodings; e.g. gmd:PT_FreeText_PropertyType is substituted for the more general gco:CharacterString_PropertyType for multi-lingual support. Need to inform the XML parser of the substitution: a) via a substitutionGroup directive in the schema, OR b) via an xsi:type directive in the instance document
Page 19 GML pattern: by-value or by-reference (1) GML properties are defined such that the content can be referenced EITHER by-value within the scope of the containing XML element, OR by-ref (xlink) to an instance of the content residing elsewhere; either within the document or external
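Page 20 GML pattern: by-value or by-reference (2) Example from an observations & measures GML instance, annotated to show one property expressed by-reference and one by-value (XML listing not preserved in this transcript; only the value 8903 survives)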
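Since the original listing is lost, the by-value / by-reference pattern can be illustrated with a small Python sketch. Everything in the sample document is hypothetical: the `ex` namespace, the element names and the station identifiers are invented for illustration and are not taken from the original slide. Only the `xlink:href` attribute and the idea of a `gml:id` target come from the pattern described above. The resolver handles references within the same document only.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
GML = "http://www.opengis.net/gml"
EX = "http://example.org/ex"  # hypothetical application namespace

DOC = f"""
<ex:Collection xmlns:ex="{EX}" xmlns:gml="{GML}" xmlns:xlink="{XLINK}">
  <ex:Station gml:id="stationA"><ex:name>A1</ex:name></ex:Station>
  <ex:Observation>
    <ex:station xlink:href="#stationA"/>
  </ex:Observation>
  <ex:Observation>
    <ex:station>
      <ex:Station gml:id="stationB"><ex:name>B2</ex:name></ex:Station>
    </ex:station>
  </ex:Observation>
</ex:Collection>
"""


def resolve(prop, root):
    """Return the content of a property, whether given by value or by reference."""
    href = prop.get(f"{{{XLINK}}}href")
    if href is None:
        # by-value: the content is the first child of the property element
        return prop[0] if len(prop) else None
    if href.startswith("#"):
        # by-reference within this document: look for a matching gml:id
        target = href[1:]
        for element in root.iter():
            if element.get(f"{{{GML}}}id") == target:
                return element
        return None
    # external references need a retrieval step that is outside this sketch
    raise NotImplementedError("external xlink targets are not resolved here")


root = ET.fromstring(DOC)
for prop in root.iter(f"{{{EX}}}station"):
    station = resolve(prop, root)
    print(station.findtext(f"{{{EX}}}name"))
```

Note that XML validation would accept both observations without checking that `#stationA` actually exists; the lookup is an application-level step.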
Page 21 «xlink» semantics How should parsers interpret xlink? XML validation will *only* check the grammar of the xlink statement xlink has deferred binding semantics …
Page 23 Interoperability outside WMO community Document without WMO extensions Document with WMO extensions XSL Transformation Users outside WMO community will (probably) NOT understand WMO extensions … Need to remove them???
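The slide proposes an XSL transformation for this step. As a rough stand-in (Python rather than XSLT, and not the authors' stylesheet), the sketch below removes every element in the WMO namespace from a record. The element `wmo:hypotheticalExtension` is invented for the example; only the two namespace URIs are real. It also assumes extensions appear as additional elements; an extension that substitutes for an ISO element would need mapping back to its ISO parent type instead of plain removal.

```python
import xml.etree.ElementTree as ET

WMO_NS = "http://www.wmo.int/metadata/2006"
GMD_NS = "http://www.isotc211.org/2005/gmd"


def strip_wmo_extensions(root):
    """Remove, in place, every descendant element in the WMO namespace."""
    prefix = f"{{{WMO_NS}}}"
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.startswith(prefix):
                parent.remove(child)
    return root


record = ET.fromstring(
    f'<gmd:MD_Metadata xmlns:gmd="{GMD_NS}" xmlns:wmo="{WMO_NS}">'
    "<gmd:language/>"
    "<wmo:hypotheticalExtension>community-only content</wmo:hypotheticalExtension>"
    "</gmd:MD_Metadata>"
)
strip_wmo_extensions(record)
print([child.tag for child in record])
```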
Page 24 Open Issues Worked examples & implementation test-bed Finalize extensions which are required? Time, ServiceIdentification etc. encode as ISO 19118-compliant XML schema ISO 19115 / 19139 have a multitude of options for describing information … which makes writing parsing applications complex do we need to *restrict* WMO Core metadata? ISO/TC 211 & OGC community peer review
Page 26 Others: irregular point sets. Exists in WMO Metadata Profile v0.2; can be used to describe the geographic locations of a set of stations; never needed if station catalogue files exist. Remove IrregularPointSet_Type from WMO Metadata Profile
Page 57 Implementing codelists, 3rd May 2006, by Jeremy Tandy, UK Met Office; Jürgen Seib, Deutscher Wetterdienst
Page 58 Codelists A permissible extension of ISO 19115 is that free text content can be restricted by the use of a codelist Furthermore, existing codelists can be extended by adding more terms to the list Note that this is NOT possible with enumerations – an enumeration is NOT extensible An «Enumeration» is implemented as an XML schema enumeration list A «Codelist» is implemented as a codelist catalogue file WMO Core profile v0.2 identified a small number of *new* code lists
Page 59 ISO/TS 19139:2005(E) – Clause 8.5.5 Codelist encodings (1) A class that is stereotyped CodeList is an enumerated type like a class that is stereotyped Enumeration. The difference is that the «CodeList» class is extensible. All «CodeList» classes defined in ISO 19115, Annex B, are described in tables with three columns: Name, DomainCode and Definition. Codelists and their associated definitions are controlled in a register. ISO 19115, Annex B contains the content of a register or registers that could be created that is well understood by software.
Page 60 ISO/TS 19139:2005(E) – Clause 8.5.5 Codelist encodings (2) The last point is the crucial *implementation* issue … Values of a codelist are NOT encoded in the schema (as is the case for an enumeration) Codelist values are stored in some external register Issue: XML validation CANNOT verify codelist entries
Page 61 Implementing code lists: alternative. Define all WMO code lists as type «enumeration». Issues: 1) an enumeration does not support multiple languages; 2) implementing codelists as enumerations is INFLEXIBLE … it does not permit extension (or local extension) … refer to complexity regarding BUFR; 3) XML schema gives SYNTACTIC interoperability via XML schema validation … we need SEMANTIC interoperability via an application-level parser
Page 62 ISO 19139 Codelist schema Where codelists are referenced from within a metadata record, the following XML type is used: Example instance values: codeList="http://www.tc211.org/ISO19139/resources/codeList.xml#CI_DateTypeCode" codeListValue="creation"
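To show how an application might read such a reference, here is a minimal Python sketch that parses an element carrying the two attributes quoted above and splits the codeList URI into the register location and the codelist name. The attribute values are the ones from the slide; the wrapping `gmd:CI_DateTypeCode` element with its text content is written from the usual ISO 19139 pattern and should be treated as an assumption, as should the helper name `read_code`.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

GMD = "http://www.isotc211.org/2005/gmd"

SNIPPET = (
    f'<gmd:CI_DateTypeCode xmlns:gmd="{GMD}" '
    'codeList="http://www.tc211.org/ISO19139/resources/codeList.xml#CI_DateTypeCode" '
    'codeListValue="creation">creation</gmd:CI_DateTypeCode>'
)


def read_code(element):
    """Return (register URL, codelist name, value) for a codelist element."""
    # The fragment after '#' names the codelist inside the register document.
    register_url, codelist_name = urldefrag(element.get("codeList"))
    return register_url, codelist_name, element.get("codeListValue")


print(read_code(ET.fromstring(SNIPPET)))
```

The schema only constrains these attributes syntactically; whether "creation" is a registered value has to be checked against the register by the application.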
Page 63 Codelist implementation example – CI_DateTypeCode (1) The XML global element corresponding to CI_DateTypeCode: The XML PropertyType corresponding to CI_DateTypeCode:
Page 64 Codelist implementation example – CI_DateTypeCode (2) 1993-01-01 publication … implementation of codelist registers will be discussed later …
Page 65 WMO metadata profile codelist extensions WMO Core profile v0.2 identified a small number of *new* code lists: WMO topic categories «enumeration» WMO keywords «thesaurus» WMO data frequency code «codelist» WMO member countries «codelist»
Page 66 Code lists: WMO topic categories MD_TopicCategoryCode far too restrictive … Create new WMO specialisation: WMO_CommunityTopicCategoryCode (as proposed in WMO Core v0.2)
Page 68 Example: WMO topic categories...... climatologyMeteorologyAtmosphere weatherObservation....... (This part *may* not be required)
Page 69 SIMDAT: metadata directory structure (1) SIMDAT (phase 1) employed WMO Core Metadata v0.2, but misused it to hold a directory structure for the metadata instance … Example: NWP Outputs > ECMWF > 40 years reanalysis. This implementation is NOT ISO 19115 compliant. Proposal: use MD_DataIdentification » supplementalInformation as a placeholder for the directory
Page 70 SIMDAT: metadata directory structure (2) MD_DataIdentification::supplementalInformation «gco:CharacterString_PropertyType» Question: leave as free text or formalize with a metadata extension?
Page 71 Code lists: WMO keywords enumeration list type WMO_KeywordCodeType exists in WMO Metadata Profile v0.2
Page 72 Example: WMO keywords...... temperature climate data.......
Page 73 Code lists: WMO keywords MD_KeywordTypeCode … WMO Core v0.2 proposed a new WMO keyword list: WMO_KeywordList (a VERY long list!) WMO Core v0.2 expressed this as an enumeration … Also put the actual keyword list *content* into the keyword *type* list (?)
Page 74 ISO 19139 keywords implementation. keyword … free text field. type … from the «MD_KeywordTypeCode» codelist; classifies the type of keyword; does this codelist need extension? thesaurusName … «CI_Citation»; identifies the thesaurus that describes the permissible content of the keyword list. thesaurusName [CI_Citation] » identifier [MD_Identifier] » code [CharacterString] can be used to express the URI of the thesaurus / codelist … allowing linkage to a URL or URN (alternative: substitute gmx:Anchor for gco:CharacterString to reference the thesaurus)
Page 75 WMO keyword – thesaurus implementation Keyword thesaurus content (i.e. the list of keywords and their meanings) can be implemented as a)URN linkage to existing WMO code manuals urn:x-wmo:wmoManual:306:synop ??? (note: an application will need some additional, pre-defined context to dereference the URN) b)URL linkage to online catalogue; encoded according to the codelist catalogue pattern (see later)
Page 76 Keyword examples climate data relative humidity theme … urn:x-wmo:wmoManual:306:synop … note: XML elements in GML can also be expressed by reference
Page 77 Code lists: frequency values MD_MaintenanceFrequencyCode too restrictive … WMO Core v0.2 proposed an extended list: WMO_DataFrequencyCode expressed as an enumeration in WMO Core v0.2
Page 78 WMO_DataFrequencyCode implementation Standard codelist implementation … Global XML element … XML PropertyType …
Page 79 WMO_DataFrequencyCode example … daily...
Page 80 Frequency values – alternative implementation. Type WMO_DataFrequencyCodeType exists in WMO Metadata Profile v0.2; remove this enumeration type from the profile and use the xs:duration data type to specify frequency values. A value of xs:duration is specified in the form "PnYnMnDTnHnMnS". Examples: PT3H: period of 3 hours; P5Y2M10D: period of 5 years, 2 months, and 10 days.
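As a sketch of what an application-level parser for these frequency values could look like, the Python below splits a duration in the PnYnMnDTnHnMnS form into its components. It is a simplification: negative durations are not supported, and the function name `parse_duration` is chosen for this example only.

```python
import re

# PnYnMnDTnHnMnS, every component optional, at least one required.
_DURATION = re.compile(
    r"P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?"
)


def parse_duration(text):
    """Split a non-negative xs:duration string into its stated components."""
    match = _DURATION.fullmatch(text)
    # A trailing 'T' with no time component is not a valid duration.
    if match is None or text.endswith("T"):
        raise ValueError(f"not a supported xs:duration: {text!r}")
    parts = {
        name: (float(value) if name == "seconds" else int(value))
        for name, value in match.groupdict().items()
        if value is not None
    }
    if not parts:  # a bare "P" has no components at all
        raise ValueError(f"not a supported xs:duration: {text!r}")
    return parts


print(parse_duration("PT3H"))
print(parse_duration("P5Y2M10D"))
```

The "M" designator means months before the "T" separator and minutes after it, which is why the two are captured as separate groups.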
Page 81 Code lists: WMO member countries The organisation responsible for various aspects of the metadata instance is defined thus: [CI_Citation] » citedResponsibleParty [CI_ResponsibleParty] » organisationName [CharacterString] WMO profile needs to formalise list of responsible organisations as the member countries of WMO WMO_WMO-MemberCountryCode
Page 82 WMO_WMO-MemberCountryCode implementation Standard codelist implementation … Global XML element … XML PropertyType …
Page 83 WMO_WMO-MemberCountryCode example … Deutscher Wetterdienst...
Page 84 Member country list – alternative implementation does not exist in WMO Metadata Profile v0.2 should be implemented as an XML enumeration type.......
Page 85 Member country list – alternative implementation...... Germany.......
Page 87 Service metadata Encoding in ISO 19119 metadata May 2006
Page 88 Service metadata WMO Core Profile key concern is describing *datasets* … Implication: MD_DataIdentification is sufficient? This is NOT the case … Where a dataset is described (i.e. via a WMO Core metadata record) you need to be able to identify the *service* by which you can *access* the data … Example: when a GISC harvests metadata records from a DCPC (refer to plans for CBS Workshop demonstration Nov 2006)
Page 90 ISO 19119:2005 Geographic information - Services Service metadata is described in ISO 19119 MD_ServiceIdentification is replaced with SV_ServiceIdentification The ISO standard discusses much more than simple *access* services … e.g. defining metadata constructs that can be used to describe service chaining Alternative service description standards (including WSDL) were deemed incomplete and insufficient
Page 93 ISO 19119:2005 – implementation (1) ISO 19119 seems to meet requirements. But … 1. there is no ISO-ratified encoding (equivalent to ISO 19139), although an incomplete (?) ISO 19118 compliant XML encoding exists from the OGC CSW initiative; 2. there does not appear to be a placeholder to reference pre-existing WSDL definitions; most implementation examples I have seen (OGC service offerings) have a WSDL definition in *parallel* to an ISO 19119 definition
Page 94 ISO 19119:2005 – implementation (2) Best practice for WMO Core metadata? Bind the service definition to the dataset via the operatesOn relationship (via xlink reference) Set SV_CouplingType as mixed Alternative data access service binding? [MD_Distribution] » transferOptions [MD_DigitalTransferOptions] » onLine [CI_OnlineResource] Extend CI_OnlineResource to hold SV_ServiceIdentification, wsdl:definition & default values / predefined queries?
Page 95 ISO 19119:2005 – implementation (3) Incorporating WSDL service definitions? Include *both* SV_ServiceIdentification and wsdl:definitions elements … … … … … … … …
Page 96 Default parameters / parameter ranges SIMDAT (phase 1) implemented bespoke service binding constructs ISO 19119 provides alternative implementation … Except for a mechanism for identifying default (fixed) values & ranges of values Extend ISO standard information model to cater for default values or range lists? int.ecmwf.mars e4 sfc marser 2/2 1980-01-01 1990-12-31 0000 1200
Page 98 Catalogues and registries, 3rd May 2006, by Jeremy Tandy, UK Met Office; Jürgen Seib, Deutscher Wetterdienst
Page 99 Feature catalogues Current WMO Core best practice: dataset content identified via keywords Alternative: define content model as GML feature … a feature type UK Met Office undertaking to develop GML feature types for meteorological information (http://www.metoffice.gov.uk/informatics) Feature types are registered in a feature catalogue
Page 100 Feature catalogue - implementation Best practice implementation for feature catalogues is in flux ISO standards: ISO 19110: Geographic information – Methodology for feature cataloguing (new work item) ISO 19126: Geographic information – Profiles for feature data dictionary registers and feature catalogue registers ISO 19135 Geographic Information – Procedures for registration of items of geographic information Open Geospatial Consortium initiatives: MOTIIVE; INSPIRE-related project aiming to develop best practice methodology for implementing feature catalogues within registries
Page 101 Supporting metadata A key objective of IPET-MI is to define the schemas for WMO Core metadata … However, a number of resources are *external* to the metadata instances … Coordinate Reference System definitions Unit of Measure definitions Phenomena (parameter) definitions Codelists Keyword thesauri Feature catalogues Gazetteers Station lists
Page 102 Registries and catalogues (1) These supporting metadata artefacts are registered in a *catalogue* or *registry* Definition: a registry is a catalogue with *governance* How do you implement a catalogue? Simple: create an XML instance document containing the registered definitions & reference the content via hyperlinks Issue: brittle implementation – relies on fixed URLs … susceptible to broken links
Page 103 Registries and catalogues (2) Feature rich: import the content into a *repository* and use a sophisticated registry to index the definitions, abstracting the *real* location of the documents, enabling the expression of rich content models and the traversal of associations between resources. Issue: complex implementation, limited reference implementations. UK Met Office have undertaken to develop a registry-repository implementation based on OGC CSW and ebRIM 3.0; planned demonstration at CBS Workshop, Nov 2006
Page 104 ISO 19139 catalogue implementation 3 concrete catalogue classes: 1.Coordinate Reference System 2.Unit of Measure 3.Codelist
Page 106 Codelist catalogue – XML encoding Also a multi-lingual implementation available from ISO 19139
Page 107 Codelist catalogue: example instance (1) (http://www.isotc211.org/2005/resources/gmxCodelists.xml) gmxCodelists Codelists for description of metadata datasets compliant with ISO/TC 211 19115:2003 and 19139 GMX (and imported) namespace 0.0 2005-03-18 … code list items …
Page 108 Codelist catalogue: example instance (2) identification of when a given event occurred CI_DateTypeCode date identifies when the resource was brought into existence creation … more codeEntry elements …
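Because XML validation cannot verify codelist entries, an application has to load the catalogue and check values itself. The Python sketch below does that for a small inline catalogue. The two descriptions and the names CI_DateTypeCode and creation come from the slide; the element names (CT_CodelistCatalogue, codelistItem, CodeListDictionary, codeEntry, CodeDefinition, gml:name) and the namespace URIs are assumptions based on the gmx catalogue pattern and may differ from the 2005 draft schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions (gmx from ISO 19139, GML 3.1-era gml).
NS = {
    "gmx": "http://www.isotc211.org/2005/gmx",
    "gml": "http://www.opengis.net/gml",
}

CATALOGUE = """
<gmx:CT_CodelistCatalogue xmlns:gmx="http://www.isotc211.org/2005/gmx"
                          xmlns:gml="http://www.opengis.net/gml">
  <gmx:codelistItem>
    <gmx:CodeListDictionary gml:id="CI_DateTypeCode">
      <gml:description>identification of when a given event occurred</gml:description>
      <gml:name>CI_DateTypeCode</gml:name>
      <gmx:codeEntry>
        <gmx:CodeDefinition gml:id="CI_DateTypeCode_creation">
          <gml:description>date identifies when the resource was brought into existence</gml:description>
          <gml:name>creation</gml:name>
        </gmx:CodeDefinition>
      </gmx:codeEntry>
    </gmx:CodeListDictionary>
  </gmx:codelistItem>
</gmx:CT_CodelistCatalogue>
"""


def load_codelists(xml_text):
    """Map each codelist name to the set of values registered for it."""
    root = ET.fromstring(xml_text)
    registers = {}
    for dictionary in root.iterfind(".//gmx:CodeListDictionary", NS):
        list_name = dictionary.findtext("gml:name", namespaces=NS)
        registers[list_name] = {
            entry.findtext("gml:name", namespaces=NS)
            for entry in dictionary.iterfind("gmx:codeEntry/gmx:CodeDefinition", NS)
        }
    return registers


def is_registered(registers, codelist, value):
    """Semantic check that schema validation alone cannot perform."""
    return value in registers.get(codelist, set())


registers = load_codelists(CATALOGUE)
print(is_registered(registers, "CI_DateTypeCode", "creation"))
print(is_registered(registers, "CI_DateTypeCode", "not-a-code"))
```

In practice the catalogue would be fetched from the URL in the codeList attribute; that retrieval step is left out so the sketch stays self-contained.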
Page 112 Volume A: observing stations hierarchy: WMO region country station name station index number latitude and longitude elevation in metres the elevation of the station (HP) the elevation of the ground (H) or the official altitude of the aerodrome (HA) pressure level in hPa observation times remarks
Page 117 Example: meteorological parameter monthly mean of daily minimum of 10 minutes air temperature observations PT10M air temperature degree C mean P1M minimum P1D
Page 118 ISO 19112:2003 - Geographic information Spatial referencing by geographic identifiers
Page 119 Gazetteer implementation – OGC 05-035 (1) Within the OGC community there is a growing interest in the development of a common feature-based model for access to named features, often referred to as a gazetteer. OGC 05-035 aims to implement a Gazetteer Service as a profile of the OGC Web Feature Service. The Gazetteer Service is a specialized profile of a Web Feature Service that specifies a minimum set of FeatureTypes and operations required to support an instance of a gazetteer service
Page 120 Gazetteer implementation – OGC 05-035 (2) By using the capabilities of a Web Feature Server, the Gazetteer Service as proposed here exposes the following interfaces to query location instances in a gazetteer database: Get or Query features based on thesaurus-specific properties (broader term (BT), narrower term (NT), related term (RT)); Retrieve properties of the gazetteer database, such as the location type class definitions and the spatial reference system definitions
Page 121 Gazetteer implementation – interim. The Gazetteer Service (WFS-G) initiative has developed a set of ISO 19118-compliant XML encodings for the ISO 19112 information model: SI_Gazetteer «FeatureType», SI_LocationInstance «FeatureType», SI_LocationType «FeatureType» … Recommendation: liaise with OGC regarding development of WFS-G. Interim: use the ISO 19112 schema definitions to create an XML instance document for the gazetteer definitions & reference these via xlink from metadata instances. Simple implementation; poor functionality (serving ONLY as a list)