The BODC Parameter Markup and Usage Vocabulary Semantic Model Roy Lowry British Oceanographic Data Centre GO-ESSP Meeting, RAL, June 2005.

Presentation Outline
- Parameter codes and their metadata load
- EnParDis Project
- BODC PMUV Semantic Model
- Issues: Synonyms and Tooling
- Points to Ponder

What is a Parameter Code?
- According to the original oceanographic data standard (GF3), a parameter code is a key attached to a data value that:
  - Specifies:
    - What was measured
    - How it was measured
    - Actual (not canonical) units of measurement
  - Is defined through the attributes of a parameter dictionary
  - Includes semantics (e.g. TEMP7RTD, TEMP7STD)
- The data models of IODE data centres were strongly influenced by GF3 concepts (even if the format was little used)
- Parameter codes are therefore endemic in legacy oceanographic data

What is a Parameter Code?
- Parameter codes have been grossly abused by the oceanographic data management community
- Codes were mapped to free-text fields, causing:
  - Compromised semantic purity (e.g. vague spatio-temporal co-ordinates like ‘sea-surface’ introduced)
  - Incomplete or ambiguous specifications. Sample GF3 parameters:
    - Sea temperature (estuaries?)
    - Potential temperature (of what?)
    - Potential air temperature (OK!)
    - Wet bulb temperature (could be in water!)
  - Metadata overload (e.g. taxon names included in the parameter, i.e. what was measured, descriptions)
  - Random scatterings of synonyms
  - Parameter semantics in unit definitions (e.g. per gram dry weight)

What is a Parameter Code?
- BODC joined in this abuse through a Parameter Dictionary following the GF3 model
- The BODC Parameter Dictionary originally mapped each code to:
  - Two plain-text fields: what was measured (parameter) and how it was measured (parameter subgroup)
  - Units specification
  - Valid data range
  - Formatting information
  - Abbreviated description label
- As chemistry and biology were added, the plain-text fields evolved into a total mess
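The original dictionary layout described above can be pictured as one flat record per code. The following is only a minimal sketch: the class name, field names, the code `ZZTEMP01` and all of its values are hypothetical and are not taken from the BODC Parameter Dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """One row of a GF3-style parameter dictionary (hypothetical layout)."""
    code: str        # parameter code attached to each data value
    parameter: str   # plain text: what was measured
    subgroup: str    # plain text: how it was measured
    units: str       # units specification
    valid_min: float # lower bound of the valid data range
    valid_max: float # upper bound of the valid data range
    fmt: str         # formatting information (a Python format spec here)
    label: str       # abbreviated description label

    def in_range(self, value: float) -> bool:
        """True if the value lies inside the valid data range."""
        return self.valid_min <= value <= self.valid_max

    def render(self, value: float) -> str:
        """Format a value using the entry's formatting information."""
        return format(value, self.fmt)


# Invented entry, for illustration only.
entry = DictionaryEntry(
    code="ZZTEMP01",
    parameter="Temperature of the water column",
    subgroup="CTD",
    units="degrees Celsius",
    valid_min=-2.5,
    valid_max=40.0,
    fmt=".3f",
    label="Temp",
)
```

Because `parameter` and `subgroup` are free text, nothing in such a record prevents the inconsistencies listed on the previous slide.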

Enabling Parameter Discovery (EnParDis)
- EnParDis was a one-off injection of NERC funding aiming to:
  - Integrate taxonomic knowledge into the BODC Parameter Dictionary
    - Include data on penguins in a query for ‘birds’
    - Based on ITIS, so it works provided the taxa are in ITIS
  - Totally overhaul the parameter plaintext descriptions
    - Standardise terms and syntax
    - Eliminate implied semantics
- Plaintext field overhaul achieved by using structured text (concatenated elements from a semantic model)
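The ‘penguins in a query for birds’ idea amounts to walking a taxonomic hierarchy upward from each parameter's taxon. A minimal sketch follows, assuming a small hand-written parent map in place of ITIS; the parameter codes `ZZPENG01` and `ZZMUSS01` and the function names are hypothetical.

```python
# Child -> parent links. A hand-written stand-in for ITIS; intermediate
# ranks between Mytilidae and Bivalvia are omitted for brevity.
PARENT = {
    "Aptenodytes": "Spheniscidae",
    "Spheniscidae": "Sphenisciformes",
    "Sphenisciformes": "Aves",
    "Mytilus": "Mytilidae",
    "Mytilidae": "Bivalvia",
}

# Hypothetical parameter codes, each tagged with the taxon it refers to.
PARAM_TAXON = {
    "ZZPENG01": "Aptenodytes",
    "ZZMUSS01": "Mytilus",
}


def lineage(taxon):
    """Return the taxon followed by all of its known ancestors."""
    chain = [taxon]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def codes_for(group):
    """Return the codes whose taxon is the group itself or one of its descendants."""
    return sorted(code for code, taxon in PARAM_TAXON.items()
                  if group in lineage(taxon))


print(codes_for("Aves"))
```

A query for the class Aves (birds) then picks up the penguin parameter even though the word "bird" never appears in its description.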

Semantic Model
- The Semantic Model maps each parameter code to a set of atomic metadata elements populated by entries from controlled vocabularies
- The parameter description is built by structured concatenation of the Semantic Model elements
- The Semantic Model forms a flexible interface between legacy systems based on parameter codes and/or semantically poor text descriptions and modern (meta)data content models
- The Parameter Dictionary becomes a registry of valid Semantic Model element combinations (insurance against the introduction of the green dog)
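The registry role described in the last bullet can be sketched as a two-stage check: each element must come from its controlled vocabulary, and the combination as a whole must be registered. The vocabularies, the registry contents and the function name below are placeholders chosen for illustration, not BODC content.

```python
# Illustrative controlled vocabularies; the terms are placeholders.
VOCAB = {
    "what": {"Temperature", "Colour"},
    "where": {"water column", "dog"},
    "how": {"CTD", "visual inspection"},
}

# Registry of element combinations accepted as meaningful.
REGISTRY = {("Temperature", "water column", "CTD")}


def check(what, where, how):
    """Return a list of problems; an empty list means the combination is registered."""
    problems = [
        f"{theme} term not in vocabulary: {term!r}"
        for theme, term in (("what", what), ("where", where), ("how", how))
        if term not in VOCAB[theme]
    ]
    # Individually valid terms can still form a meaningless combination.
    if not problems and (what, where, how) not in REGISTRY:
        problems.append("combination not registered")
    return problems


print(check("Colour", "dog", "visual inspection"))
```

In this sketch the ‘green dog’ case is a combination whose terms are all valid vocabulary entries but which has never been registered as a meaningful parameter.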

Semantic Model
- The parameter description is built up as three themes:
  - What theme: what was measured
  - Where theme: where it was measured (sphere, NOT spatio-temporal co-ordinates or their textual representation like sea surface)
  - How theme: how it was measured
- Example
  - Temperature (What)
  - of the water column (Where)
  - by CTD (How)
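The three-theme example above can be reproduced by simple concatenation. This is a minimal sketch; the class and method names are assumptions, and the real model builds each theme from finer-grained elements rather than from ready-made phrases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterDescription:
    """A description held as three themes (hypothetical structure)."""
    what: str        # what was measured
    where: str = ""  # sphere in which it was measured
    how: str = ""    # how it was measured

    def text(self) -> str:
        """Concatenate the non-empty themes in What, Where, How order."""
        return " ".join(part for part in (self.what, self.where, self.how) if part)


desc = ParameterDescription("Temperature", "of the water column", "by CTD")
print(desc.text())  # Temperature of the water column by CTD
```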

What Theme
- Parameter Entity of Measurand Entity (chemical, physical or biological) by Measurand Entity
- Examples
  - Clearance rate of Dinophyceae by Acartia
  - Concentration of nitrate+nitrite
  - Temperature
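The "X of Y by Z" pattern of the What theme can be sketched as a small builder with optional trailing parts. The function and argument names are assumptions, and treating the bare term "Temperature" as the leading element is a simplification of the model.

```python
from typing import Optional


def what_theme(parameter: str,
               measurand: Optional[str] = None,
               agent: Optional[str] = None) -> str:
    """Build a What-theme phrase: '<parameter> of <measurand> by <agent>'."""
    text = parameter
    if measurand:
        text += f" of {measurand}"
    if agent:
        text += f" by {agent}"
    return text


print(what_theme("Clearance rate", "Dinophyceae", "Acartia"))
print(what_theme("Concentration", "nitrate+nitrite"))
print(what_theme("Temperature"))
```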

Parameter Entity
- Three Semantic Elements
  - Parameter Name (example: concentration)
  - Parameter Statistic (example: standard deviation)
  - Parameter Subgroup (example: v/v)

Biological Entity
- Nine Semantic Elements
  - Taxon Name
  - ITIS code for taxon
  - Taxon size
  - Taxon gender
  - Taxon development stage
  - Taxon morphology (shape terms)
  - Taxon subcomponent (body parts)
  - Taxon colour
  - Taxon subgroup (subdivision ‘bucket’)

Chemical Entity
- Two Semantic Elements
  - Chemical name (example: carbon)
  - Chemical subgroup (example: organic)

Physical Entity
- Three Semantic Elements
  - Physical Name (examples: sea surface elevation, temperature)
  - Physical subgroup (example: IPTS-68)
  - Datum (example: Ordnance Datum Newlyn)

Where Theme
- Relationship and a Sphere Entity or Biological Entity
- Examples
  - Per unit volume of the water column
  - Of the atmosphere
  - Per unit wet weight of Mytilus edulis flesh
  - Per unit dry weight of sediment
  - Per unit area of the water column

Sphere Entity
- Four Semantic Elements
  - Sphere Name (example: sediment)
  - Sphere subgroup (example: <63um)
  - Sphere phase (examples: particulate, aerosol, gaseous, dissolved plus reactive particulate)
  - Sphere phase subgroup (examples: >GF/F, 2-20um)

How Theme
- Sample processing entity
  - Example: radiotracer inoculation and incubation in natural sunlight
- Analysis entity
  - Example: proportional counting
- Data processing entity
  - Example: conversion to carbon using unspecified algorithm

How Theme
- Sample Processing and Data Processing Entities are single entities
- Analysis Entity has two Semantic Elements
  - Analysis Description
  - Analysis Instance Discriminator (distinguishes multiple sensors)

Issues
- Synonyms
  - What synonyms do we need?
  - How do we store them?
  - How do we utilise them?
- Tooling
  - Tools for code assignment
  - Tools for dictionary expansion

Synonyms
- Synonyms required for:
  - What Theme
  - Parameter Name
  - Parameter Entity
  - Chemical Name
  - Chemical Entity
  - Physical Name
  - Physical Entity
  - Biological Entity
  - Taxon name
  - Parameter Description
  - Probably more as well

Synonyms
- The following information needs to be known for each synonym:
  - The Semantic Model entity type
  - The primary term
  - The secondary term
- Synonyms could be managed through a conventional relational schema, but RDF seems more attractive as there is more to relationships than ‘synonymous’
- Current thinking on synonym exposure is to produce multiple parameter descriptions for a single parameter code, incorporating all synonym combinations
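The ‘all synonym combinations’ idea can be sketched as a Cartesian product over the alternative terms for each element. The record layout follows the three fields listed above; the synonym pairs themselves, the template string and the function names are invented for illustration and are not BODC vocabulary content.

```python
from itertools import product
from typing import NamedTuple


class Synonym(NamedTuple):
    entity_type: str  # Semantic Model entity type
    primary: str      # primary term
    secondary: str    # secondary (synonymous) term


# Invented synonym records, for illustration only.
SYNONYMS = [
    Synonym("Physical Name", "Temperature", "Temp"),
    Synonym("Sphere Name", "water column", "sea water"),
]


def variants(entity_type, primary):
    """Return the primary term followed by its registered synonyms."""
    return [primary] + [s.secondary for s in SYNONYMS
                        if s.entity_type == entity_type and s.primary == primary]


def descriptions(elements, template):
    """Expand a template over every combination of term variants.

    elements: list of (entity_type, primary_term) pairs, in template order.
    template: a str.format template with one positional slot per element.
    """
    options = [variants(entity_type, term) for entity_type, term in elements]
    return [template.format(*combo) for combo in product(*options)]


all_descriptions = descriptions(
    [("Physical Name", "Temperature"), ("Sphere Name", "water column")],
    "{} of the {} by CTD",
)
for line in all_descriptions:
    print(line)
```

The number of descriptions grows multiplicatively with the number of synonyms per element, which is worth bearing in mind when deciding how synonyms are exposed.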

Tooling
- Web services to give access to code definitions, model elements, controlled vocabularies, mappings and synonyms
- Automated data markup tooling based on specification of the model's semantic elements
- Automated request mechanism for extending the dictionary population, with an efficient moderation mechanism
- Could be configured as a single tool sitting on a common set of services

Points to Ponder
- What is the mapping between components of the Semantic Model and a CF Standard Name?
- How can the CF Standard Name list and the BODC Data Markup vocabulary be integrated into a unified resource covering both the oceanographic and atmospheric domains?