EMODNET Chemistry 2 Semantic Suggestions Roy Lowry and Adam Leadbetter British Oceanographic Data Centre.

Semantic Issues
Parameter semantic issues encountered during the pilot:
– Naming of the aggregated products
– Inability to aggregate across multiple P01 codes
– Difficulty mapping local parameter vocabularies to P01
– P01 scalability issues
– Inability to discover a specified contaminant

Aggregation Naming
Problem
– During the pilot a lot of (circular) traffic concerned the labelling of aggregated parameters
Solution
– Naming needs to be governed
– Governance decisions need to be implemented as a controlled vocabulary

P01 Aggregation Issues
Problem
– Aggregation tools create an aggregated parameter for every P01 code in the source dataset
– Different P01 codes are used for parameters that are not significantly different (or not different at all)
– The fixes for this (retagging source data or merging channels in the aggregation tool) are both labour intensive and error prone

P01 Aggregation Issues
Solution
– Define each aggregation as a set of P01 codes
– Store and serve the resultant mapping in the NERC Vocabulary Server
– Update aggregation tools to access the mapping and use it to dynamically merge channels with different P01 codes
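
The merging step could look something like the sketch below. It assumes the mapping has already been retrieved from the vocabulary server and is held as a plain dictionary. The aggregation labels, the P01 code strings and the function name `merge_channels` are hypothetical placeholders for illustration, not real vocabulary entries or part of any existing aggregation tool.

```python
from collections import defaultdict

# Hypothetical mapping: aggregated parameter label -> set of P01 codes.
# The code strings are placeholders, not real P01 entries.
AGGREGATION_MAP = {
    "aggregated_nitrate": {"P01_CODE_A", "P01_CODE_B"},
    "aggregated_phosphate": {"P01_CODE_C"},
}


def merge_channels(channels, aggregation_map):
    """Merge data channels whose P01 codes belong to the same aggregation.

    channels: dict of P01 code -> list of values.
    Codes absent from the mapping are kept under their own code.
    """
    # Invert the mapping so each P01 code points at its aggregation label.
    code_to_label = {
        code: label
        for label, codes in aggregation_map.items()
        for code in codes
    }
    merged = defaultdict(list)
    for code, values in channels.items():
        merged[code_to_label.get(code, code)].extend(values)
    return dict(merged)


source = {
    "P01_CODE_A": [1.0, 2.0],
    "P01_CODE_B": [3.0],
    "P01_CODE_X": [9.0],  # not part of any aggregation
}
result = merge_channels(source, AGGREGATION_MAP)
```

Because the mapping is data rather than logic, a governance decision to add a P01 code to an aggregation would change only the served mapping, not the tool.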

P01 Mapping Difficulties
Problem
– There are a lot of codes in P01 (>28000)
– Finding the code needed for a given local parameter vocabulary term seems to cause a lot of difficulty
– Text generated from a semantic model isn’t always intuitive (e.g. [dissolved plus reactive particulate phase] = ‘unfiltered’)

P01 Mapping Difficulties
Solutions
– Mapping based on the semantic model (matrix, substance, taxon, gender, organ) rather than the preferred label text
– Improvements to the search algorithm in the client (e.g. addition of an ‘excluding’ clause)
– Exposure of P01 subsets through NVS2 concept schemes (thesauri)
– Training in how to map
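
The ‘excluding’ clause can be illustrated with a small sketch. It assumes case-insensitive substring matching, which may differ from how the actual search client works. The function name `search_labels` and the example labels are invented for illustration and are not real P01 preferred labels.

```python
def search_labels(labels, including, excluding=()):
    """Return labels containing every 'including' term and no 'excluding' term.

    Matching is case-insensitive substring matching (an assumption of this
    sketch; a real client may tokenise or rank results differently).
    """
    hits = []
    for label in labels:
        text = label.lower()
        has_all = all(term.lower() in text for term in including)
        has_excluded = any(term.lower() in text for term in excluding)
        if has_all and not has_excluded:
            hits.append(label)
    return hits


# Hypothetical labels, written for illustration only.
labels = [
    "Concentration of cadmium in the water column",
    "Concentration of cadmium in sediment",
    "Concentration of lead in sediment",
]
hits = search_labels(labels, including=["cadmium"], excluding=["sediment"])
```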

P01 Scalability Issues
Problem
– Many contaminants in many different biological entities = a number of P01 codes that is predicted to be unmanageable
Solution (not favoured)
– Redesign formats to use the discrete semantic model rather than a P01 code
  – Different formats for different data types
  – Moves complexity from the semantic domain into the data files

P01 Scalability Issues
Solution (preferred)
– Retain P01 as a register of semantic element combinations
– Automate concept registration (part of a semantic model-based mapping tool perhaps)
– Use NVS V2 concept schemes to expose P01 subsets to make navigation easier
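
One way to picture a register of semantic element combinations with automated registration is the toy sketch below. The class names, the choice of fields and the `TMP…` code scheme are all assumptions made for this illustration and do not describe how P01 codes are actually assigned.

```python
from typing import NamedTuple, Optional


class SemanticElements(NamedTuple):
    """One combination of semantic model elements (a simplified subset)."""
    matrix: str
    substance: str
    taxon: Optional[str] = None
    organ: Optional[str] = None


class ConceptRegister:
    """Toy register mapping element combinations to codes."""

    def __init__(self):
        self._codes = {}

    def lookup_or_register(self, elements):
        """Return the existing code for a combination, or mint a new one."""
        if elements not in self._codes:
            # Placeholder code scheme, invented for this sketch.
            self._codes[elements] = f"TMP{len(self._codes) + 1:05d}"
        return self._codes[elements]


register = ConceptRegister()
first = register.lookup_or_register(
    SemanticElements("biota", "cadmium", taxon="Mytilus edulis", organ="flesh")
)
again = register.lookup_or_register(
    SemanticElements("biota", "cadmium", taxon="Mytilus edulis", organ="flesh")
)
other = register.lookup_or_register(SemanticElements("sediment", "cadmium"))
```

Here the same combination always resolves to the same code, and a code is created only when a combination is first requested, so the register grows with the data actually submitted rather than with every possible combination.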

Contaminant Discovery Issues
Problem
– Parameter discovery (CDI interface) is based on P02
– P02 groups contaminants with variable granularity
  – Good for PCBs
  – Not so good for ‘other organic contaminants’
– A search for datasets with cadmium in Mytilus edulis flesh isn’t possible
– The nearest is metals in biota, which will give many unwanted hits

Contaminant Discovery Issues
Possible Solution
– Mine the P01 codes in the SeaDataNet file stock into the CDI metadatabase
– Use these for drill-down parameter discovery in the CDI search engine
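
The mining idea amounts to building an inverted index from P01 code to dataset, as in the sketch below. The dataset identifiers, the code strings and the function name `build_p01_index` are hypothetical, and the sketch says nothing about how the CDI metadatabase is actually structured.

```python
from collections import defaultdict

# Hypothetical file stock: dataset identifier -> P01 codes found in its files.
# Identifiers and codes are placeholders, not real entries.
file_stock = {
    "dataset_1": ["P01_CD_MYTILUS_FLESH", "P01_PB_SEDIMENT"],
    "dataset_2": ["P01_PB_SEDIMENT"],
    "dataset_3": ["P01_CD_MYTILUS_FLESH"],
}


def build_p01_index(stock):
    """Build an inverted index: P01 code -> set of dataset identifiers."""
    index = defaultdict(set)
    for dataset_id, codes in stock.items():
        for code in codes:
            index[code].add(dataset_id)
    return dict(index)


index = build_p01_index(file_stock)
# Drill-down: datasets carrying one specific P01 code.
matches = sorted(index.get("P01_CD_MYTILUS_FLESH", set()))
```

A search at this level returns only the datasets carrying the specific code, instead of every dataset in a broad P02 group.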

Taking This Forward
– Some of the solutions presented are ODIP pilot candidates
– Specifications of these are currently vague
– It is not absolutely clear who should be doing what and when
– Proposal: a meeting (Liverpool, or London if easier) to develop the specifications and an implementation roadmap