Ontology Semantic Mediation in the Big Picture MMI Workshop - August 2005
Semantic Issues
  What is the meaning?
  How are the terms related?
  Data, Metadata
Information systems talk different languages
Community agreements Metadata Data EML ISOADL DCMI FGDC MARINE XML GML ADL NetCDF ASCII Content Content Protocol Protocol ESML OPeNDAP REST SOAP Z39.50 DFDL MMI Demo for Demo Agreement
What do we want to achieve?
  User searches for: category, label_of_source
  Result columns: Source type (Platform/Sensor/instrument/model), Lat, Long, Depth, Time (Z), Latest value (units), Link to Metadata and data
  Categories: Sea Temperature, Salinity, Nitrate, Phosphate, Oxygen, Silicate …
  Sources: BOG, SSDS, AOSN, ROV, World Ocean Data Atlas, CIMT, PFEL, FLIT, NUMERICAL MODEL …
MMI Demo for Demo WSDL Source
Metadata in DCMI
MMI ASCII
  Is in ASCII.
  Field delimiter = tab.
  Record delimiter = "\n" (line feed).
  Has one header line, with the variable names and units; units in parentheses. If no units, then "()".
  Order of the columns is: time depth lat long variableName.
  Example header: time(YYYY-MM-DDThh:mm:ss) depth(meters) lat(degrees) lon(degrees) Temperature_8(deg C)
  Lat and long are in degrees. Show "-" before the coordinate values for south east coordinates.
  For dateTime, always use T to separate date and time. Format of time is: YYYY-MM-DDThh:mm:ss±hh:mm or YYYY-MM-DDThh:mm:ss±hh.
  For missing values, write "null" in lowercase.
  …
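The format rules above can be sketched as a small reader. This is a minimal illustration, not the MMI demo's actual code: the function names, the sample values, and the choice to return plain dictionaries are all assumptions made for the example. The short offset form (±hh) is padded to ±hh:mm before parsing so that older Python versions accept it.

```python
import re
from datetime import datetime

# "name(units)" header cell, e.g. "Temperature_8(deg C)" or "flag()".
_HEADER_FIELD = re.compile(r"^(?P<name>[^()]+)\((?P<units>.*)\)$")
# A trailing two-digit UTC offset such as "-08" (the ±hh form).
_SHORT_OFFSET = re.compile(r"([+-]\d{2})$")


def parse_time(raw):
    """Parse YYYY-MM-DDThh:mm:ss with a ±hh:mm or ±hh offset."""
    return datetime.fromisoformat(_SHORT_OFFSET.sub(r"\1:00", raw))


def parse_mmi_ascii(text):
    """Return (header, records) for a tab-delimited MMI ASCII document.

    header  : list of (variable name, units) tuples from the single header line
    records : list of dicts keyed by variable name; "null" becomes None
    """
    lines = [line for line in text.split("\n") if line != ""]
    header = []
    for cell in lines[0].split("\t"):
        match = _HEADER_FIELD.match(cell)
        if match is None:
            raise ValueError(f"header cell is not name(units): {cell!r}")
        header.append((match.group("name"), match.group("units")))

    records = []
    for line in lines[1:]:
        values = line.split("\t")
        if len(values) != len(header):
            raise ValueError(f"expected {len(header)} fields: {line!r}")
        record = {}
        for (name, _units), raw in zip(header, values):
            if raw == "null":
                record[name] = None
            elif name == "time":
                record[name] = parse_time(raw)
            else:
                record[name] = float(raw)
        records.append(record)
    return header, records


# Made-up sample rows for illustration only.
SAMPLE = "\n".join([
    "\t".join(["time(YYYY-MM-DDThh:mm:ss)", "depth(meters)", "lat(degrees)",
               "lon(degrees)", "Temperature_8(deg C)"]),
    "\t".join(["2003-08-11T10:00:00-08", "1.5", "10.0", "20.0", "13.2"]),
    "\t".join(["2003-08-11T11:00:00-08:00", "1.5", "10.0", "20.0", "null"]),
]) + "\n"

header, records = parse_mmi_ascii(SAMPLE)
```

Keeping the units next to each variable name in the parsed header preserves what the format encodes in its header line, so a consumer does not have to guess them.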
So far it looks good, so where do we have semantic mediation problems?
Need more than an agreement on a metadata specification. Why?
  Search for sea temperature data: keyword values in the metadata repository
  TCNTTCMF (BODC)
  sea surface temperature (GCMD)
  sea water temperature (CF)
  Could all of these be discovered?
Needed controlled vocabularies
  Units, Parameters, Phenomena, Models, Sensors, Instruments, Formats, Organizations, Geographic Places, Datums, Species categories, etc. …
Controlled vocabularies serve different purposes
  Discovery vocabulary: terms people use to search for (discover) data. Systems that use these terms know how to link them with the usage terms embedded in data repositories. E.g. ocean temperature.
  Usage vocabulary: terms people use when cataloging data. Most of the time they have units associated. Systems that use these terms know how to manage them. E.g. temp3 or TCNTTCMF.
Strategies to solve semantic interoperability issues
  Make a general agreement on one and only one controlled vocabulary.
  Accept that more than one vocabulary exists, and try to mediate across them.
  Middle-way solution: try to establish an agreed preferred controlled vocabulary and create mappings to and from this vocabulary.
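The middle-way strategy can be sketched as a hub mapping: each vocabulary maps only to the preferred one, and any translation goes through it. The sketch below is illustrative. Picking CF as the preferred vocabulary, the SSDS entry, and treating the listed terms as equivalent are all assumptions for the example, not statements about the real vocabularies.

```python
# Hypothetical choice of hub ("preferred") vocabulary, for illustration only.
PREFERRED_VOCAB = "CF"

# (vocabulary, term) -> term in the preferred vocabulary.
# These equivalences are assumed for the sketch, not authoritative.
TO_PREFERRED = {
    ("BODC", "TCNTTCMF"): "sea water temperature",
    ("GCMD", "sea surface temperature"): "sea water temperature",
    ("SSDS", "Temperature_8"): "sea water temperature",
}


def to_preferred(vocab, term):
    """Map a term to the preferred vocabulary, or None if it is unmapped."""
    if vocab == PREFERRED_VOCAB:
        return term
    return TO_PREFERRED.get((vocab, term))


def from_preferred(preferred_term, vocab):
    """List the terms of `vocab` that map to `preferred_term`."""
    if vocab == PREFERRED_VOCAB:
        return [preferred_term]
    return sorted(
        term
        for (source_vocab, term), preferred in TO_PREFERRED.items()
        if source_vocab == vocab and preferred == preferred_term
    )


def translate(src_vocab, term, dst_vocab):
    """Translate between two vocabularies by way of the preferred one."""
    preferred = to_preferred(src_vocab, term)
    if preferred is None:
        return []
    return from_preferred(preferred, dst_vocab)
```

With N vocabularies, this needs N mappings to the preferred vocabulary rather than a separate mapping for every pair, which is the practical appeal of the middle way.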
How are the agreements on controlled vocabularies expressed and implemented?
  Expressed in HTML files, CSV, Word documents, etc.
  Implemented by embedding the semantics in software programs (hardcoded).
MMI Strategy
  Facilitate semantic mediation
  Harmonization strategies
  Mapping tools
  Vocabulary web services
  Semantic mediation in discovery services
Guides for Harmonization
  DTD, comma-separated values, HTML, tab-separated values, relational database, XML/XSD, RDF, OWL
Ontologies Repository
Mapping Tools
Web services
1) Vocabulary Harmonization
2) Vocabulary Mapping
3) Vocabulary Services
4) Access to Data
Demonstration: Ontology metadata mediation
Demo
Ontology metadata mediation
  Searching “sameAs” and “narrowerThan” for Ocean Temperature
  Loading model
  Found ssds:Temperature_8 and sea_surface_temperature
  Searching ontology
  Found corresponding WSDL for SSDS
  calling the web service
  searching Temperature_8
  Number of results added: 4
  Searching ontology
  Found corresponding WSDL
  calling the web service
  searching sea_surface_temperature
  Number of results added: 9
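The flow in this log can be sketched as follows: look up the terms related to a discovery term through "sameAs" and "narrowerThan", find the service registered for each term, call it, and pool the results. Everything here is a stand-in. The triples are held in memory, which relation links which term is assumed, and the two service functions are hypothetical stubs rather than the real WSDL-described services.

```python
# (subject, relation, object) statements. Which relation applies to which
# term is an assumption; the log only reports that both terms were found.
TRIPLES = [
    ("ssds:Temperature_8", "sameAs", "Ocean Temperature"),
    ("sea_surface_temperature", "narrowerThan", "Ocean Temperature"),
]


def related_terms(concept, relations=("sameAs", "narrowerThan")):
    """Return usage terms linked to a discovery concept by the given relations."""
    return [s for s, p, o in TRIPLES if o == concept and p in relations]


def ssds_service(term):
    """Hypothetical stub standing in for a web service call."""
    return [f"SSDS record for {term}"]


def aosn_service(term):
    """Hypothetical stub standing in for a second web service call."""
    return [f"AOSN record for {term}"]


# Term -> service lookup; in the demo this role is played by the ontology
# pointing at a WSDL. The dictionary is an assumption for the sketch.
SERVICE_FOR_TERM = {
    "ssds:Temperature_8": ssds_service,
    "sea_surface_temperature": aosn_service,
}


def mediated_search(concept):
    """Search every service whose usage term is related to `concept`."""
    results = []
    for term in related_terms(concept):
        service = SERVICE_FOR_TERM.get(term)
        if service is None:
            continue  # term is known to the ontology but has no service
        results.extend(service(term))
    return results
```

The caller only ever supplies the discovery term; the usage terms and the services behind them stay an internal detail of the mediation step.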
temperature_8
  water temperature from unit
  Identifier is: urn:ssds.mbari.org.recordVariable.id:
  water temperature from unit
  Identifier is: urn:ssds.mbari.org.recordVariable.id:
  …
Sea Surface Temperature
  Observation Data/Drifters/MBARI Drifter 4 (8/11-9/5/2003)/Sea Surface Temperature (count=190)
  Identifier is: urn:aosn.mbari.org.recordVariable.i.id:
  Observation Data/Aircraft/Sea Surface Remote Sensing and Atmospheric Meteorology (8/4-6,10-11,13,15,20-22,25,29,9/4-5,6/2003)/Sea Surface Temperature (count=148538)
  Identifier is: urn:aosn.mbari.org.recordVariable.id:44
Conclusions
  Controlled vocabularies are an open issue. It should be addressed, and agreements must be reached.
  It is impossible to reach one and only one agreement, so mapping and mediation should be part of interoperable systems.
  Follow standards as much as possible.
  Tools, and more tools, are needed.
Ontologies MMI Workshop - August 2005
Ontologies
  Specification of conceptualizations
  Example:
  1. Properties of real-world objects are identified.
  2. Similarities are identified.
  3. Concepts are created…
  4. and are expressed as a class.
  5. Classes are related.
  Diagram: River and Lake are subclasses of the class Body of Water. Properties shown: has water, is inland body, has a relatively defined channel.
Web Ontology Language: OWL
  W3C Recommendation, 02/04.
  Based on RDF (-> URI).
  Inference capabilities.
  Restriction of inherited properties.
  Can be used to express specifications and vocabularies.
  Diagram: Body of Water, River
Vocabularies expressed in ontologies
  Class: Hydrologic Unit. Subclasses: Region, Subregion, Accounting Unit, Cataloging Unit.
  Instances (look like real-world objects): Mid Atlantic, Delaware, Lower Delaware, Schuylkill.
  Each unit "is part of" the one before it. "Is part of" is transitive, so further isPartOf relations can be inferred.
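The transitive inference on this slide can be sketched in a few lines. Only the direct "is part of" links are stored, and the rest are derived by following the chain upward. The dictionary layout and function name are assumptions for the example, and the chain order is read from the slide (Schuylkill, Lower Delaware, Delaware, Mid Atlantic); an OWL reasoner would derive the same links from a transitive property declaration.

```python
# Direct (asserted) isPartOf links, child -> parent, as read from the slide.
IS_PART_OF = {
    "Schuylkill": "Lower Delaware",      # Cataloging Unit -> Accounting Unit
    "Lower Delaware": "Delaware",        # Accounting Unit -> Subregion
    "Delaware": "Mid Atlantic",          # Subregion -> Region
}


def inferred_is_part_of(unit):
    """Return every unit that `unit` is part of, nearest first.

    Follows the asserted links upward; the `seen` set guards against
    accidental cycles in the data.
    """
    ancestors = []
    seen = {unit}
    current = IS_PART_OF.get(unit)
    while current is not None and current not in seen:
        ancestors.append(current)
        seen.add(current)
        current = IS_PART_OF.get(current)
    return ancestors
```

Only three links are asserted, yet the function also yields the inferred ones, such as Schuylkill being part of Mid Atlantic.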