Using Darwin Core as a Model: An Ontologically Minimalist Approach to Publishing Occurrence Data in RDF. Joel Sachs. Formal Models track of the Semantics for Biodiversity Symposium, TDWG 2013.
The first thing I want to communicate: Semantics != Ontologies
Ontologies as a vehicle for semantics: Ontologies were the first choice for putting the “semantic” in semantic web. But ontologies aren’t the only way to supply semantics. Furthermore, ontologies can be a barrier to shared semantics, in a number of ways.
What’s green? Def 1:
What’s green? Def 2: Green is the portion of the electromagnetic spectrum with a wavelength between 520 and 570 nm. What’s electromagnetic? What’s a spectrum? What’s a wavelength? What’s a nanometer?
[Diagram: two ways to model the same occurrence data]
Normalized model:
Occurrence: Occurrence_ID, Location_ID (URI), DateTime, IndividualOrganism_ID (URI)
Location: Location_ID (URI), Latitude (float), Longitude (float), Datum (URI)
Identification: Identification_ID, Individual_ID (URI), Taxon (URI), Identified_by (URI)
Taxon: Taxon_ID, Scientific Name, Vernacular Name, Authorship, Year, etc.
Flat model:
Occurrence: Occurrence_ID, Latitude, Longitude, Scientific Name, Vernacular Name
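To make the contrast between the two shapes on this slide concrete, here is a minimal sketch that joins the normalized records into the flat occurrence row. The field names follow the slide (Authorship and Year are left out for brevity). The class names, the `flatten` helper, and every `http://example.org/` URI and data value are assumptions made for illustration, not part of Darwin Core.

```python
from dataclasses import dataclass


@dataclass
class Location:
    location_id: str  # URI
    latitude: float
    longitude: float
    datum: str  # URI


@dataclass
class Taxon:
    taxon_id: str
    scientific_name: str
    vernacular_name: str


@dataclass
class Identification:
    identification_id: str
    individual_id: str  # URI of the organism
    taxon: str  # URI of a Taxon
    identified_by: str  # URI


@dataclass
class Occurrence:
    occurrence_id: str
    location_id: str  # URI of a Location
    date_time: str
    individual_organism_id: str  # URI


def flatten(occ, locations, identifications, taxa):
    """Join the normalized records into one flat occurrence row (hypothetical helper)."""
    loc = locations[occ.location_id]
    # Pick the first identification made for this organism.
    ident = next(i for i in identifications
                 if i.individual_id == occ.individual_organism_id)
    taxon = taxa[ident.taxon]
    return {
        "Occurrence_ID": occ.occurrence_id,
        "Latitude": loc.latitude,
        "Longitude": loc.longitude,
        "Scientific Name": taxon.scientific_name,
        "Vernacular Name": taxon.vernacular_name,
    }


# Hypothetical example data; every URI below is a placeholder.
EX = "http://example.org/"
locations = {EX + "loc/1": Location(EX + "loc/1", 45.4, -75.7, EX + "datum/WGS84")}
taxa = {EX + "taxon/1": Taxon(EX + "taxon/1", "Puma concolor", "cougar")}
identifications = [
    Identification(EX + "id/1", EX + "org/1", EX + "taxon/1", EX + "agent/1")
]
occ = Occurrence(EX + "occ/1", EX + "loc/1", "2013-10-28T09:00:00", EX + "org/1")

flat = flatten(occ, locations, identifications, taxa)
```

Both shapes carry the same observation. The flat row is easier to publish and read, while the normalized form keeps locations, identifications, and taxa as separately addressable things.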
There are many ways to think about biodiversity data.
Thing #2 that I want to communicate: Darwin Core (as it is) can be used as a lightweight “ontology”.
Don’t try this at home
Thing #3: How to minimize the amount of ontology in the Core.
Example: Material Sample. dwctype:MaterialSample (roughly?) corresponds to OBI:Specimen.
(forall (x) (if (MaterialEntity x) (IndependentContinuant x))) // axiom label in BFO2 CLIF: [ ]
material MaterialEntity
(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt x y t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [ ]
(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt y x t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [ ]
curl -L -H "Accept: application/rdf+xml" | grep OBI
MaterialSample
A resource describing the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed
recommended
DataSets/DataSet/Units/Unit
On the one hand: Nobody forces consuming applications to ingest the OBI and BFO ontologies when they ingest Darwin Core. So what’s the big deal?
On the other hand: Many semantic web clients automatically fetch and load referenced documents, especially if the documents are referenced with important properties like rdfs:subClassOf. It’s bad form (and slightly dangerous) to clutter a semantic web document with terms from unnecessary namespaces.
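A rough sketch of that "follow your nose" behaviour, using plain tuples rather than a real RDF library: given the triples in a document, it lists the other documents a link-following client would go and fetch. The `http://example.org/` URIs stand in for the Darwin Core and OBI terms, and `document_of` and `documents_to_fetch` are hypothetical helpers. Real clients differ in which properties they follow.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
RDFS_SUBCLASSOF = RDFS + "subClassOf"
RDFS_COMMENT = RDFS + "comment"


def document_of(uri):
    """Approximate the document that dereferencing a term URI would retrieve."""
    if "#" in uri:
        return uri.split("#", 1)[0]
    return uri.rsplit("/", 1)[0] + "/"


def documents_to_fetch(triples, source_doc, followed=(RDFS_SUBCLASSOF,)):
    """Return the external documents reached through the followed properties."""
    docs = set()
    for _subject, predicate, obj in triples:
        if predicate in followed and obj.startswith("http"):
            doc = document_of(obj)
            if doc != source_doc:
                docs.add(doc)
    return docs


# Placeholder URIs standing in for dwctype:MaterialSample and OBI:Specimen.
CORE_DOC = "http://example.org/dwctype/"
MATERIAL_SAMPLE = CORE_DOC + "MaterialSample"
OBI_SPECIMEN = "http://example.org/obi/Specimen"

core_only = [
    (MATERIAL_SAMPLE, RDFS_COMMENT, "A resource describing the physical results of a sampling event."),
]
core_with_mapping = core_only + [
    (MATERIAL_SAMPLE, RDFS_SUBCLASSOF, OBI_SPECIMEN),
]

fetched_without = documents_to_fetch(core_only, CORE_DOC)
fetched_with = documents_to_fetch(core_with_mapping, CORE_DOC)
```

With the single subclass assertion in place, the client in this sketch pulls in the whole referenced ontology document. Without it, nothing beyond the Core is fetched.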
My suggestion? Assertions that tie Core terms to upper ontologies should be made in a separate document. E.g. the assertion tying dwctype:MaterialSample to OBI should be made in obi.owl, or dwc_obi.owl. That way, those doing integration that depends on OBI axioms can ingest the appropriate descriptions. Those that don’t need the OBI axioms don’t have to worry about incorrect inference. – Keep in mind: There is no preferred upper ontology for science on the semantic web: BFO, DOLCE, SUMO, UMBEL, NULO, etc.
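One way to picture the effect on inference is a toy subclass closure over whichever documents a consumer chooses to load. This is a sketch under stated assumptions: the documents are reduced to lists of (subclass, superclass) pairs, `load` and `superclasses` are hypothetical helpers, and the link from obi:Specimen to bfo:MaterialEntity is a simplification made up for illustration. Only the MaterialEntity to IndependentContinuant step comes from the BFO axiom shown earlier.

```python
def load(*documents):
    """Merge the subclass pairs from the chosen documents into one lookup table."""
    merged = {}
    for doc in documents:
        for child, parent in doc:
            merged.setdefault(child, set()).add(parent)
    return merged


def superclasses(cls, subclass_of):
    """Return every class reachable from cls through subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# The Core itself: no links to any upper ontology.
CORE = []
# The separate mapping document (dwc_obi.owl in the suggestion above).
DWC_OBI = [("dwctype:MaterialSample", "obi:Specimen")]
# A simplified stand-in for OBI and BFO; the first pair is an assumption.
OBI_BFO = [
    ("obi:Specimen", "bfo:MaterialEntity"),
    ("bfo:MaterialEntity", "bfo:IndependentContinuant"),
]

minimal = superclasses("dwctype:MaterialSample", load(CORE))
integrated = superclasses("dwctype:MaterialSample", load(CORE, DWC_OBI, OBI_BFO))
```

A consumer that loads only the Core infers nothing extra about MaterialSample. A consumer that opts into the mapping and the upper ontology gets the full chain, and accepts the commitments that come with it.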
Thank you for paying attention! Questions, comments, and criticism