LSIDs and RDF in TDWG Roger Hyam, TDWG, RBGE Donald Hobern, GBIF June 7-9, 2006 - Edinburgh, UK.

Paradigm: The starting assumption is that standards are about sharing data. Sharing data also implies sharing data through time (archiving).

What is Shared? Sharing raw literals isn’t much use. They need to be gathered together into ‘semantic’ units or objects. [Diagram: the raw literals ‘Bellis’, ‘perennis’ and ‘1234’ gathered into the object TaxonName:1234, Bellis perennis.]

Semantics of Objects: Objects need to be based on some shared semantics. There needs to be somewhere to look up what they mean: an ontology. [Diagram: a TaxonName object (Bellis perennis) referring to an Ontology to answer the question ‘TaxonName?’]

Identity of Objects: How do I refer to this object? Who should I credit? Who should I send corrections to? Is it the same record as I already have or is it a new one? What is the official version of this data, and has someone altered it before I received it?

TDWG TAG-1 Meeting: There was consensus on the following:
– Architecture is concerned with shared data
– Biodiversity data will be modeled as a graph of identifiable objects
– The semantics of these objects will be encoded in a series of shared ontologies
– Ontologies will be related to each other on the basis of shared Base and Core ontologies as a minimum
Discussion continues on how this is done.
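To make "a graph of identifiable objects" concrete, here is a minimal Python sketch. It is an illustration only, not anything agreed at the meeting: the `Node` and `Graph` classes, the `typeHeldBy` property and the `urn:lsid:example.com:...` identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One identifiable object in the graph (hypothetical structure)."""
    guid: str                                      # globally unique identifier
    type_name: str                                 # concept name looked up in an ontology
    literals: dict = field(default_factory=dict)   # raw values owned by this object
    links: dict = field(default_factory=dict)      # property name -> GUID of another object


class Graph:
    """Objects indexed by GUID; edges are followed by looking up the target GUID."""

    def __init__(self):
        self.nodes = {}

    def add(self, node):
        self.nodes[node.guid] = node

    def follow(self, guid, prop):
        """Return the object that `prop` points to, or None if there is no such link."""
        target = self.nodes[guid].links.get(prop)
        return self.nodes.get(target)


graph = Graph()
graph.add(Node("urn:lsid:example.com:institution:1", "Herbarium",
               {"name": "Example Herbarium"}))
graph.add(Node("urn:lsid:example.com:taxonname:1234", "TaxonName",
               {"genus": "Bellis", "specificEpithet": "perennis"},
               {"typeHeldBy": "urn:lsid:example.com:institution:1"}))

holder = graph.follow("urn:lsid:example.com:taxonname:1234", "typeHeldBy")
print(holder.literals["name"])
```

The point of the sketch is that links carry identifiers rather than copies of the target, so two providers can refer to the same object without exchanging its full record.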

Implications: We need an ontology to define and relate the objects we exchange. Ontology governance/management is paramount. We need a system of GUIDs to identify the objects. We need a roadmap for the protocols to exchange these objects.

Structure of the Ontology:
– Base Ontology: BaseThing, BaseActor
– Core Ontology: CoreTaxonName, CoreInstitution
– Domain Ontology: TaxonName, NomenclaturalType, NomenclaturalNote, Herbarium
– Application Ontologies: ABCD, DarwinCore, ???
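One way to picture the layering is as a class hierarchy. The sketch below is only a guess at how the layers relate: the slide names the concepts but its diagram did not survive in this transcript, so every parent/child link here (for example, Herbarium under CoreInstitution) is an assumption, and Python classes stand in for ontology concepts.

```python
# Base ontology layer
class BaseThing:
    pass


class BaseActor:
    pass


# Core ontology layer (parent links are assumptions)
class CoreTaxonName(BaseThing):
    pass


class CoreInstitution(BaseActor):
    pass


# Domain ontology layer (parent links are assumptions)
class TaxonName(CoreTaxonName):
    pass


class Herbarium(CoreInstitution):
    pass


def lineage(concept):
    """List a concept and its ancestors, most specific first."""
    return [c.__name__ for c in concept.__mro__ if c is not object]


print(lineage(TaxonName))
print(isinstance(Herbarium(), CoreInstitution))
```

Under these assumptions, a client that only understands the Core layer can still handle a Domain-level Herbarium by treating it as a CoreInstitution.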

Ontology Governance: Allow people to create Domain sub-ontologies easily, to prevent alienation. Each ontology construct (concept) has a status. Status is increased by passing through explicit gates defined by actual usage. Status levels: Experimental, Shared, Recommend.
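A small Python sketch of usage-driven status gates follows. The three status names come from the slide; the gate rule itself (a minimum number of independent providers using a concept) and the thresholds 2 and 5 are invented here purely for illustration.

```python
from enum import IntEnum


class Status(IntEnum):
    EXPERIMENTAL = 0
    SHARED = 1
    RECOMMEND = 2


# Hypothetical gates: minimum number of independent providers that must
# actually use a concept before it reaches each status.
GATES = {Status.SHARED: 2, Status.RECOMMEND: 5}


def status_for(provider_count):
    """Return the highest status whose gate the observed usage has passed."""
    status = Status.EXPERIMENTAL
    for level in sorted(GATES):
        if provider_count >= GATES[level]:
            status = level
    return status


for count in (1, 3, 5):
    print(count, status_for(count).name)
```

Because anyone can start a concept at Experimental with no approval step, the barrier to creating a sub-ontology stays low, while promotion depends on evidence of use rather than on committee decisions.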

What about RDF? The need to share identifiable objects has been established without reference to a technology. We are interested in objects not triples. Typical use case involves a client consuming semantically heterogeneous data from multiple sources. Semantic Web technologies would be ideal – but aren’t part of the TDWG culture and there are ‘unbelievers’.

Current ‘Standards’:
DarwinCore & DiGIR
– Based on Z39.50
– HTTP-based XML message / response
– Simple ‘flat’ application schemas (RDF-like)
ABCD & BioCASe
– Based on DarwinCore & DiGIR
– Complex document structure
TAPIR
– Unification of BioCASe and DiGIR
No RDF, Objects or GUIDs here yet!

Combining Data: The GBIF data portal is the only ‘application’ that does data integration between these formats. There is no standard way to include XML fragments from another XSD other than xs:any. There is overlap between the different schemas and no easy way to merge them.

What about LSIDs? The GUID-1 meeting considered several GUID technologies, including LSID, DOI and Handle. Life Science Identifiers are being assessed.
– I3C & OMG URNs
– urn:lsid:ncbi.nlm.nih.gov:pubmed:
– getData()
– getMetadata()
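As a rough illustration of the identifier syntax, an LSID is a URN of the form urn:lsid:authority:namespace:object, optionally followed by a revision. The Python sketch below splits one into its parts. The `LSID` tuple and `parse_lsid` function are hypothetical helpers, not part of any LSID library, and the example identifier uses the placeholder authority example.com.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str                    # usually a DNS name owned by the issuer
    namespace: str
    object_id: str
    revision: Optional[str] = None    # optional version component


def parse_lsid(text):
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    if any(part == "" for part in parts[2:]):
        raise ValueError(f"empty component in: {text!r}")
    return LSID(*parts[2:])


lsid = parse_lsid("urn:lsid:example.com:taxonname:1234")
print(lsid.authority, lsid.namespace, lsid.object_id, lsid.revision)
```

A resolver would use the authority part to locate the service that answers getData() and getMetadata() for the object; that lookup is outside the scope of this sketch.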

LSID Permanence: LSIDs should not be recycled, i.e. used for more than one object. LSIDs should always resolve, but it is OK for them to resolve to a 404-style ‘gone’ error (strictly, HTTP 404 is Not Found and 410 is Gone). There is no central authority to control these things. Even DOIs go away if there isn’t institutional backing!
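The two permanence rules (never reuse an identifier, and always give an answer even when the object has gone) can be sketched as a tiny in-memory registry. This is a hypothetical Python illustration; the `LsidRegistry` class and its status strings are not taken from the LSID specification.

```python
class LsidRegistry:
    """Tracks live and retired identifiers so that none is ever reassigned."""

    def __init__(self):
        self._objects = {}      # identifier -> object currently served
        self._retired = set()   # identifiers whose object has been withdrawn

    def assign(self, lsid, obj):
        # An identifier that was ever issued, live or retired, must not be reused.
        if lsid in self._objects or lsid in self._retired:
            raise ValueError(f"{lsid} has already been issued")
        self._objects[lsid] = obj

    def retire(self, lsid):
        del self._objects[lsid]
        self._retired.add(lsid)

    def resolve(self, lsid):
        """Always answer: the object, a 'gone' marker, or 'unknown'."""
        if lsid in self._objects:
            return ("ok", self._objects[lsid])
        if lsid in self._retired:
            return ("gone", None)
        return ("unknown", None)


registry = LsidRegistry()
registry.assign("urn:lsid:example.com:taxonname:1234", {"name": "Bellis perennis"})
registry.retire("urn:lsid:example.com:taxonname:1234")
print(registry.resolve("urn:lsid:example.com:taxonname:1234"))
```

Keeping the retired set is what distinguishes "this object existed and was withdrawn" from "this identifier was never issued", which is the distinction a client needs when deciding whether its cached copy is stale.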

LSIDs for Everything? Are there some things for which LSIDs are inappropriate?
– xsi:schemaLocation=“urn:lsid:example.com:xsd:taxon.xsd”
– xmlns:tn=“urn:lsid:example.com:ontology:taxon/”
There are definitely places where we will use something else. Other people will use their own identifiers, e.g. DOI, Handle etc.

So what’s cooking? XSD-based conceptual schemas; XML-based exchange protocols; 200+ data providers; 50+ million anonymous ‘records’; emergent Semantic Web; recognised need for GUIDs; different GUID technologies; a TDWG ontology; OGC standards (GML); BioMOBY; other! Clients?

Possible Roadmap: Build the ontology as a focus for semantics. Resolution and Harvest protocols should be relatively easy to plug into or wrap round existing service providers, so approach these first. Search/Query is more problematic: BioCASe, DiGIR, TAPIR, SPARQL, other?

Thank You: Gordon and Betty Moore Foundation, Global Biodiversity Information Facility, NESC, TDWG Members