A Semantic Web View on Concepts and their Alignments. Antoine Isaac, Vrije Universiteit Amsterdam / Europeana. Concepts in Context, Köln, July 19th, 2010.


Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information using standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things
Tim Berners-Lee, A way to publish Semantic Web data

A web of data
- Publish and re-use data via the web, building innovative applications over former data silos
- Principle #4 is crucial to this vision: include links to other URIs, so that they can discover more things

SKOS, Knowledge Organization Systems and Linked Data
SKOS makes it possible to represent (simple) KOS data as RDF

animals
  NT cats
cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats
domestic cats
  USE cats
wildcats
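As a rough illustration of what "representing KOS data as RDF" can mean for the thesaurus entries above, the sketch below maps the usual thesaurus codes to SKOS properties (NT to skos:narrower, BT to skos:broader, RT to skos:related, UF to skos:altLabel, SN to skos:scopeNote). The example.org namespace, the URI naming scheme and the English language tag are assumptions made for this example; they do not come from the slides.

```python
# Hypothetical namespace; the slide does not give URIs for these concepts.
EX = "http://example.org/kos/"

# Thesaurus records from the slide: preferred term -> [(relation code, value)].
# "domestic cats USE cats" mirrors "cats UF domestic cats", so it is not repeated.
RECORDS = {
    "animals": [("NT", "cats")],
    "cats": [
        ("UF", "domestic cats"),
        ("RT", "wildcats"),
        ("BT", "animals"),
        ("SN", "used only for domestic cats"),
    ],
    "wildcats": [],
}

# Codes whose value is another concept, and codes whose value is a literal.
CONCEPT_LINKS = {"NT": "skos:narrower", "BT": "skos:broader", "RT": "skos:related"}
LITERAL_NOTES = {"UF": "skos:altLabel", "SN": "skos:scopeNote"}


def uri(term):
    """Mint a URI reference for a preferred term (naming scheme is an assumption)."""
    return "<" + EX + term.replace(" ", "_") + ">"


def literal(text):
    """Quote a plain string as an English-tagged literal (language tag assumed)."""
    return '"' + text + '"@en'


def to_skos(records):
    """Translate thesaurus records into (subject, predicate, object) triples."""
    triples = []
    for term, relations in records.items():
        subject = uri(term)
        triples.append((subject, "rdf:type", "skos:Concept"))
        triples.append((subject, "skos:prefLabel", literal(term)))
        for code, value in relations:
            if code in CONCEPT_LINKS:
                triples.append((subject, CONCEPT_LINKS[code], uri(value)))
            elif code in LITERAL_NOTES:
                triples.append((subject, LITERAL_NOTES[code], literal(value)))
    return triples


if __name__ == "__main__":
    # Print the triples in a Turtle-like form, one statement per line.
    for s, p, o in to_skos(RECORDS):
        print(s, p, o, ".")
```

The non-preferred term "domestic cats" does not become a concept of its own here; it only survives as an alternative label of "cats", which is one common reading of a USE/UF pair.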

SKOS, KOSs and LD
SKOS allows bridging across KOSs from different contexts

Some landmark KOS LD implementations
Many libraries (not a surprise!)
- Swedish National Library’s Libris catalogue and thesaurus
- Library of Congress’ vocabularies, including LCSH
- DNB’s Gemeinsame Normdatei (incl. SWD subject headings). Documentation at
- BnF’s RAMEAU subject headings
- OCLC’s DDC classification and VIAF
- STW economy thesaurus
- National Library of Hungary’s catalogue and thesauri (example)
Other fields
- Wikipedia categories through DBpedia
- New York Times subject headings
- IVOA astronomy vocabularies
- GEMET environmental thesaurus
- UMTHES
- Agrovoc
- Linked Life Data
- TaxonConcept
- UK public sector vocabularies

KOS Alignments?
Many of them are linked to some other resource
- LCSH, SWD and RAMEAU interlinked through MACS mappings
- GND linked to DBpedia and VIAF
- Libris linked to LCSH
- Agrovoc to CAT, NAL, SWD, GEMET
- NYT to Freebase, DBpedia, GeoNames
- DBpedia links are overwhelming: Hungary, STW, TaxonConcept, GND…
Is that enough? Are these links any good?

Sparse linkage: the LD cloud [Cyganiak, Jentzsch]

Sparse linkage: another view [Guéret, 2010]

Linked Data Issues
Mike Uschold’s “semantic elephants”
- Proliferation of URIs, managing coreference
- Versioning and URIs
- Overloading owl:sameAs

What kind of links?
Coreference links are the most used (and needed)
- owl:sameAs
- skos:exactMatch
- skos:closeMatch
- rdfs:seeAlso
- umbel:isLike

Overloading owl:sameAs
- Formally, two URIs linked by owl:sameAs are inferred to have the same properties:
    ex:a name “Antoine Isaac”.
    ex:b owl:sameAs ex:a.
  implies
    ex:b name “Antoine Isaac”.
- Many owl:sameAs statements are asserted between resources that are only very similar [Halpin 2009]: the same resource but in different contexts, a reference…
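The inference described on this slide can be sketched in a few lines. The code below is only an illustration over plain tuples, not a full OWL reasoner: it groups resources connected by owl:sameAs and copies every other statement across each group. The ex:name and ex:changeNote properties and the note text are hypothetical, chosen to show how management metadata leaks from one resource to the other.

```python
SAME_AS = "owl:sameAs"


def infer_same_as(triples):
    """Return the input triples plus those implied by owl:sameAs identity.

    Resources connected by owl:sameAs (treated as symmetric and transitive)
    end up sharing every statement made about any one of them.
    """
    # Build identity groups: each resource maps to the set of its aliases.
    group = {}
    for s, p, o in triples:
        if p == SAME_AS:
            merged = group.get(s, {s}) | group.get(o, {o})
            for member in merged:
                group[member] = merged

    # Copy every non-sameAs statement to all aliases of its subject and object.
    inferred = set(triples)
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for s2 in group.get(s, {s}):
            for o2 in group.get(o, {o}):
                inferred.add((s2, p, o2))
    return inferred


if __name__ == "__main__":
    data = [
        ("ex:a", "ex:name", '"Antoine Isaac"'),
        ("ex:b", SAME_AS, "ex:a"),
        # Hypothetical management metadata attached to ex:b only.
        ("ex:b", "ex:changeNote", '"record revised"'),
    ]
    for triple in sorted(infer_same_as(data)):
        print(triple)
```

Running it on the slide's example yields the statement about ex:b having the name, and also attaches the hypothetical change note of ex:b to ex:a, which is the kind of side effect that makes owl:sameAs risky between merely similar resources.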

Case study: New York Times
- 10K concepts (places, descriptors, persons, organizations)
- Manually or automatically mapped by NYT staff to DBpedia, Freebase, GeoNames
- Linking the LD cloud to NYT articles!
- Makes it easy to mix NYT content with other content
- Started with quite messy modeling:
    dcterms:rightsHolder The New York Times Company.
    owl:sameAs

Clearer KOS alignments (1)
What is being aligned?
- Concepts, documents, real-world entities “out there” (persons, places…)
- In principle owl:sameAs should not be applied across disjoint categories
- But even within one category there can be issues: two KOS concepts representing the same notion but with different management metadata attached (skos:changeNote)

Clearer KOS alignments (2)
How is it aligned?
- Distinguish:
  - exact co-reference
  - conceptual similarity, including equivalence
  - classification
- Making clearer distinctions between conceptual links: skos:narrowMatch, skos:broadMatch, skos:relatedMatch
- Minimize ontological commitment for KOS data consumers
  - skos:exactMatch: concepts can be used interchangeably across a wide range of information retrieval applications. skos:exactMatch is a transitive property
  - skos:closeMatch: In order to avoid the possibility of "compound errors" when combining mappings across more than two concept schemes, skos:closeMatch is not declared to be a transitive property
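To make the difference between the two properties concrete, here is a small sketch that chains mappings across three concept schemes. It computes the transitive closure for skos:exactMatch only and leaves skos:closeMatch statements exactly as asserted. The concept identifiers (a:cats, b:cats and so on) are invented for the example, and symmetry of the mapping properties is left out to keep the code short.

```python
EXACT = "skos:exactMatch"


def transitive_closure(pairs):
    """Return the transitive closure of a set of (source, target) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # Chain a -> b and b -> d into a -> d, skipping self-links.
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def chain_mappings(mappings):
    """Chain exactMatch statements; keep every other mapping as asserted."""
    exact_pairs = {(s, o) for s, p, o in mappings if p == EXACT}
    result = {(s, p, o) for s, p, o in mappings if p != EXACT}
    result |= {(s, EXACT, o) for s, o in transitive_closure(exact_pairs)}
    return result


if __name__ == "__main__":
    mappings = [
        ("a:cats", "skos:exactMatch", "b:cats"),
        ("b:cats", "skos:exactMatch", "c:cats"),
        ("a:cats", "skos:closeMatch", "d:felines"),
        ("d:felines", "skos:closeMatch", "e:pets"),
    ]
    for triple in sorted(chain_mappings(mappings)):
        print(triple)
```

With this input, a:cats skos:exactMatch c:cats is derived, while no link from a:cats to e:pets appears: two approximate mappings in a row are not allowed to compound into a third.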

Case study: New York Times (2)
- Data quality has considerably improved
- Factual data is attached to the concept itself; management data is attached to the resource representing the data source (context):
    rdf:type skos:Concept ;
    skos:prefLabel “Park Slope (NYC)” ;
    geo:lat “ ” ;
    owl:sameAs

    dcterms:rightsHolder “The New York Times Company” ;
    foaf:primaryTopic
- Still, for resources linked with owl:sameAs, statements representing different modeling choices can be merged: the DBpedia resource might not be a skos:Concept, or might use a different latitude format

Clearer KOS alignments (3)
What is the alignment for?
- SKOS mapping properties use the notion of validity within one application context
- Application context for mapping has been investigated in thesaurus interoperability studies
- Application of alignments matters: STITCH application scenarios for Cultural Heritage include book re-indexing, thesaurus merging, query reformulation…
- The same alignment performs differently for different scenarios [Isaac 2008, Wang 2009]

Application-specific alignment evaluation
Example: OAEI 2007 campaign, 3 matching tools evaluated for thesaurus merging & book re-indexing

Application-specific alignments
Why? Take 2 thesauri at the National Library of the Netherlands: GTT and Brinkman
- For thesaurus merging, gtt:excavation should be aligned to brinkman:excavation
- For book re-indexing, gtt:excavation should be aligned to brinkman:archeology_netherlands
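One simple way to picture this is to keep a separate alignment per usage scenario and to choose the one that matches the task at hand. The sketch below does that with the two GTT and Brinkman examples from the slide; the scenario keys and the function name are hypothetical labels, not part of any STITCH tool.

```python
# One alignment table per usage scenario (scenario keys are illustrative).
ALIGNMENTS = {
    "thesaurus_merging": {
        "gtt:excavation": "brinkman:excavation",
    },
    "book_reindexing": {
        "gtt:excavation": "brinkman:archeology_netherlands",
    },
}


def align(concept, scenario):
    """Return the target concept for this scenario, or None if there is none."""
    return ALIGNMENTS.get(scenario, {}).get(concept)


if __name__ == "__main__":
    for scenario in ALIGNMENTS:
        print(scenario, "->", align("gtt:excavation", scenario))
```

The same source concept resolves to different targets depending on the scenario, which is why a single scenario-independent evaluation of an alignment can be misleading.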

This requires a finer-grained representation of the context in which the alignment is produced
- Who created it?
- Manual vs. automatic?
- Which alignment strategy or tool?
- Is there a degree of confidence?
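The questions above suggest treating each mapping as a record with its own metadata rather than as a bare link. The following sketch shows one possible shape for such a record and a filter over it. The field names, the creator and tool labels, the relation types and the confidence values are all invented for illustration; they are not taken from an existing alignment format or from actual evaluation results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    """One alignment statement together with its production context."""
    source: str
    relation: str
    target: str
    creator: str           # who created it
    method: str            # "manual" or "automatic"
    tool: Optional[str]    # which strategy or tool, if any
    confidence: float      # degree of confidence between 0.0 and 1.0


def select(mappings, min_confidence=0.0, method=None):
    """Keep mappings that meet a confidence threshold and, optionally, a method."""
    return [
        m for m in mappings
        if m.confidence >= min_confidence
        and (method is None or m.method == method)
    ]


# Illustrative records only; all metadata values are made up.
MAPPINGS = [
    Mapping("gtt:excavation", "skos:exactMatch", "brinkman:excavation",
            "librarian-1", "manual", None, 1.0),
    Mapping("gtt:excavation", "skos:closeMatch", "brinkman:archeology_netherlands",
            "matcher-run-1", "automatic", "hypothetical-matcher", 0.6),
]

if __name__ == "__main__":
    for m in select(MAPPINGS, min_confidence=0.8):
        print(m.source, m.relation, m.target, "by", m.creator)
```

A consumer with strict requirements can then raise the threshold or restrict itself to manual mappings, while a more tolerant one can take everything.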

Case study: New York Times (3)
- Using the nyt:mapping_strategy property with nyt:manual or nyt:automatic:
    nyt:mapping_strategy
- Problem: it applies to the context file for the concept, not to the statement itself:
    owl:sameAs
- Using simple binary properties (skos:exactMatch…) between aligned resources does not allow for much flexibility

Ontology Matching community practices
- Community investigating the ontology and vocabulary matching issues
- Ontology Alignment Evaluation Initiative
- Matching tools produce some metadata
- Metadata repositories store and manage them
  - Bioportal
  - CATCH vocabulary and alignment repository
  - …
- Consensus: richer alignment metadata is needed

From a simple representation

to a more complete one

Can LD accommodate complex representations?
- The strength of the LD vision lies in the relative simplicity of a standard representation
- LD provides a simple way to publish data and follow one’s nose to connected data: serendipity!
- Reification and metadata on links are not really compatible with it
- Higher barrier for data publication and consumption

Peaceful co-existence
- Applications with a narrow scope that require precise data can afford:
  - selecting the alignments they consume
  - exploiting finer-grained representations
  - creating finer-grained representations
- Simple data for applications that are simple and/or exploit a wide range of datasets
  - Simple mash-up applications robust to (limited) approximation
  - Web-scale applications: large-scale document retrieval, concept discovery

Does it need to be perfect anyway?
- Do we really want to throw away crucial URI co-reference data?
  - has 35,187,488 URIs in 11,285,263 bundles
- Extensive linking to DBpedia is useful, even when the type of link is not used in the theoretically correct way
  - Cf. BBC content and data mash-ups
- Issues with mixed quality are being tackled
  - as a “service to provide you with help finding URIs”, keeping track of data sources
  - Representation and exchange of provenance info is under active investigation

Peaceful co-existence (2)
- If you have a complex representation, don’t be pedantic and publish simpler data, too!
- Articulation between LD (to discover links) and alignment repositories is needed
- Technically feasible, best practices have to be identified
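Publishing "simpler data, too" could be as light as deriving plain binary links from the richer records held in an alignment repository. The sketch below shows one way this might look: it drops the metadata and emits one Turtle-like statement per mapping above a confidence threshold. The record layout, the 0.5 threshold, the confidence values and the brinkman:history target are assumptions made for this example.

```python
# Rich alignment records as a repository might hold them (illustrative values).
RICH_RECORDS = [
    {"source": "gtt:excavation", "relation": "skos:exactMatch",
     "target": "brinkman:excavation", "method": "manual", "confidence": 1.0},
    {"source": "gtt:excavation", "relation": "skos:closeMatch",
     "target": "brinkman:archeology_netherlands", "method": "automatic",
     "confidence": 0.6},
    # Hypothetical low-confidence candidate that should not be published.
    {"source": "gtt:excavation", "relation": "skos:closeMatch",
     "target": "brinkman:history", "method": "automatic", "confidence": 0.2},
]


def simple_view(records, min_confidence=0.5):
    """Flatten rich records into plain 'subject predicate object .' lines."""
    lines = []
    for record in records:
        if record["confidence"] >= min_confidence:
            lines.append("%s %s %s ." % (
                record["source"], record["relation"], record["target"]))
    return lines


if __name__ == "__main__":
    for line in simple_view(RICH_RECORDS):
        print(line)
```

The rich records stay available for applications that need them, while the flattened view is what a follow-your-nose Linked Data consumer would see.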

Conclusions
- (Almost) any alignment is better than none
- This is a web of data: without links there’s almost no value
- There is already great linking happening!
- More involvement from this community would certainly help!
  - Alignments themselves
  - Theoretical foundations

Thanks!
Possible participation channels:
- Linked Open Data community and mailing list
- Library Linked Data W3C incubator group and community list

References
[Halpin 2009] Harry Halpin, Pat Hayes. When owl:sameAs isn't the Same: An Analysis of Identity Links on the Semantic Web. LDOW 2009.
[Isaac 2008] Antoine Isaac, Henk Matthezing, Lourens van der Meij, Stefan Schlobach, Shenghui Wang, Claus Zinn. Putting ontology alignment in context: usage scenarios, deployment and evaluation in a library case. ESWC 2008.
[Wang 2009] Shenghui Wang, Antoine Isaac, Balthasar Schopman, Stefan Schlobach, Lourens van der Meij. Matching multi-lingual subject vocabularies. ECDL 2009.