Granularity in Library Linked Open Data. Gordon Dunsire. Keynote presentation to Code4Lib 2013, 12-14 Feb 2013, Chicago, USA.
Fractals: self-similar at all levels of granularity. Cannot determine level: all levels are equal!
Multi-faceted granularity. What is described by a bibliographic record? Or a single statement? What is the level of description? How complete is it? How detailed is the schema used? How dumb? Semantic constraints? Unconstrained? AAA! OWA! Rumsfeld and the white light!
Triple: has granularity?
Subject: This resource. Predicate: has intended audience. Object: Juvenile.
Coarse-grained systems consist of fewer, larger components than fine-grained systems [Wikipedia].
Resource Description Framework – Linked data
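A minimal sketch of the triple on this slide as a data structure. The `Triple` type is a hypothetical helper, not a library class, and the values are the slide's labels rather than real URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement (illustrative helper)."""
    subject: str
    predicate: str
    object: str


# The statement from the slide, using labels in place of URIs (assumption).
statement = Triple("This resource", "has intended audience", "Juvenile")

# Each part of the statement can be read by name or by position.
print(statement.subject, "|", statement.predicate, "|", statement.object)
```

Each of the three parts can be coarser or finer in its own right, which is the question the following slides take in turn.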
Subject: what is the statement about?
[Diagram: granularity scale, coarser to finer: Super-Aggregate, Aggregate, Focus, Component, Sub-Component]
Examples: Journal title, Issue, Article, Journals, Library collection, Paragraph, Word, Consortium collection, Graphics, Subjects, Access, Digital collection, RDF map, Section, Page, Markup, RDF/XML, URI, Node, Journal index, Festschrift, Resource, Work
Predicate: what is the aspect described?
[Diagram: granularity scale, coarser to finer: Super-Aggregate, Aggregate, Focus, Component, Sub-Component]
Examples: Access to resource, Access to content, Suitability rating, Membership category, Audience, Audience of audio-visual material, Audience and usage
Possible Audience map (partial)
Prefixes: isbd: International Standard Bibliographic Description; dct: Dublin Core terms; schema: Schema.org; rda: Resource Description and Access; m21: marc21rdf.info; frbrer: Functional Requirements for Bibliographic Records, entity-relationship model; unc: unconstrained version
[Diagram: properties linked by rdfs:subPropertyOf] unc:“has note on use or audience”; isbd:“has note on use or audience”; unc:“Intended audience”; rda:“Intended audience”; m21:“Target audience”; frbrer:“has intended audience”; dct:“audience”; m21:“Target audience of …”; schema:“audience”
What is the aspect described?
[Diagram: granularity scale, coarser to finer: Super-Aggregate, Aggregate, Focus, Component, Sub-Component]
Examples: Manifestation record, Title and s.o.r, Title statement, Resource record, Title word, First word of title, Title of manifestation
Possible Title semantic map (partial)
Key: sP: rdfs:subPropertyOf; d: rdfs:domain; r: rdfs:range
[Diagram: nodes linked by sP, d, r and eP] dct:“Title”; dc:“Title”; rdfs:“Literal”; rdagrp1:“Title proper (Manifestation)”; rdagrp1:“Title (Manifestation)”; rdafrbr:“Manifestation”; rdaopen:“Title proper”; rdaopen:“Title”; isbd:“has title proper”; isbd:“has title”; isbd:“Resource”
Semantic reasoning: the sub-property ladder
isbd:“has title proper” rdfs:subPropertyOf dct:title
Semantic rule: if property1 is a sub-property of property2, then the data triple Resource property1 “string” implies the data triple Resource property2 “string”.
Data triple: Resource isbd:“has title proper” “Physics”
Machine entailment (dumb-up, from finer to coarser): Resource dct:“has title” “Physics”
[Diagram labels: isbd:“Resource”; coarser, finer]
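The semantic rule on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a full RDFS reasoner: the property identifiers and the `ex:resource` subject are placeholders, and the map holds only the one sub-property link named on the slide.

```python
# Sub-property map: finer property -> coarser property.
# Identifiers are illustrative placeholders, not the real property URIs.
SUB_PROPERTY_OF = {
    "isbd:hasTitleProper": "dct:title",
}


def entail_super_properties(triple, sub_property_of=SUB_PROPERTY_OF):
    """Return the triples implied by climbing the sub-property ladder."""
    subject, prop, value = triple
    entailed = []
    seen = {prop}
    while prop in sub_property_of:
        prop = sub_property_of[prop]
        if prop in seen:  # stop if the map contains a cycle
            break
        seen.add(prop)
        entailed.append((subject, prop, value))
    return entailed


data = ("ex:resource", "isbd:hasTitleProper", "Physics")
for implied in entail_super_properties(data):
    print(implied)
```

The loop follows the ladder one rung at a time, so a longer chain of sub-properties would yield one entailed triple per rung, each coarser than the last.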
Data triples from multiple schemas
ex:1 frbrer:“has intended audience” “Primary school”
ex:2 isbd:“has note on use or audience” “For ages 5-9”
ex:3 rda:“Intended audience (Work)” “For children aged 7-”
ex:4 m21:“Target audience” m21terms:commonaud#j (skos:prefLabel “Juvenile”)
Data triples entailed from sub-property map
ex:1 unc:“has note on use or audience” “Primary school”
ex:2 unc:“has note on use or audience” “For ages 5-9”
ex:3 unc:“has note on use or audience” “For children aged 7-”
ex:4 unc:“has note on use or audience” “Juvenile”
Data triples entailed from property domains
ex:1 “is a” frbrer:“Work”
ex:2 “is a” isbd:“Resource”
ex:3 “is a” rda:“Work”
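Domain entailment can be sketched the same way: if a property declares a domain, any subject that uses the property is inferred to be an instance of that class. The identifiers below are illustrative placeholders for the slide's labels, and leaving the m21 property without a domain is an assumption made to match the three entailed triples shown.

```python
# Property -> declared rdfs:domain (placeholder identifiers, not real URIs).
DOMAIN = {
    "frbrer:hasIntendedAudience": "frbrer:Work",
    "isbd:hasNoteOnUseOrAudience": "isbd:Resource",
    "rda:intendedAudienceWork": "rda:Work",
    # m21:targetAudience is given no domain here (assumption).
}

DATA = [
    ("ex:1", "frbrer:hasIntendedAudience", "Primary school"),
    ("ex:2", "isbd:hasNoteOnUseOrAudience", "For ages 5-9"),
    ("ex:3", "rda:intendedAudienceWork", "For children aged 7-"),
    ("ex:4", "m21:targetAudience", "m21terms:commonaud#j"),
]


def entail_types(triples, domains):
    """Infer an 'is a' (rdf:type) triple for each subject whose property has a domain."""
    return [(s, "rdf:type", domains[p]) for s, p, _ in triples if p in domains]


for inferred in entail_types(DATA, DOMAIN):
    print(inferred)
```

Under these assumptions ex:4 gains no type, which shows how an unconstrained property leaves the subject's granularity unstated.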
What is the aspect described?
[Diagram: granularity scale, coarser to finer: Super-Aggregate, Aggregate, Focus, Component, Sub-Component]
Examples: Creator, Author, Screenwriter, Children’s cartoon screenwriter, Animation screenwriter
Key: s: rdfs:subPropertyOf; d: rdfs:domain; r: rdfs:range
[Diagram: nodes linked by s, d and r; some links marked “?”] rda:“Work”; rdaroles:“Creator”; [rda:“Agent”]; rdaroles:“Author (Work)”; rdaroles:“Screenwriter (Work)”; dct:“Creator”; dc:“Creator”; dct:“Agent”; marcrel:“Author”; marcrel:“Author of screenplay, etc.”; dc:“Contributor”; lcsh:“Screenwriters”
Machine-generated granularity. Full-text indexing: down to word level. A very large multilingual ontology with 5.5 million concepts. A wide-coverage "encyclopedic dictionary". Obtained from the automatic integration of WordNet and Wikipedia. Enriched with automatic translations of its concepts. Connected to the Linguistic Linked Open Data cloud!
User-generated granularity: “OK for my kids (7 and 9)”; “Too childish for me (age 14)”; “Ideal for the child of ambitious parents”; “This sucks – for kids only”; “Great! Has cool stuff”
KISS: Keep it simple, stupid. Keep it simple and stupid? The data model is very simple: triples! The (meta)data content is complex. The Mandelbrot Set: “an example of a complex structure arising from the application of simple rules” - Wikipedia. Resource discovery is complex.
AAA: Anyone can say anything about any thing. Someone will say something about every thing. In every conceivable way. Linguistically.
OWA: Will all the gaps get filled? “There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.” - Donald Rumsfeld. Open World Assumption: the absence of a statement is not a statement of non-existence.
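The difference the Open World Assumption makes can be sketched by asking the same question of the same data under both assumptions. The function names and the single stored triple are hypothetical, chosen only to illustrate the contrast.

```python
from typing import Optional

# A tiny store of known statements (placeholder identifiers).
KNOWN = {("ex:4", "m21:targetAudience", "Juvenile")}


def holds_closed_world(triple, known) -> bool:
    """Closed world: a statement that is not stored is taken to be false."""
    return triple in known


def holds_open_world(triple, known) -> Optional[bool]:
    """Open world: a statement that is not stored is unknown, not false."""
    return True if triple in known else None


question = ("ex:4", "m21:targetAudience", "Adult")
print(holds_closed_world(question, KNOWN))
print(holds_open_world(question, KNOWN))
```

In this sketch the open-world function never returns `False`, because the store holds no negative statements; a gap in the data stays a gap until someone fills it.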