Publishing the British National Bibliography as Linked Open Data

1 Publishing the British National Bibliography as Linked Open Data
Corine Deliot, Metadata Standards Analyst, British Library
Linked Data: what cataloguers need to know
London, 20 February 2015
© The British Library Board 2014

2 Overview
Motivations and approach
The modelling process and the data model
Technical process: from MARC 21 to RDF, linking to external datasets
Outcomes and dissemination
Plans for future developments
Use of the BNB data
Challenges
Benefits

3 Motivations
Publishing our data for others to re-use
Looking beyond library audiences
Taking part in the Linked Data conversation

4 How?
Pragmatic, bottom-up approach
Using existing staff
Building on existing skills
Using existing tools as much as possible
But with training and mentoring from an external provider

5 Why BNB?
General bibliography - not a unique institutional catalogue
Consistent format - over 60 years
Size & range of content - 3 million records on all subjects in many languages
Control of metadata – publishable as CC0.
Image: © Waldir/ Wikimedia Commons/ CC BY-SA-3.0

6 The modelling process (I)
Identify our objects of interest, i.e. what the MARC record says about “things in the world”
e.g. bibliographic resources, people, organizations, places, subjects, etc.
Assign URIs to identify these objects of interest

7 URIs: Things to think about
Create our own URIs or use existing ones? e.g.
Create opaque or transparent URIs? e.g. or
What pattern? URI pattern guidance from the UK Cabinet Office “Designing URI Sets for the UK Public Sector”
Create valid, i.e. syntax conformant URIs

8 URI patterns
http://bnb.data.bl.uk/id/resource/{control-number}
number}/{dewey-number}

9 URI patterns
http://bnb.data.bl.uk/id/resource/008043929
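As a minimal sketch of how the resource pattern on the slides can be filled in, the Python below substitutes a control number into http://bnb.data.bl.uk/id/resource/{control-number}. The function name and the percent-encoding step are assumptions made for illustration; the slides do not say how the British Library generated its URIs.

```python
from urllib.parse import quote

# URI prefix as shown on the slides.
RESOURCE_BASE = "http://bnb.data.bl.uk/id/resource/"


def resource_uri(control_number: str) -> str:
    """Fill the {control-number} slot of the resource URI pattern.

    Hypothetical helper: the name and the validation rules are assumptions.
    """
    cleaned = control_number.strip()
    if not cleaned:
        raise ValueError("control number must not be empty")
    # Percent-encode reserved characters so the result stays syntax conformant.
    return RESOURCE_BASE + quote(cleaned, safe="")


print(resource_uri("008043929"))
# prints http://bnb.data.bl.uk/id/resource/008043929
```

Encoding the identifier rather than trusting it is one way to meet the "valid, i.e. syntax conformant" requirement even when legacy control numbers contain unexpected characters.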

10 The modelling process (II)
Describe these objects of interest, i.e. use classes
and how they relate to each other, i.e. use properties
Use classes and properties from existing RDF vocabularies
Define our own classes and properties when required; documented in the British Library Terms RDF schema

11 RDF Vocabularies
Bibliographic Resource: Dublin Core; Bibliographic Ontology; ISBD; British Library Terms
Event: Event Ontology
Person/Organization: FOAF: Friend of a Friend; Bio: a Vocabulary for Biographical Information; Org: an Organisation Ontology; RDA; MADS/RDF
Place: WGS84 Geo Positioning
Concept: SKOS; British Library Terms
RDF: RDF Schema; OWL

12 The British Library Terms RDF Schema
@prefix blt:< .
Existing property “not quite right” (e.g. not granular enough)
e.g. dcterms:identifier vs blt:bnb

13 The British Library Terms RDF Schema
@prefix blt:< .
Property or class required by specific feature of the model
e.g. blt:publication and blt:PublicationEvent (rdfs:subClassOf event:Event)

14 The British Library Terms RDF Schema
@prefix blt:< .
For pragmatic reasons, e.g. facilitate searching and navigating through the graph
e.g. blt:TopicLCSH and blt:TopicDDC
e.g. blt:hasCreated owl:inverseOf dcterms:creator

15 The BNB data model - Books

16 Data Model Features (I): the Bibliographic Resource

17 Data Model Features (II): Publication as an event
Usual approach
@prefix dc:< .
@prefix dcterms:< .
<BibResource> dc:publisher “Publisher” ;
    dcterms:issued “Date” ;
    ?:placeOfPublication “Place” .
Event-based approach
@prefix blt:< .
@prefix event:< .
<BibResource> blt:publication <PublicationEvent> .
<PublicationEvent> event:place <Place> ;
    event:agent <Publisher> ;
    event:time <Year> .
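The difference between the two approaches can be sketched in Python with plain (subject, predicate, object) tuples. This is an illustration only: the function names are hypothetical, prefixed names are kept as strings because the namespace URIs are not preserved in the transcript, and the rdf:type statement is an assumption based on the blt:PublicationEvent class mentioned on an earlier slide.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def usual_approach(resource: str, publisher: str, date: str, place: str) -> List[Triple]:
    """Publication details attached to the resource as plain literals."""
    return [
        (resource, "dc:publisher", f'"{publisher}"'),
        (resource, "dcterms:issued", f'"{date}"'),
        # The slide leaves the prefix of this property open ("?:").
        (resource, "?:placeOfPublication", f'"{place}"'),
    ]


def event_approach(resource: str, event: str, place: str,
                   publisher: str, year: str) -> List[Triple]:
    """Publication modelled as an event node that links to other resources."""
    return [
        (resource, "blt:publication", event),
        (event, "rdf:type", "blt:PublicationEvent"),  # assumed typing statement
        (event, "event:place", place),
        (event, "event:agent", publisher),
        (event, "event:time", year),
    ]


def to_lines(triples: List[Triple]) -> List[str]:
    """Render each triple as one Turtle-like statement."""
    return [" ".join(t) + " ." for t in triples]


for line in to_lines(event_approach("<BibResource>", "<PublicationEvent>",
                                    "<Place>", "<Publisher>", "<Year>")):
    print(line)
```

In the usual approach every object is a literal string, so nothing can be linked further; in the event-based approach place, publisher and year are resources in their own right and can carry links to external datasets.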

18 Data model features (III)
Birth and death are modelled as biographical events
Extensive use of foaf:focus to relate “things in the world” (e.g. people, organizations, places) to their SKOS concepts.
e.g. “London”, the capital of England and the UK, as a single “thing in the world” may be the “focus” of multiple concepts belonging to different concept schemes, e.g. thesauri (LCSH, Rameau, etc.)
<Thing-as-Concept> foaf:focus <Thing in the World> .
by Pete Johnston
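To make the foaf:focus pattern concrete, here is a small Python sketch in which two concepts from different schemes point at the same "thing in the world". All identifiers in angle brackets are made-up placeholders, not real BNB, LCSH or Rameau URIs, and the helper name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Placeholder identifiers for illustration only.
focus_triples: List[Triple] = [
    ("<lcsh-concept-London>", "foaf:focus", "<place-London>"),
    ("<rameau-concept-Londres>", "foaf:focus", "<place-London>"),
    ("<lcsh-concept-Paris>", "foaf:focus", "<place-Paris>"),
]


def concepts_by_thing(triples: List[Triple]) -> Dict[str, List[str]]:
    """Group concepts by the real-world thing they have as their focus."""
    index: Dict[str, List[str]] = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == "foaf:focus":
            index[obj].append(subject)
    return dict(index)


print(concepts_by_thing(focus_triples)["<place-London>"])
# prints ['<lcsh-concept-London>', '<rameau-concept-Londres>']
```

Keeping the place separate from the concepts means statements about London itself (coordinates, for instance) attach to one node, while each thesaurus keeps its own concept.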

19 MARC to RDF Conversion Workflow
Process
Selection
Character set conversion
Pre-processing
URI generation
Data transformation
Create & load triples
Produce VoID descriptions
Tools
Catalogue Bridge Utilities MARC Global/MARC Report Jena Eyeball

20 Linking to external sources (I)
To give our data broader context we linked to:
General resources: GeoNames, Lexvo, ISNI, RDF Book Mashup
Library resources: LCSH, VIAF, Dewey.info, MARC language and country codes

21 Linking to external sources (II)
Techniques included:
Automatic generation from record data
Auto text match with linked data dumps
Crosswalk matching for coded data
Image: © Silverspoon/ Wikimedia Commons/ CC BY-SA-3.0
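Two of the techniques listed above, crosswalk matching for coded data and text matching against a data dump, might look roughly like the Python sketch below. It is not the British Library's implementation: the crosswalk table is a three-entry sample, the target namespace is an example.org placeholder, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Sample crosswalk from MARC language codes to ISO 639-3 codes (assumed table).
MARC_TO_ISO639_3: Dict[str, str] = {"eng": "eng", "fre": "fra", "ger": "deu"}

# Placeholder namespace standing in for an external dataset.
TARGET_BASE = "http://example.org/language/"


def link_language(marc_code: str) -> Optional[str]:
    """Crosswalk matching: look the code up and build a link, or return None."""
    iso_code = MARC_TO_ISO639_3.get(marc_code.strip().lower())
    return TARGET_BASE + iso_code if iso_code else None


def normalise(label: str) -> str:
    """Case-fold and collapse whitespace so near-identical labels compare equal."""
    return " ".join(label.casefold().split())


def match_label(label: str, dump_labels: Dict[str, str]) -> Optional[str]:
    """Text matching: find a dump entry whose normalised label equals ours."""
    index = {normalise(key): uri for key, uri in dump_labels.items()}
    return index.get(normalise(label))


print(link_language("fre"))
# prints http://example.org/language/fra
```

Returning None for unmatched values keeps the conversion from inventing links; unmatched codes and labels can then be reviewed separately.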

22 Outcomes
Two datasets – Books and Serials - and their VoID descriptions, accessible at:
BNB Linked data platform:
SPARQL endpoint:
SPARQL editor:
Bulk downloads:
Updated monthly
Serializations available: RDF/XML, N-Triples
Image: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.

24 Platform change
2011 - initial Talis platform
2013 – data migration to TSO platform
Tendering process
Migration of data and services over a couple of months

28 Dissemination
British Library Terms RDF schema declared in LOV (Linked Open Vocabularies)
Linked Open BNB on data.gov.uk
5* Openness rating
included in the National Information Infrastructure bibliography
Open Data Institute certification
Pilot level
92% Data Quality indicator as part of Heritage & Culture Challenge evaluation

29 Plans for Future Developments
Refine and extend the model
Investigate FRBR-ization
Link to other external sources, e.g. DBpedia/Wikidata
Collaborate with other national libraries
Expand scope beyond current BNB, e.g. the British Catalogue of Music.
Improve developer support

30 Use of the BNB data
Statistics
e.g. Number of hits on the SPARQL endpoint
e.g. Number of downloads on the BL webpage
e.g. Web logs analysis reports
BNB data used in pilot projects
e.g. Linked Open BNB data used as test data for a semantic search demonstrator.
e.g. data provided to Microsoft to assist in their research into linking structured data.
BNB data used in tutorials
e.g. - Owen Stephens

31 Use of the BNB data
Anecdotal evidence
However, use is quite difficult to assess; this is part and parcel of the data being open and available for all to use

32 Challenges
Converting MARC data into RDF!
Publication event approach: transforming transcribed text into data
URI creation from string may result in duplication
Changes over time may also produce duplication.
Legacy data issues
e.g. inconsistency of the data
e.g. cataloguers using inadequate input tools for diacritics
This was (relatively) new, nobody had all the answers
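Why URI creation from strings can produce duplicates is easy to show with a short Python sketch. The slug rule and the example.org namespace below are assumptions for illustration, and the publisher names are sample strings rather than actual BNB data.

```python
import re
import unicodedata

# Placeholder namespace, not the pattern used by the British Library.
ORG_BASE = "http://example.org/id/organization/"


def slug(name: str) -> str:
    """Turn a transcribed name into a URI-safe token (assumed rule)."""
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace every run of other characters with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def mint_uri(name: str) -> str:
    return ORG_BASE + slug(name)


# Three transcriptions that a human would read as the same publisher.
variants = ["Penguin", "Penguin Books", "Penguin Books Ltd."]
uris = {mint_uri(v) for v in variants}
print(len(uris))
# prints 3
```

Normalisation helps with some inconsistencies (here "Éditions Gallimard" and "Editions Gallimard" yield the same slug), but it cannot tell that differently worded transcriptions, or names that changed over time, refer to one entity.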

33 Benefits of Linked Open Data
We have learnt a lot about the practical aspects of working with linked data.
The data model is influencing other implementations
Re-used by Danish Bibliography Centre
LOD raised the profile of Collection Metadata internally and the Library’s profile externally
LOD helped us focus our legacy data enhancement activities

34 For further information
Thank you. Questions?
