Unleashing UNIMARC to the Semantic Web: UNIMARC in RDF Gordon Dunsire, UK & Mirna Willer, Croatia UNIMARC Workshop, Biblioteca Nacional de Portugal Lisbon,

Presentation transcript:

Unleashing UNIMARC to the Semantic Web: UNIMARC in RDF Gordon Dunsire, UK & Mirna Willer, Croatia UNIMARC Workshop, Biblioteca Nacional de Portugal Lisbon, 6 April 2016

Overview
Based on presentation to IFLA 2015, with latest developments
Introduction to linked data and UNIMARC
UNIMARC vocabularies
Future research and plans
UNIMARC in RDF: Workshop, Lisbon, 6 April

Introduction to linked data and UNIMARC

Background
Representation of IFLA standards for use in the Semantic Web
Work of the FRBR Namespaces project and IFLA Namespaces Task Group
Work of the ISBD/XML Study Group
Included a feasibility study of representation of UNIMARC
Representations allow legacy catalogue records to be published as linked data using RDF
Branding IFLA standards for authority & trust
Semantic Web lets “Anyone say Anything about Any resource”

Linked data and RDF
Resource Description Framework (RDF)
Designed for machine-processing of metadata at global scale (Semantic Web)
24/7/365
Trillions of operations per second
Everything must be disambiguated
Machines are dumb
A simple approach helps!
Machine-readable identifiers

RDF triple
Metadata expressed as “atomic” statements
A simple, single, irreducible statement
The title of this book is “Cataloguing is fun!”
Constructed in 3 parts: a “triple”
Subject of the statement = Subject: This book
Nature of the statement = Predicate: has title
Value of the statement = Object: “Cataloguing is fun!”
This book – has title – “Cataloguing is fun!”
subject – predicate – object
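
As a minimal sketch of the three-part structure described on this slide, the example statement can be held as a small record in Python. The class name `Triple` and its field names are illustrative assumptions; they are not part of RDF or of the presentation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic statement: subject, predicate, object (hypothetical helper)."""
    subject: str
    predicate: str
    obj: str


# The example statement from the slide, split into its three parts.
statement = Triple(
    subject="This book",
    predicate="has title",
    obj="Cataloguing is fun!",
)

print(" - ".join(statement))
```

The record cannot be reduced further without losing the statement, which is what makes it "atomic".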

Machine-readable identifiers
Uniform Resource Identifier (URI)
Can be any unique combination of numbers and letters
No intrinsic meaning; it’s just an identifier
RDF requires the subject and predicate of a triple to be URIs
Object can be a URI, or a literal string (“Cataloguing is fun!”)
URIs can be matched by machine to link triples together
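
The linking step can be sketched in a few lines of Python: two triples are connected when the object URI of one is exactly the subject URI of another. Every `example.org` URI and the author name below are hypothetical, invented only for this illustration.

```python
# Hypothetical URIs, invented for illustration only.
BOOK = "http://example.org/resource/book1"
AUTHOR = "http://example.org/resource/person1"
HAS_TITLE = "http://example.org/property/hasTitle"
HAS_AUTHOR = "http://example.org/property/hasAuthor"
HAS_NAME = "http://example.org/property/hasName"

triples = [
    (BOOK, HAS_TITLE, "Cataloguing is fun!"),  # object is a literal string
    (BOOK, HAS_AUTHOR, AUTHOR),                # object is a URI
    (AUTHOR, HAS_NAME, "A. N. Author"),        # hypothetical literal
]


def statements_about(subject, data):
    """Return every (predicate, object) pair whose subject matches the URI exactly."""
    return [(p, o) for s, p, o in data if s == subject]


def linked_statements(subject, data):
    """Follow each object that is also used as a subject elsewhere in the data."""
    subjects = {s for s, _, _ in data}
    return {
        o: statements_about(o, data)
        for _, o in statements_about(subject, data)
        if o in subjects
    }


print(linked_statements(BOOK, triples))
```

The match is a plain string comparison, which is why the identifier needs no intrinsic meaning.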

Vocabularies, values and element sets
Controlled terminology represented as RDF “value” vocabulary
Entities, attributes, and relationships represented as RDF “element set” vocabulary
Attributes and relationships represented as RDF properties (“predicates”)
Entities represented in RDF as classes
UNIMARC-B has only 1 entity: Resource
ISBD already has an equivalent class for Resource

Element sets
“Bibliographic” format has same focus as International Standard Bibliographic Description (ISBD)
The entity [bibliographic] Resource ~ FRBR Manifestation
Attributes => RDF properties
RDF properties require URIs
IFLA/UNIMARC URL domain + local unique UNIMARC part
Lossless data requires finest level of granularity
Important for UNIMARC qualified coded subfield

UNIMARC element and concept identifiers
Element: National bibliography number (unique in element set)
tag: 020   1st ind.: (blank)   2nd ind.: (blank)   subfield: b
U020__b: unique in local namespace
iflastandards.info/ns/unimarc/unimarcb/elements/0XX/U020__b: unique in global namespace
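
The identifier pattern on this slide can be sketched as a small Python helper. The function names are hypothetical. Two details are assumptions rather than facts from the slides: the `http://` scheme (the slide shows the address without one) and the rule that the `0XX` path segment is the first digit of the tag followed by `XX` for every tag block.

```python
def element_id(tag, ind1, ind2, subfield, positions=""):
    """Build the local identifier: 'U' + tag + indicators + subfield (+ positions)."""
    def norm(indicator):
        # A blank indicator (written '#', a space, or left empty) becomes '_'.
        return "_" if indicator in ("", " ", "#") else indicator

    return f"U{tag}{norm(ind1)}{norm(ind2)}{subfield}{positions}"


def element_uri(tag, ind1, ind2, subfield, positions=""):
    """Prefix the local identifier with the namespace shown on the slide.

    Assumptions: the 'http://' scheme, and that the block segment is the
    tag's first digit followed by 'XX'.
    """
    base = "http://iflastandards.info/ns/unimarc/unimarcb/elements"
    local = element_id(tag, ind1, ind2, subfield, positions)
    return f"{base}/{tag[0]}XX/{local}"


print(element_id("020", "#", "#", "b"))
print(element_uri("020", "#", "#", "b"))
```

The same helper reproduces other identifiers that appear in the slides, such as `U2001_f` and `U100__a17-19`.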


UNIMARC element and concept identifiers
Element: Target audience code …
tag: 100   1st ind.: (blank)   2nd ind.: (blank)   subfield: a   pos: 17-19
U100__a17-19
Concept: adult, general
code: m
tac#m: unique in value vocabulary
iflastandards.info/ns/unimarc/terms/tac#m
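
A concept identifier follows the same idea as an element identifier: a vocabulary namespace plus the code. The sketch below is illustrative only. The helper names are hypothetical, the `http://` scheme is an assumption, and the lookup table holds only the single code and term shown on the slide.

```python
# Namespace as shown on the slide; the 'http://' scheme is an assumption.
TAC_NAMESPACE = "http://iflastandards.info/ns/unimarc/terms/tac"

# Only the code and term that appear on the slide.
TARGET_AUDIENCE = {"m": "adult, general"}


def concept_uri(code, namespace=TAC_NAMESPACE):
    """Join the value vocabulary namespace and a code with '#'."""
    return f"{namespace}#{code}"


def describe(code):
    """Pair the machine identifier with its notation and human-readable term."""
    return {
        "uri": concept_uri(code),
        "notation": code,
        "label": TARGET_AUDIENCE[code],
    }


print(describe("m"))
```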


200 1#$aBibliographica belgica $fCommission belge de bibliographie $f= Belgische Commissie voor bibliografie
“= ” : Parallel
U2001_f : First Statement of Responsibility
??? : Parallel First Statement of Responsibility
Exception! Semantic data embedded in content
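
The exception can be made concrete with a short Python sketch: because the "parallel" meaning is carried by a leading "=" inside the subfield content, a converter has to inspect the value itself and cannot rely on the tag, indicators, and subfield code alone. The function names are hypothetical, and no property identifier is assigned to the parallel statement because the slide leaves it open.

```python
import re

# Field 200 content from the slide, without tag and indicators.
FIELD_200 = (
    "$aBibliographica belgica $fCommission belge de bibliographie "
    "$f= Belgische Commissie voor bibliografie"
)


def split_subfields(data):
    """Split '$a... $f...' content into (code, value) pairs."""
    pairs = re.findall(r"\$(\w)([^$]*)", data)
    return [(code, value.strip()) for code, value in pairs]


def classify(value):
    """Flag a value as parallel when it starts with '=' (the embedded marker)."""
    if value.startswith("="):
        return True, value[1:].strip()
    return False, value


for code, value in split_subfields(FIELD_200):
    is_parallel, text = classify(value)
    print(code, "parallel" if is_parallel else "plain", text)
```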

Translations
The same identifier is used for translated elements (captions, definitions, etc.) and vocabularies (preferred terms, definitions, etc.)
E.g. Frequency of continuing resources code.

IFLA linked data vocabularies


UNIMARC vocabularies


Value vocabularies
“thesauri, code lists, term lists, classification schemes, subject heading lists, …” (W3C Library Linked Data Incubator Group)
Often represented in RDF using Simple Knowledge Organization System (SKOS)

Value vocabularies
Coded information stored in tag block 1xx
Code lists specify notation, term, description, and scope
Represented as RDF/SKOS vocabularies
Italian and Portuguese translations – multilingual environment
Interoperability with vocabularies of other schema
50 published so far
For example: Target audience

metadataregistry.org/concept/list/vocabulary_id/322.html

Target audience code
Subfield a, character positions 17-19, of tag 100 General processing data
“applicable to records of materials in any media”
U100__a17-19
3 instances of one-character code: U100__a17, U100__a18, U100__a19
Order of position carries no significance in UNIMARC format
But content rules may assign significance

Maps within element sets
U100__a17, U100__a18, U100__a19: sub-property of U100__a17-19
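
A sub-property map of this kind can be sketched in Python: data is recorded with the position-specific property, and a statement using the broader property can then be derived from the map. The function names and the `ex:r1` resource are hypothetical, and skipping blank positions is an assumption made for this example.

```python
# Sub-property map from the slide: each position property refines the range property.
SUB_PROPERTY_OF = {
    "U100__a17": "U100__a17-19",
    "U100__a18": "U100__a17-19",
    "U100__a19": "U100__a17-19",
}


def position_triples(resource, codes):
    """One triple per non-blank code in positions 17-19 (skipping blanks is an assumption)."""
    props = ["U100__a17", "U100__a18", "U100__a19"]
    return [(resource, prop, code) for prop, code in zip(props, codes) if code.strip()]


def generalise(triples):
    """Restate each triple with its super-property, as the sub-property map allows."""
    return [(s, SUB_PROPERTY_OF[p], o) for s, p, o in triples if p in SUB_PROPERTY_OF]


fine = position_triples("ex:r1", "mk ")
print(fine)
print(generalise(fine))
```

The fine-grained triples keep the position, so an application whose content rules give the order significance can still use it; the generalised triples discard it.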

Maps between vocabularies
Map of “Audience”
Element sets (schema), related by rdfs:subPropertyOf: m21: “Target audience of …”; m21: “Target audience”; frbrer: “has intended audience”; schema: “audience”; dct: “audience”; rdau: “Intended audience”; rdaw: “Intended audience”; isbd: “has note on use or audience”; isbdu: “has note on use or audience”
Unconstrained versions
Value vocabularies (KOS), broader/narrower/same?: umarc: m “adult, general”; umarc: k “adult, serious”; pbcore: adult “adult”; m21: e “adult”; MPAA: NC-17?; BBFC: 18?

Publishing UNIMARC data in RDF
110 (CODED DATA FIELD: CONTINUING RESOURCES) $a (Continuing Resource Coded Data)
110 ##$acaa…
Attribute            Character position   Value   Notes
Type designator      0                    c       newspaper
Frequency of issue   1                    a       daily
Regularity           2                    a       regular
RDF linked data

Syntactic parsing
String: 110 ##$acaa…
RDF properties: U110__a00, U110__a01, U110__a02
RDF objects: continuingtype#c, continuingfreq#a, continuingreg#a
RDF data triples: … Myspace:Resource23 unimarcb:U110__a01 ufreq:a. …
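
The parsing step can be sketched in Python: each character position of 110 $a selects one RDF property and one value vocabulary, and the character found there becomes the code within that vocabulary. The function name is hypothetical, and the objects are written in the `vocabulary#code` form rather than the prefixed form (`ufreq:a`) used in the slide's sample triple.

```python
# (character position, RDF property, value vocabulary) as listed on the slides.
POSITIONS_110A = [
    (0, "unimarcb:U110__a00", "continuingtype"),
    (1, "unimarcb:U110__a01", "continuingfreq"),
    (2, "unimarcb:U110__a02", "continuingreg"),
]


def parse_110a(resource, value):
    """Turn the first three characters of 110 $a into one triple each."""
    return [
        (resource, prop, f"{vocab}#{value[pos]}")
        for pos, prop, vocab in POSITIONS_110A
    ]


for triple in parse_110a("Myspace:Resource23", "caa"):
    print(triple)
```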

Semantic graph
resource: 123 – unimarcb:U110__a00 – type: c
resource: 123 – unimarcb:U110__a01 – freq: a
resource: 123 – unimarcb:U110__a02 – reg: a
skos:notation “a”; skos:prefLabel
Frequency map for Dublin Core, MARC 21, and RDA

Future research and plans

Level 0: the finest level of granularity
Subfield qualified by indicators
Subfield: “A defined unit of information within a field. See also Data Element”
Data element: “The smallest unit of information that is explicitly identified”
Field: “A defined character string, identified by a tag, which contains one or more subfields”
Coarser level of granularity (Level 1+) with structure of combinations of Level 0 elements
Indicator qualification is at field level, and redundant for Level 0 elements that are not in scope.

tag: 210   tagCap: PUBLICATION, DISTRIBUTION, ETC.
sub: a   subCap: Place of Publication, Distribution, etc.
definition: The town or other locality where the item is published or distributed or, in the case of a manuscript, written.
ind1   ind1Cap                                         ind2   ind2Cap
#      Not applicable / Earliest available publisher   #      Produced in multiple copies, usually published or publically distributed
0      Intervening publisher                           #      Produced in multiple copies, usually published or publically distributed
1      Current or latest publisher                     #      Produced in multiple copies, usually published or publically distributed
#      Not applicable / Earliest available publisher   1      Not published or publically distributed
0      Intervening publisher                           1      Not published or publically distributed
1      Current or latest publisher                     1      Not published or publically distributed
URI: U21011a
Label: Place of publication … in Publication, distribution, etc. (Current or latest publisher) (Not published …)

U21011a: Place of publication … in Publication, distribution, etc. (Current or latest publisher) (Not published …)
U210_1a: Place of publication … in Publication, distribution, etc. (Not applicable …) (Not published …)
U21001a: Place of publication … in Publication, distribution, etc. (Intervening publisher) (Not published …)
U2101_a: Place of publication … in Publication, distribution, etc. (Current or latest publisher) (Produced in multiple copies …)
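
The Level 0 elements for a subfield are the combinations of the indicator values defined for its field. As a rough sketch, the six identifiers and labels for 210 $a can be generated in Python; the function name is hypothetical, and the indicator captions are abbreviated in the same way as the labels on the slides.

```python
from itertools import product

# Indicator values for tag 210, with '_' standing for a blank indicator.
IND1 = {
    "_": "Not applicable …",
    "0": "Intervening publisher",
    "1": "Current or latest publisher",
}
IND2 = {
    "_": "Produced in multiple copies …",
    "1": "Not published …",
}


def level0_elements(tag="210", subfield="a"):
    """Return {identifier: label} for every indicator combination."""
    stem = "Place of publication … in Publication, distribution, etc."
    return {
        f"U{tag}{i1}{i2}{subfield}": f"{stem} ({c1}) ({c2})"
        for (i1, c1), (i2, c2) in product(IND1.items(), IND2.items())
    }


for identifier, label in level0_elements().items():
    print(identifier, label)
```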

Place … u:2100_a
Place … u:2101_a
Place … u:210XXa
Place … u:210__a
Place … u:210a
Publication … u:210
Relationships shown: “is sub-property of”, “is aggregated by”

Place 1 Publication … Statement 1 Place 2 Place 3 Place 4 Publication … Statement 2

Representing UNIMARC authorities in RDF

Representing UNIMARC authorities in RDF: use of parallel vocabularies

Representing UNIMARC authorities in RDF: authorised and variant forms of a name

Mappings
UNIMARC tags and subfields have corresponding ISBD “elements”
Now out-of-date after publication of ISBD consolidated edition
Category of alignment relationship to be determined
Equivalent or broader/narrower
To be used as basis for sub-property mappings
Mappings from UNIMARC to other vocabularies being developed

UNIMARC and ISBD properties
Element identifier/URI: unimarcb:U205__b
Label (English): (has) issue statement
Equivalent ISBD URI: isbd:P1011
Label (English): has additional edition statement
The meaning is the same, but the identifiers and labels are different
unimarcb:U205__b same as isbd:P1011 (in RDF)
Or use isbd:P1011 instead of unimarcb:U205__b
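
The second option on this slide, using the ISBD property in place of the UNIMARC one, amounts to a lookup and substitution. A minimal Python sketch follows; the function name, the `ex:resource1` subject, and the literal value are hypothetical.

```python
# Equivalence stated on the slide.
SAME_AS = {"unimarcb:U205__b": "isbd:P1011"}


def use_equivalent(triples, mapping=SAME_AS):
    """Replace each predicate that has an equivalent; leave the others unchanged."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


# Hypothetical data triple using the UNIMARC property.
data = [("ex:resource1", "unimarcb:U205__b", "hypothetical statement text")]

print(use_equivalent(data))
```

Substitution is only safe because the two properties are stated to have the same meaning; a broader or narrower alignment would support a one-way sub-property map instead.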

UNIMARC Alignment with ISBD
UNIMARC property   Label          A      ISBD property   Label
U200__a            Title proper   = <>   P1004           has title proper
                                         P1117           has title of individual work by same author
                                         P1137           has common title of title proper
Alignment is equal, broader, and narrower!

UNIMARC and MARC21 (BIBFRAME)
UNIMARC Level 0 approach is based on publication of MARC21 element sets in the Open Metadata Registry
BIBFRAME has a coarser granularity, but is extensible
Sub-properties and sub-classes can be added to refine the semantics
BF is lossy at current levels of granularity
UNIMARC separates content (values) from structure (encoding) in most cases
“= Parallel” is an exception
BF model is based on data in legacy records
Extensive “archaeology” required to trace semantics and syntax.

Granularity
Intellectual value of UNIMARC is preserved by a finest-grained semantic representation
Data can always be dumbed-down to the level of coarseness required by applications
Processed with shared open maps
Including schema.org and dct! And BIBFRAME too …
Data should be published without loss
For semantically rich applications
Universal Bibliographic Control ~ Semantic Web

Thank you!

References
Dunsire, Gordon; Mirna Willer. UNIMARC and Linked Data. // IFLA Journal 37, 4 (December 2011), _2011.pdf
Dunsire, G. Using the sub-property ladder, [blog] 2012, property-ladder/
Hillmann, D., G. Dunsire, J. Phipps. Maps and Gaps: Strategies for Vocabulary Design and Development. In Proc. Int’l Conf. on Dublin Core and Metadata Applications 2013, 82-89, /paper/view/185/80
Willer, M., G. Dunsire. Bibliographic information organization in the Semantic Web. Oxford: Chandos,

Note
This presentation is an updated version of the workshop held at IFLA 2015, Cape Town, Session 105 under the title “UNIMARC in RDF: Representation of UNIMARC Bibliographic Format in Resource Description Framework for Linked Data”.