DSD Distributed Systems Division MTA SZTAKI Automatic Conversion from MARC to FRBR Christian Mönch (MTA SZTAKI) Trond Aalberg (NTNU)



Distributed Systems Division (DSD), MTA SZTAKI
ECDL, Trondheim, Norway

Outline
- Bibliographic catalogs
- FRBR model
- A framework for extracting FRBR entities from MARC-based catalogs
- Application to the BIBSYS catalog
- Results

Record-Based Bibliographic Catalogs
Structure:
- Set of records (search, exchange)
- Record: surrogate for a publication; a set of attributes (name-value pairs)
Problems:
- Non-normalized structure with excessive data replication
- Many search requests are unsupported or require knowledge of the bibliographic format

IFLA's FRBR Model
- ER model, three groups of entities
- Four operations on entities: search, identify, select, obtain
[Diagram: a Work is realized through an Expression, which is embodied in a Manifestation, which is exemplified by an Item. Other labels shown: Translation, Adaptation, Whole/Part, Person, Corporate Body.]

Availability?
- Highly structured model that supports:
  - A multitude of search operations
  - Navigation of bibliographic records
- Expensive to create
- Re-cataloging unaffordable
- Automatic conversion

Automatic Creation of FRBR Instances
[Diagram labels: Records; Item; Manifestation; Splitting; Work Clustering (record sets Work 1, Work 2); Expression Clustering (record sets Expression 1, Expression 2); Work "is realized through" Expression.]
Steps:
- Extract manifestations and items from records
- Identify and split aggregative records
- Cluster record set to identify works
- Cluster work sets to identify expressions
- Create entities from the clusters
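The two clustering steps above can be illustrated with a minimal sketch. It assumes that every record has already been reduced to a flat set of attributes, and that works are keyed by creator plus original title and expressions by title plus language. Those keys, the `language` attribute, and the sample records are illustrative assumptions, not the authors' implementation; the splitting of aggregative records is left out.

```python
from collections import defaultdict

# Hypothetical, already-extracted attributes; not real BIBSYS data.
records = [
    {"id": "r1", "creator": "Ibsen, Henrik", "original_title": "Et dukkehjem",
     "title": "Et dukkehjem", "language": "nor"},
    {"id": "r2", "creator": "Ibsen, Henrik", "original_title": "Et dukkehjem",
     "title": "A Doll's House", "language": "eng"},
    {"id": "r3", "creator": "Ibsen, Henrik", "original_title": "Et dukkehjem",
     "title": "A Doll's House", "language": "eng"},
    {"id": "r4", "creator": "Ibsen, Henrik", "original_title": "Peer Gynt",
     "title": "Peer Gynt", "language": "nor"},
]


def work_key(rec):
    # Assumed work identity: same creator and same original title.
    return (rec["creator"].casefold(), rec["original_title"].casefold())


def expression_key(rec):
    # Assumed expression identity within a work: same title and language.
    return (rec["title"].casefold(), rec["language"])


def cluster(recs, key):
    """Group records whose key values are exactly equal."""
    groups = defaultdict(list)
    for rec in recs:
        groups[key(rec)].append(rec)
    return dict(groups)


def frbrize(recs):
    """Return {work key: {expression key: [record ids, one per manifestation]}}."""
    return {
        wk: {ek: [r["id"] for r in members]
             for ek, members in cluster(work_recs, expression_key).items()}
        for wk, work_recs in cluster(recs, work_key).items()
    }


tree = frbrize(records)
```

Exact-match keys like these are brittle: any spelling variant in a creator or title produces a separate cluster.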

Obstacles to the Automatic Creation of FRBR Model Instances
Inconsistency of data in catalogs:
- Identical information is represented differently in different records (attributes, syntaxes)
- Erroneous data
- Might be resolved automatically, for example through authority files
Incompleteness of data in catalogs:
- Information necessary for clustering has not been captured in the records
- Requires additional information linked to individual records
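As a rough illustration of how an authority file might resolve inconsistent name forms, the sketch below maps variant spellings to one authorized heading. The `AUTHORITY` table, its entries, and the `normalize_name` helper are hypothetical; a real authority file would be far larger and would be loaded from the catalog system.

```python
# Hypothetical authority file: variant name forms mapped to one authorized heading.
AUTHORITY = {
    "ibsen, henrik": "Ibsen, Henrik",
    "ibsen, h.": "Ibsen, Henrik",
    "henrik ibsen": "Ibsen, Henrik",
}


def normalize_name(raw, authority=AUTHORITY):
    """Return the authorized heading for a raw name, or None if it is unknown."""
    # Collapse runs of whitespace and ignore case before the lookup.
    key = " ".join(raw.split()).casefold()
    return authority.get(key)
```

Returning `None` for unknown names keeps unresolved values visible instead of guessing a heading for them.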

The Attribute Layer
[Diagram labels: Attribute Layer; Work Clustering (record sets Work 1, Work 2); Expression Clustering (record sets Expression 1, Expression 2).]
- Extract consistent and error-free FRBR-related Generic Attributes and Properties from the records, e.g. title, creator, isTranslation
- Specific to bibliographic formats and catalogs

The Attribute Layer for BIBSYS (I)
Classify records: series, monographs
Monographs may have each of the following characteristics:
- Linked
- Aggregative
Example for retrieval of Generic Attributes from monograph records:
Attribute title:
- Searched in: 130$a, 740$a, 240$a (if 240$l does not exist), and 245$a
- Extended to referenced records

The Attribute Layer for BIBSYS (II)
Attribute original title:
- Searched in: 241$a, 240$a (if 240$l does exist), and 500$a (if it starts with one of the indicators "originaltittler:" or "orig.titt.:")
- Extended to referenced records
Attribute creator:
- Searched in: 100$a and 110$a
- Extended to referenced records
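The field rules for title, original title, and creator can be sketched as follows. The record layout (a dict from field tag to a list of subfield dicts) and all function names are assumptions made for illustration. The sketch collects every matching value in the order the slides list the fields, strips the indicator prefix from 500$a notes, and does not follow links to referenced records; the slides do not say how the original system handled any of these points.

```python
# Indicator strings copied as they appear on the slide.
ORIGINAL_TITLE_PREFIXES = ("originaltittler:", "orig.titt.:")


def subfields(record, tag, code):
    """Yield every value of subfield `code` in fields carrying `tag`."""
    for field in record.get(tag, []):
        if code in field:
            yield field[code]


def titles(record):
    found = list(subfields(record, "130", "a"))
    found += subfields(record, "740", "a")
    # 240$a counts as a title only when 240$l is absent.
    found += [f["a"] for f in record.get("240", []) if "a" in f and "l" not in f]
    found += subfields(record, "245", "a")
    return found


def original_titles(record):
    found = list(subfields(record, "241", "a"))
    # 240$a counts as an original title only when 240$l is present.
    found += [f["a"] for f in record.get("240", []) if "a" in f and "l" in f]
    for note in subfields(record, "500", "a"):
        for prefix in ORIGINAL_TITLE_PREFIXES:
            # Case-insensitive prefix match is an assumption.
            if note[:len(prefix)].casefold() == prefix:
                found.append(note[len(prefix):].strip())
                break
    return found


def creators(record):
    return list(subfields(record, "100", "a")) + list(subfields(record, "110", "a"))


# Hypothetical monograph record, not taken from BIBSYS.
record = {
    "100": [{"a": "Ibsen, Henrik"}],
    "240": [{"a": "Et dukkehjem", "l": "Engelsk"}],
    "245": [{"a": "A doll's house"}],
}
```

For this sample record, `titles(record)` yields only the 245$a value, because the 240 field carries a `$l` subfield and therefore counts as an original title instead.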

Application of the Framework to BIBSYS
Tested on 4379 records related to Henrik Ibsen:
- Works: 41, of which eight were false positives due to different spellings or spelling errors
- Expressions: 1111
- Manifestations: 1072, of which 35 contained more than one expression
But: 3307 records were ignored, because reliable retrieval of Generic Attributes was impossible
- Unreliable: 580 works, 3706 expressions, 3567 manifestations
Not convincing!

Ongoing Work
- Fault-tolerant dissimilarity measure for the clustering process
- Use of authority files to disambiguate values
- Leverage information retrieved from high-quality records for incomplete records, thus making incompleteness a property of the whole catalog and not of single records
- Apply to a record subset of BIBSYS
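The slides do not say which dissimilarity measure is planned. As one possible illustration, the sketch below derives a dissimilarity from the similarity ratio of Python's standard `difflib.SequenceMatcher`, so that small spelling differences give a value near 0. The normalization step and the 0.2 threshold are arbitrary assumptions.

```python
from difflib import SequenceMatcher


def normalize(text):
    # Ignore case and collapse whitespace before comparing.
    return " ".join(text.split()).casefold()


def dissimilarity(a, b):
    """Return a value in [0, 1]: 0.0 for identical strings, near 1.0 for unrelated ones."""
    return 1.0 - SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def probably_same(a, b, threshold=0.2):
    # The threshold is a hypothetical tuning parameter, not a value from the talk.
    return dissimilarity(a, b) <= threshold
```

A measure of this kind could replace exact key comparison in the clustering step, at the cost of pairwise comparisons and a threshold that has to be tuned against known false positives.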

Questions?