OCLC Online Computer Library Center Two Paths to Interoperable Metadata Jean Godby, Devon Smith, Eric Childress DC-2003 September 29, 2003.



Outline of this talk
– The Metadata Switch project
– The Metadata Schema Transformations project
– An experiment with XML and XSLT
– A new system design
– Open issues
– Project status

The Metadata Switch project
An umbrella activity for a set of OCLC Research projects that construct modular services for adding value to metadata. Some examples:
– Harvesting
– Fusion of metadata from different sources
– Name authorities

Metadata Schema Transformations: Project goals
A robust design for metadata translation
– Clean separation of: document data model schema translations machinery
– Support for current practice and for foreseeable innovation
A metadata translation system/toolkit
– Self-contained Web service for metadata translation
– A place for human input (intellectual mappings) in an automated system

High-level system design
[Diagram labels: Metadata schema translator; Web services layer; Crosswalk repository client; Record translation client; A record; A metadata crosswalk; A transformed record]

XML and XSLT
XML (Extensible Markup Language)
– A markup language for structured documents.
– Like SGML, but designed for use in the Web environment.
– Like HTML, but display and structure markup are distinct.
– A World Wide Web Consortium (W3C) standard; many supporting tools.
XSLT (Extensible Stylesheet Language Transformations)
– A tree-oriented language for transforming XML documents.
– A W3C recommendation; newer than XML.

Our working client

A test case
The records:
– Data streams from the Colorado Digitization Project.
– Minimal Dublin Core XML records that describe photographs.
The process:
– OCLC Research uses an XSLT script to convert DC simple records to MARC XML. A Perl script converts the XML records to MARC.
– Records are sent to OCLC production software for correction, validation, and batch loading into the WorldCat database.
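The first step of the test case, a structural conversion from simple Dublin Core to MARC XML, can be sketched in a few lines. The project used an XSLT script; the Python below is a stand-in written only for illustration. The mapping table, the function name dc_to_marcxml, and the sample record are assumptions, not the project's actual crosswalk or data.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Illustrative mapping only (an assumption, not the project's crosswalk):
# Dublin Core element -> (MARC tag, indicator 1, indicator 2, subfield code)
DC_TO_MARC = {
    "title": ("245", "0", "0", "a"),
    "creator": ("720", " ", " ", "a"),
    "subject": ("653", " ", " ", "a"),
    "description": ("520", " ", " ", "a"),
}

# Made-up sample record for demonstration.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Main Street, looking north</dc:title>
  <dc:creator>Unknown photographer</dc:creator>
  <dc:description>Black-and-white photograph.</dc:description>
</record>"""


def dc_to_marcxml(dc_xml: str) -> ET.Element:
    """Convert one simple Dublin Core record to a MARC XML record element."""
    source = ET.fromstring(dc_xml)
    record = ET.Element(f"{{{MARC_NS}}}record")
    prefix = f"{{{DC_NS}}}"
    for child in source:
        # Skip anything that is not a Dublin Core element.
        if not child.tag.startswith(prefix):
            continue
        local_name = child.tag[len(prefix):]
        value = (child.text or "").strip()
        # Skip unmapped or empty elements.
        if local_name not in DC_TO_MARC or not value:
            continue
        tag, ind1, ind2, code = DC_TO_MARC[local_name]
        field = ET.SubElement(
            record, f"{{{MARC_NS}}}datafield", tag=tag, ind1=ind1, ind2=ind2
        )
        subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", code=code)
        subfield.text = value
    return record


if __name__ == "__main__":
    print(ET.tostring(dc_to_marcxml(SAMPLE), encoding="unicode"))
```

Note that the mapping knowledge lives in a table rather than in transformation code, which is the direction the rest of the talk argues for.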

Before and after

Problems with our XML/XSLT solution
Lots of conditions for use
– XML records
– Supporting XML documentation: schemas, namespaces, DTDs, URIs
– XSLT scripts or XSLT programming expertise
– Simple structural transforms
Not appropriate
– for semantic mappings. Element semantics is lost.
– when standards and encodings are in flux. Supporting documentation is unmanageable.
– for our model of collective intelligence. Knowledge in a set of XSLT scripts can’t be mined.

The XSLT solution: reprise
– Which crosswalks are equivalent?
– If they’re not equivalent, how do they differ?
– Which crosswalks have XML schemas that match my data?
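These questions are hard to answer when the mapping knowledge is buried in XSLT scripts. If each crosswalk were instead stored as plain data, they would reduce to simple comparisons. The sketch below assumes a crosswalk is a dictionary from source element to target element; the function names diff_crosswalks and covers and the sample values in the usage note are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# Assumed representation: source element name -> target element name.
Crosswalk = Dict[str, str]


def diff_crosswalks(
    a: Crosswalk, b: Crosswalk
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return every source element that the two crosswalks map differently.

    The result maps the source element to (target in a, target in b);
    None means that crosswalk has no mapping for the element.
    An empty result means the crosswalks are equivalent.
    """
    diffs = {}
    for element in sorted(set(a) | set(b)):
        if a.get(element) != b.get(element):
            diffs[element] = (a.get(element), b.get(element))
    return diffs


def covers(crosswalk: Crosswalk, record_elements: Iterable[str]) -> bool:
    """True if the crosswalk has a mapping for every element in the data."""
    return set(record_elements) <= set(crosswalk)
```

For example, with a = {"title": "245a", "creator": "720a"} and b = {"title": "245a", "creator": "100a", "date": "260c"}, diff_crosswalks(a, b) reports the creator and date entries, and covers(a, ["date"]) is False.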

The long translation path: why?
Metadata translation needs a layer of abstraction.
– Metadata standards have many versions or encodings, but element definitions and mappings stay the same.
– The lack of abstraction leads to a combinatorial explosion of pairwise mappings: (Metadata schema X (versions * encodings)) * (Metadata schema Y (versions * encodings))
– The meanings behind the semantic transforms are lost unless they are recorded and associated with element definitions.
A full commitment to XML may be premature.
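The size of the explosion is easy to work out. The first function below implements the product given on the slide. The second is a counting model assumed here for comparison, not one stated in the talk: one structural transform per variant of each schema, plus two semantic maps (X to core, core to Y). The function names and the figures in the usage note are illustrative.

```python
def direct_mappings(x_versions: int, x_encodings: int,
                    y_versions: int, y_encodings: int) -> int:
    """Pairwise mappings needed when every variant of X maps directly
    to every variant of Y (the product given on the slide)."""
    return (x_versions * x_encodings) * (y_versions * y_encodings)


def abstracted_mappings(x_versions: int, x_encodings: int,
                        y_versions: int, y_encodings: int) -> int:
    """Artifacts needed under an assumed layered model: one structural
    transform per variant of X and of Y, plus two semantic maps
    (X -> core and core -> Y)."""
    structural = x_versions * x_encodings + y_versions * y_encodings
    semantic = 2
    return structural + semantic
```

With three versions and two encodings on each side, direct_mappings(3, 2, 3, 2) gives 36, while abstracted_mappings(3, 2, 3, 2) gives 14, and the gap widens as variants are added.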

The long translation path
[Diagram: the translation path, in order]
1. File of records in format X
2. Transform to intermediate format (STRUCTURAL TRANSFORM)
3. Transform intermediate format to interoperable core (SEMANTIC TRANSLATION)
4. Transform interoperable core to intermediate format (SEMANTIC TRANSLATION)
   Transform to output format Y (STRUCTURAL TRANSFORM)
5. File of records in format Y
Interoperable Core: semantic maps from Excel tables

What the long path accomplishes
The model encodes two sources of abstraction.
– Syntactic normalization
– Semantic mappings
The user interacts with familiar objects.
– A set of documents
– Human-readable mappings
But:
– Metadata schema translation is indirect.
– There is more processing overhead for “best-case” XML documents.
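The separation of syntactic normalization from semantic mapping can be sketched as a small pipeline. Everything here is assumed for illustration: the "KEY: value" input format, the element names, the two mapping tables (standing in for maps exported from spreadsheets), and the function names. It is not the project's software.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Intermediate format: an ordered list of (element name, value) pairs.
Fields = List[Tuple[str, str]]

# Hypothetical semantic maps, as they might be exported from a spreadsheet.
X_TO_CORE: Dict[str, str] = {"TI": "core.title", "AU": "core.creator"}
CORE_TO_Y: Dict[str, str] = {"core.title": "title", "core.creator": "creator"}


def read_x(text: str) -> Fields:
    """Structural transform: parse non-XML 'KEY: value' lines."""
    fields = []
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields.append((key.strip(), value.strip()))
    return fields


def translate(fields: Fields, mapping: Dict[str, str]) -> Fields:
    """Semantic translation: rename mapped elements, drop unmapped ones."""
    return [(mapping[name], value) for name, value in fields if name in mapping]


def write_y(fields: Fields) -> str:
    """Structural transform: serialize the fields as a flat XML record."""
    root = ET.Element("record")
    for name, value in fields:
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def long_path(text: str) -> str:
    """Format X -> intermediate -> core -> intermediate -> format Y."""
    core = translate(read_x(text), X_TO_CORE)
    return write_y(translate(core, CORE_TO_Y))
```

Supporting a new encoding of format X would mean replacing only read_x; the mapping tables stay untouched, which is the point of the extra layer.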

Open issue 1: Implementing the long path
Custom software
– Handles “XML-ish” and non-XML data
– Normalizes variation in records
– Does special handling of data required for complete and robust translations
– Translates user-supplied crosswalks
Advanced XML
– Uses XPointer and XLink to:
  – implement the interoperable core
  – document the semantics of translations
– Works with established standards

Open issue 2: The interoperable core
– A union…or an intersection of elements?
– An established standard…or a custom design?
– One…or many interoperable cores?

Project status
– Development is interspersed with testing on third-party data.
– The custom software for the long translation path is due to be completed in Autumn.
– The advanced XML solution is being studied.

For further information The Metadata Switch Project at OCLC