The Basics of OAI An Introduction to the Protocol for Metadata Harvesting Sarah Shreeves University of Illinois at Urbana-Champaign Basics and Beyond July.

Presentation transcript:

The Basics of OAI An Introduction to the Protocol for Metadata Harvesting Sarah Shreeves University of Illinois at Urbana-Champaign Basics and Beyond July 27, 2004

Slide 2: Outline
- What the OAI protocol is and what it is not
- Place in digital library infrastructure
- How it works (basically)
- Challenges for data / service providers

Slide 3: OAI-PMH is a tool
- Moves metadata (not content) from a data provider to a service provider (or harvester)
- A set of rules that defines the communication between two systems (like FTP and HTTP)
- Build once, use for many applications: a building block for digital library services
- Facilitates the federation of metadata

Slide 4: OAI-PMH is not...
- Metadata
- A search tool
- A database
- Open Access

Slide 5: Who uses OAI?
- Approximately 400 data providers
- Basic building block of the National Science Digital Library (NSDL) and OAIster
- Incorporated into DSpace and Eprints.org
- Part of CONTENTdm, Michigan's DLXS, and other products
- International use

Slide 6: Basic OAI-PMH Concepts
- "Aggregated search" rather than "federated search"
- Data providers support OAI-PMH as a means to expose metadata
- Service providers harvest metadata from data providers via OAI-PMH
- OAI-PMH is based on HTTP and XML
- OAI-PMH requires use of simple Dublin Core
  - BUT supports and encourages use of other metadata schemas
- Unique and persistent identifiers and a datestamp for each OAI record

Slide 7: [Diagram: four OAI data providers (one in front of a "Dig. Mana Sys.", one in front of a database, one in front of XML files) receive OAI requests from an OAI harvester and return OAI responses; the harvester builds the aggregated metadata, on which services are built.]

Slide 8: Examples of OAI Service Providers
- OAIster
- Engineering, Computer Science, and Physics
- Open Language Archives Community

Slide 9: How OAI Works (Technically)
- 6 distinct 'verbs' or requests
- OAI requests are sent via HTTP
- Responses are sent in valid XML
[Diagram: on the service provider side, the OAI harvester sends an HTTP request (OAI verb) to the OAI data provider, which sits in front of a "Dig. Mngt. Sys." on the data provider side, and receives an HTTP response (valid XML); the harvester builds the aggregated metadata.]
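As a rough illustration of "requests are sent via HTTP", an OAI request is just a base URL plus a `verb` query parameter and any further arguments. The sketch below only builds such a URL and sends nothing; the base URL `http://example.org/oai` and the helper name `build_request_url` are assumptions made for the example, not part of the presentation.

```python
from urllib.parse import urlencode

# The six request types ("verbs") named in the presentation.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_request_url(base_url: str, verb: str, **params: str) -> str:
    """Return the HTTP GET URL for one OAI-PMH request (nothing is sent)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb goes first, followed by the remaining arguments.
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Hypothetical base URL, used only for illustration.
print(build_request_url("http://example.org/oai", "ListRecords",
                        metadataPrefix="oai_dc"))
```

A harvester would pass the resulting URL to any HTTP client and parse the XML that comes back.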

Slide 10: An OAI Record
oai:docsouth.unc.edu:
T13:15:52Z
4
Advice to Soldiers
William Royal
United States -- History -- Civil War, Religious aspects.
Confederate States of America -- Religion.
Soldiers -- Religious life -- Confederate States of America.
Soldiers -- Confederate States of America -- Conduct of life.
Confederate States of America -- Church history.
Sin.
[Raleigh, N. C.: s. n., between 1861 and 1865]
T13:15:52Z
Text
text/html
en-us
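To make the shape of an OAI record concrete, here is a minimal sketch that reads the header (identifier and datestamp) and the simple Dublin Core fields out of a record. The sample XML is hand-written for illustration: the identifier and datestamp are invented, and the assignment of the slide's values to `dc:title`, `dc:type`, `dc:format`, and `dc:language` is an assumption, since the slide's markup did not survive.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 records carrying simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative record; identifier and datestamp are made up.
SAMPLE_RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2004-01-15T10:00:00Z</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Advice to Soldiers</dc:title>
      <dc:type>Text</dc:type>
      <dc:format>text/html</dc:format>
      <dc:language>en-us</dc:language>
    </oai_dc:dc>
  </metadata>
</record>
"""


def summarize_record(xml_text):
    """Return the record's identifier, datestamp, and Dublin Core fields."""
    record = ET.fromstring(xml_text)
    header = record.find("oai:header", NS)
    dc = record.find("oai:metadata/oai_dc:dc", NS)
    fields = {}
    for element in dc:
        # Strip the "{namespace}" part of the tag to get the element name.
        name = element.tag.split("}", 1)[1]
        # Dublin Core elements are repeatable, so collect values in lists.
        fields.setdefault(name, []).append(element.text)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        "fields": fields,
    }


print(summarize_record(SAMPLE_RECORD))
```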

Slide 11: OAI "VERBS"
- Identify
- ListMetadataFormats
- ListSets
- ListIdentifiers
- ListRecords
- GetRecord

Slide 12: Identify
- Purpose
  - Return general information about the archive and its policies (e.g., datestamp granularity)
- Parameters
  - None
- Sample URL
  - rb=Identify
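A harvester typically reads the Identify response once to learn a repository's policies, such as its datestamp granularity. The sketch below parses a hand-written response; the repository name, URLs, e-mail address, and dates are all invented, and `read_identify` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# ElementTree writes namespaced tags as "{namespace}name".
OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Illustrative Identify response; every value here is made up.
SAMPLE_IDENTIFY = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-07-27T12:00:00Z</responseDate>
  <request verb="Identify">http://example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
    <earliestDatestamp>2001-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>
"""


def read_identify(xml_text):
    """Return the children of <Identify> as a flat name-to-text dict.

    Repeated elements (for example several adminEmail entries) keep
    only their last value in this simplified sketch.
    """
    root = ET.fromstring(xml_text)
    identify = root.find(f"{OAI}Identify")
    return {child.tag[len(OAI):]: child.text for child in identify}


info = read_identify(SAMPLE_IDENTIFY)
print(info["repositoryName"], info["granularity"])
```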

Slide 13: ListSets
- Purpose
  - Provide a listing of sets in which records may be organized (may be hierarchical, overlapping, or flat)
- Parameters
  - None
- Sample URL
  - rb=ListSets
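In OAI-PMH 2.0 a hierarchical set is expressed through its `setSpec`, a colon-separated path from the root of the set hierarchy. As a small sketch of what "hierarchical" means in practice, the function below nests a flat list of setSpec values into a tree. The set names are invented for the example.

```python
def build_set_tree(set_specs):
    """Nest colon-separated setSpec values into a dict of dicts."""
    tree = {}
    for spec in set_specs:
        node = tree
        # Walk down the path, creating each level the first time it is seen.
        for part in spec.split(":"):
            node = node.setdefault(part, {})
    return tree


# Hypothetical setSpec values, as a ListSets response might report them.
specs = ["maps", "photos", "photos:aerial", "photos:aerial:1940s"]
print(build_set_tree(specs))
```

A flat repository simply yields a tree with one level, and overlapping sets need no special handling because a record may belong to several sets at once.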

Slide 14: ListMetadataFormats
- Purpose
  - List metadata formats supported by the archive as well as their schema locations and namespaces
- Parameters
  - identifier: for a specific record (O)
- Sample URL
  - rb=ListMetadataFormats
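The sketch below shows one way a harvester might read a ListMetadataFormats response to find out which formats it can request. The `oai_dc` entry uses the schema location and namespace defined by OAI-PMH 2.0; the second format, `example_rich`, and its URLs are invented purely to show a repository offering more than simple Dublin Core.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative response; the "example_rich" format is hypothetical.
SAMPLE_FORMATS = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>example_rich</metadataPrefix>
      <schema>http://example.org/schemas/rich.xsd</schema>
      <metadataNamespace>http://example.org/schemas/rich/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
"""


def list_formats(xml_text):
    """Map each metadataPrefix to its (schema location, namespace) pair."""
    root = ET.fromstring(xml_text)
    formats = {}
    for fmt in root.iterfind("oai:ListMetadataFormats/oai:metadataFormat", NS):
        prefix = fmt.findtext("oai:metadataPrefix", namespaces=NS)
        formats[prefix] = (
            fmt.findtext("oai:schema", namespaces=NS),
            fmt.findtext("oai:metadataNamespace", namespaces=NS),
        )
    return formats


print(sorted(list_formats(SAMPLE_FORMATS)))
```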

Slide 15: ListIdentifiers
- Purpose
  - List headers for all items corresponding to the specified parameters
- Parameters
  - from: start date (O) and/or until: end date (O)
  - set: set to harvest from (O)
  - metadataPrefix: metadata format to list identifiers for (R)
  - resumptionToken: flow control mechanism (X)
- Sample URL
  - entifiers&metadataPrefix=oai_dc
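The letters after each parameter follow the OAI-PMH convention: R for required, O for optional, and X for exclusive (an exclusive argument must be the only argument sent besides the verb). A minimal sketch of checking a ListIdentifiers request against those rules follows; the function name and the wording of the messages are assumptions of the example.

```python
# Argument rules for ListIdentifiers as listed on the slide:
# R = required, O = optional, X = exclusive.
LIST_IDENTIFIERS_ARGS = {
    "metadataPrefix": "R",
    "from": "O",
    "until": "O",
    "set": "O",
    "resumptionToken": "X",
}


def check_arguments(args, rules=LIST_IDENTIFIERS_ARGS):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = [f"unknown argument: {name}" for name in args if name not in rules]
    exclusive = [name for name in args if rules.get(name) == "X"]
    if exclusive:
        # An exclusive argument replaces all others, including required ones.
        if len(args) > 1:
            problems.append(f"{exclusive[0]} cannot be combined with other arguments")
        return problems
    for name, kind in rules.items():
        if kind == "R" and name not in args:
            problems.append(f"missing required argument: {name}")
    return problems


print(check_arguments({"metadataPrefix": "oai_dc", "set": "photos"}))
print(check_arguments({"from": "2004-01-01"}))
```

The same table-driven check could be reused for ListRecords, whose arguments follow the same pattern.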

Slide 16: GetRecord
- Purpose
  - Returns the metadata for a single item in the form of an OAI record
- Parameters
  - identifier: unique id for item (R)
  - metadataPrefix: metadata format for the record (R)
- Sample URL
  - ecord&identifier=oai:aerialphotos.grainger.uiuc.edu:AP- 1A &metadataPrefix=oai_dc
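A GetRecord request can fail, for example when the identifier is unknown. OAI-PMH 2.0 reports this inside the XML response as an `error` element with a code such as `idDoesNotExist`. The sketch below checks for that element before returning the record; the sample response, the `OAIError` class, and `extract_record` are illustrative names, not something shown in the presentation.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


class OAIError(Exception):
    """Raised when a response carries an OAI-PMH <error> element."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


# Illustrative error response for an unknown identifier; values are made up.
SAMPLE_ERROR = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-07-27T12:00:00Z</responseDate>
  <request verb="GetRecord">http://example.org/oai</request>
  <error code="idDoesNotExist">No matching identifier</error>
</OAI-PMH>
"""


def extract_record(xml_text):
    """Return the <record> element of a GetRecord response, or raise OAIError."""
    root = ET.fromstring(xml_text)
    error = root.find("oai:error", NS)
    if error is not None:
        raise OAIError(error.get("code"), error.text or "")
    return root.find("oai:GetRecord/oai:record", NS)


try:
    extract_record(SAMPLE_ERROR)
except OAIError as exc:
    print("request failed:", exc)
```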

Slide 17: ListRecords
- Purpose
  - Retrieves metadata records for multiple items
- Parameters
  - from: start date (O)
  - until: end date (O)
  - set: set to harvest from (O)
  - resumptionToken: flow control mechanism (X)
  - metadataPrefix: metadata format (R)
- Sample URL
  - ecords&metadataPrefix=oai_dc
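The `from` and `until` arguments are what make incremental harvesting possible: a harvester asks only for records changed since its last visit. The sketch below formats a timestamp at either of the two granularities OAI-PMH 2.0 defines and assembles the arguments for such a request. The helper names are hypothetical, and the choice of `oai_dc` as the format is only an example.

```python
from datetime import datetime, timezone


def format_datestamp(moment, granularity="YYYY-MM-DD"):
    """Format a timezone-aware datetime as a UTC datestamp."""
    utc = moment.astimezone(timezone.utc)
    if granularity == "YYYY-MM-DDThh:mm:ssZ":
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Day granularity is the level every repository must support.
    return utc.strftime("%Y-%m-%d")


def incremental_arguments(last_harvest, granularity="YYYY-MM-DD", set_spec=None):
    """Build ListRecords arguments for records changed since last_harvest."""
    args = {
        "metadataPrefix": "oai_dc",
        "from": format_datestamp(last_harvest, granularity),
    }
    if set_spec is not None:
        args["set"] = set_spec
    return args


last = datetime(2004, 7, 27, 13, 15, 52, tzinfo=timezone.utc)
print(incremental_arguments(last))
print(incremental_arguments(last, "YYYY-MM-DDThh:mm:ssZ", set_spec="photos"))
```

The granularity to use would come from the repository's Identify response.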

Slide 18: Other Pieces of OAI
- Flow control
- Sets
- Multiple metadata schemas

Slide 19: Challenges for the OAI Community
- Relatively recent protocol but no best practices (yet)
- 'Shareability of metadata'
  - Heterogeneity of items described
  - Loss of context / information loss
  - Knowledge structures differ, so...
    - Native metadata schemas differ
    - Controlled vocabularies differ
    - Use and presentation of items differ

Slide 20: Metadata for different communities

Slide 21: Metadata for different communities

Slide 22: Loss of Context: Record in OAI aggregation

Slide 23: Context: Record in native database

Slide 24: Loss of context / data

Slide 25: Loss of context / data

Slide 26: Sense / Completeness of Metadata
identifier: http://images.umdl.umich.edu/cgi/i/image/image-idx?view=entry;subview=detail;cc=fish3ic;entryid=X-0802;viewid=1004_112
publisher: UMMZ Fish Division
format: jpeg
type: image
subject:
subject: 1926;0812;18;Trib. to Sixteen Cr. Trib. Pine River, Manistee R.;R10W;S26; S27;JAM26-460;05;T21N;1926/05/18
language: UND
description: Flora and Fauna of the Great Lakes Region;

Slide 28: Granularity of Description: Excerpt of Metadata Record Describing "Cotton coverlet with embroidered butterfly design"
Digital Image of "Cotton Coverlet with Emboridered Butterfly Design"
Description: Digital image of a single-sized cotton coverlet for a bed with embroidered butterfly design. Handmade by Anna F. Ginsberg Hayutin.
Source: Materials: cotton and embroidery floss. Dimensions: 71 in. x 86 in. Markings: top right hand corner has 1 1/2 in. x 1/2 in. label cut outs at upper left and right hand side for head board; fabric is woven in a variation of a rib weave; color each of yellow and gray; hand-embroidered cotton butterflies and flowers from two shades of each color of embroidery floss - blue, pink, green and purple and single top 20 in. bordered with blue and black cotton embroidery thread; stitches used for embroidery: running stitch, chain stitch, French knot and back stitches; selvage edges left unfinished; lower edges turned under and finished with large gray running stitches made with embroidery floss.
Format: Epson Expression 836 XL Scanner with Adobe Photoshop version 5.5; 300 dpi; 21-53K bytes. Available via the World Wide Web.
Coverage: --
Date Created: :45:18; Updated: ; Created: ; Created: ?
Type: Image

Slide 29: Granularity of Description: Excerpt of Metadata Record Describing “American Woven Coverlet”
Digital Image of "American Woven Coverlet"
Description: Materials: Textile--Multi, Pigment--Dye; Manufacturing Process: Weaving--Hand, Spinning, Dyeing, Hand-loomed blue wool and white linen coverlet, worked in overshot weave in plain geometric variant of a checkerboard pattern. Coverlet is constructed from finely spun, indigo-dyed wool and undyed linen, woven with considerable skill. Although the pattern is simpler, the overall craftsmanship is higher than A. - D. Schrishuhn, 11/19/99 This coverlet is an example of early "overshot" weaving construction, probably dating to the 1820's and is not attributable to any particular weaver. -- Georgette Meredith, 10/9/1973
Source: --
Format: 228 x 169 x 1.2 cm (1,629 g)
Coverage: Euro-American; America, North; United States; Indiana? Illinois?
Date: Early 19th c. CE
Type: cultural; physical object; original

Slide 30: Range of vocabularies in use
Element           Top three controlled vocabularies used (% of respondents who identified a C.V.)
Subject           LCSH (73%); LC TGM I (27%); AAT (17%)
Format            LC TGM II (17%); AAT (10%); MIME types (8%); AACR2 (8%)
Type              LC TGM II (21%); DCMI Type (13%); AACR2 (10%)
Personal names    LC Name Authority File (67%)
Geographic names  LCSH (27%); LC Name Authority File (25%); Getty Thesaurus of Geographic Names (15%)

Slide 31: Data providers can:
- Create metadata for interoperability
  - Reusable metadata: think beyond your local users and environment
  - Use well structured and defined schemas; move beyond simple DC
  - Use and identify controlled vocabularies

Slide 32: Service providers can...
- Analyze metadata, then cluster and normalize some aspects
- Communicate with data providers about their metadata
- Build custom interfaces and selective views for target audiences / domains
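As a rough sketch of what "cluster and normalize" could look like for a single element, the code below maps free-text `format` values onto MIME types where it recognizes them and then groups records by the normalized value. The lookup table, the sample records, and both function names are invented for illustration; a real aggregator would need far larger tables and human review of unrecognized values.

```python
# Hypothetical lookup table from free-text format values to MIME types.
FORMAT_TO_MIME = {
    "jpeg": "image/jpeg",
    "jpg": "image/jpeg",
    "html": "text/html",
    "text/html": "text/html",
}


def normalize_format(value):
    """Return (normalized_value, was_recognized) for one format value."""
    key = value.strip().lower()
    if key in FORMAT_TO_MIME:
        return FORMAT_TO_MIME[key], True
    # Unrecognized values are kept as harvested so nothing is lost.
    return value.strip(), False


def cluster_by_format(records):
    """Group record identifiers by their normalized format value."""
    clusters = {}
    for record in records:
        normalized, _ = normalize_format(record.get("format", ""))
        clusters.setdefault(normalized or "(no format)", []).append(record["identifier"])
    return clusters


# Made-up harvested records from three different data providers.
harvested = [
    {"identifier": "oai:example.org:1", "format": "jpeg"},
    {"identifier": "oai:example.org:2", "format": " JPG "},
    {"identifier": "oai:example.org:3", "format": "text/html"},
    {"identifier": "oai:example.org:4"},
]
print(cluster_by_format(harvested))
```

The values that fail to normalize are often the most useful output, since they are what a service provider would raise with the data provider.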

Slide 33: Resources
- OAI for beginners tutorial
- OAI Frequently Asked Questions
- IMLS Digital Collections and Content Project

Slide 34: Recap
- OAI protocol is a tool
- OAI is easy; metadata is hard
- Better metadata = better interoperability

Slide 35: Contact Information
Sarah Shreeves
Project Coordinator, IMLS Digital Collections and Content
University of Illinois Library at Urbana-Champaign