
1 Using OAI-PMH to Aggregate Metadata Describing Cultural Heritage Resources
Timothy W. Cole (t-cole3@uiuc.edu)
University of Illinois at Urbana-Champaign
http://dli.grainger.uiuc.edu/Publications/TWCole/ALA2003OAI/
ALA/CLA Annual Meeting, 22 June 2003, Toronto, CA

2 Order of Presentation
Perspectives on OAI-PMH
Illinois OAI metadata harvesting project
  Goals & objectives
  Findings regarding metadata
  Findings regarding search & discovery
New OAI projects at Illinois
  IMLS digital collections & content
  CIC OAI metadata harvesting project

3 OAI Protocol for Metadata Harvesting
Harvesting approach to interoperability at the metadata level
Divides the world into Metadata Providers & Service Providers
Builds on HTTP, XML, & Dublin Core
http://www.openarchives.org/

4 OAI Antecedents
Call to other E-Print archives (July 1999)
  Paul Ginsparg, Rick Luce, & Herbert Van de Sompel: “…mobilize core group to work towards achieving a universal service for author self-archived scholarly literature.”
Santa Fe Mtgs. (Oct. 1999 & June 2000)
OAI-PMH version history:
  First Alpha Release, Sept. 2000
  1.0 (Beta) Release, January 2001
  1.1 (Beta 2) Release, July 2001
  2.0 (Production) Release, June 2002

5 Original OAI Organization
OAI Executive: Carl Lagoze & Herbert Van de Sompel
OAI Steering Committee
  Co-Chairs: Dan Greenstein, Cliff Lynch
OAI Technical Committee
Funded by NSF, DLF & CNI
Seeks to be user community driven

6 OAI-PMH as a tool
All about moving metadata around
Designed to be a building block, usable by many different communities
Can facilitate (in some cases enable) services & functions
Assumes widely distributed content, but centralized indexing(!) & services
Build once, use for many applications
Focus of OAI is interoperability

7 Harvesting vs. Broadcast
Competing approaches to interoperability
Distributed/Broadcast searching: search and discovery run against remote services and data
Harvesting: data/metadata is transferred from the remote source to the destination where search & discovery services are located (e.g., union catalogs)
OAI-PMH is a harvesting protocol
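The harvesting model on the slide above can be sketched in a few lines of Python. This is an illustration only, not the project's software: the `PROVIDERS` data, the `harvest` and `search` helpers, and the record identifiers are all hypothetical, and real harvesting would pull the records over HTTP rather than from an in-memory dictionary. The point it shows is that metadata is copied to one place first, and searching then runs only against the local copy.

```python
from collections import defaultdict

# Hypothetical data providers, each exposing (identifier, title) pairs.
PROVIDERS = {
    "museum": [("m1", "American woven coverlet")],
    "library": [("l1", "Cotton coverlet"), ("l2", "Sheet music collection")],
}


def harvest(providers):
    """Copy every provider's metadata into one local store and word index."""
    store = {}
    index = defaultdict(set)
    for provider, records in providers.items():
        for identifier, title in records:
            key = (provider, identifier)
            store[key] = title
            for word in title.lower().split():
                index[word].add(key)
    return store, index


def search(index, word):
    """Search the central index only; no provider is contacted."""
    return sorted(index.get(word.lower(), set()))


store, index = harvest(PROVIDERS)
hits = search(index, "coverlet")
print([store[key] for key in hits])
```

Because the search never touches the providers, results are only as fresh as the last harvest.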

8 As Compared to Z39.50
Content (Objects): Distributed
World View: Bibliographic
Object Presentation: Data provider
Searching is: Distributed (Z39.50); Centralized (OAI)
Search done by: Data provider (Z39.50); Service provider (OAI)
Metadata searched is: Up to date (Z39.50); Stale (OAI)
Semantic Mapping: When searching (Z39.50); Metadata delivery (OAI)

9 Metadata vs. Resources
Resource refers to information objects or digital representations of information objects
Metadata item is a collection of properties about a resource (e.g., title, author, etc.)
Metadata record is a metadata item expressed in a specific syntax according to an XSD
OAI focuses on metadata, with the implicit understanding that metadata contains useful links to the source information object(s)

10 Data and Service Providers
Data Providers (Repositories) are entities that possess resources & metadata and are willing to share metadata with others via well-defined OAI protocols
Service Providers (Harvesters) are entities that harvest metadata from Data Providers in order to supply higher-level services to users (e.g., search & discovery)
OAI uses these designations for its client/server model (data = server, service = client)

11 Reliance on HTTP & XML
OAI-PMH is a REpresentational State Transfer (REST) protocol (unlike RPC, SOAP)
OAI requests and responses are sent via the HTTP protocol
OAI requests are encoded as HTTP GET or POST operations
OAI responses are valid XML documents
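As a minimal sketch of how a request is encoded as an HTTP GET, the Python below appends a verb and its arguments to a repository base URL as a query string. The base URL `http://repository.example.org/oai` and the helper name `build_request_url` are hypothetical; `oai_dc` is the metadata prefix the protocol reserves for unqualified Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def build_request_url(base_url, verb, **arguments):
    """Encode an OAI-PMH request as an HTTP GET URL."""
    params = {"verb": verb}
    for name, value in arguments.items():
        # "from" is a Python keyword, so callers pass it as from_.
        params[name.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


url = build_request_url(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", from_="2003-01-01"
)
print(url)
```

The same key/value pairs could instead be sent as the body of an HTTP POST; the repository answers either form with an XML document.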

12 XML Namespaces and Schema
Consistency and data “quality” are ensured by using XML Schema Definitions (XSD) for all responses
XML Namespaces are used where necessary to clearly define which parts of the responses are actual metadata and which support the Metadata Harvesting Protocol
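The role of namespaces can be illustrated with a short Python sketch that separates the protocol envelope from the Dublin Core payload in a response. The sample record, its identifier, and the repository URL are made up for illustration; the three namespace URIs are the ones used by OAI-PMH 2.0, the `oai_dc` container, and Dublin Core elements.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative response; the identifier and values are invented.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-06-22T12:00:00Z</responseDate>
  <request verb="GetRecord">http://repository.example.org/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:example.org:coverlet-1</identifier>
        <datestamp>2001-11-07</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Cotton coverlet</dc:title>
          <dc:type>Image</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""

root = ET.fromstring(SAMPLE_RESPONSE)
record = root.find("oai:GetRecord/oai:record", NS)

# Protocol-level information lives in the OAI-PMH namespace.
identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
datestamp = record.findtext("oai:header/oai:datestamp", namespaces=NS)

# The descriptive metadata lives in the Dublin Core namespace.
dc_container = record.find("oai:metadata/oai_dc:dc", NS)
metadata = [(child.tag.split("}", 1)[1], child.text) for child in dc_container]

print(identifier, datestamp, metadata)
```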

13 OAI-PMH Use of Dublin Core
DC is OAI’s lowest common denominator
OAI supports & encourages use of other, community-driven metadata schemas
Typically, metadata provider stores metadata in ‘best’ schema as dictated by material & resources
Crosswalk (semantic mapping) to simpler schemas
Semantic mapping at metadata delivery (rather than at time of search)
As with Z39.50, can’t search for what’s not there
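A crosswalk of the kind described above can be sketched as a simple field mapping applied when metadata is delivered. Everything specific here is an assumption made for illustration: the local field names (`object_name`, `maker`, and so on), the choice of Dublin Core targets, and the helper `to_simple_dc` are hypothetical, and a real crosswalk would follow the provider's own schema documentation.

```python
# Hypothetical mapping from a richer local schema to simple Dublin Core.
CROSSWALK = {
    "object_name": "title",
    "maker": "creator",
    "date_made": "date",
    "materials": "format",
    "dimensions": "format",
}


def to_simple_dc(local_record):
    """Map a local record to repeatable Dublin Core elements."""
    dc_record = {}
    for field, value in local_record.items():
        element = CROSSWALK.get(field)
        if element is None:
            continue  # no Dublin Core target in this mapping: the field is dropped
        dc_record.setdefault(element, []).append(value)
    return dc_record


local = {
    "object_name": "Cotton coverlet",
    "maker": "Anna F. Ginsberg Hayutin",
    "materials": "cotton and embroidery floss",
    "dimensions": "71 in. x 86 in.",
    "accession_note": "internal use only",  # hypothetical unmapped field
}
dc = to_simple_dc(local)
print(dc)
```

Note what the mapping loses: materials and dimensions collapse into one element, and the unmapped field disappears, so a harvester can no longer search on those distinctions.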

14 When to use OAI-PMH
Metadata is sufficient for services desired
Normalization, deduplication, metadata augmentation desired
Content is widely distributed across small, non-Z39.50 enabled repositories
OAI-PMH is more lightweight than Z39.50
Portals can use BOTH Z39.50 & OAI-PMH
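Deduplication is one of the services that becomes practical once records sit in a single place. The sketch below is one simple way to do it, assuming that two records with the same normalized title and date describe the same item; that matching rule, the helper names, and the sample records are all hypothetical, and a production service would likely need a more careful rule.

```python
import re


def match_key(record):
    """Build a comparison key from a normalized title and the date."""
    title = re.sub(r"[^a-z0-9 ]", "", record["title"].lower())
    return " ".join(title.split()), record.get("date", "")


def deduplicate(records):
    """Keep the first record seen for each comparison key."""
    seen = {}
    for record in records:
        seen.setdefault(match_key(record), record)
    return list(seen.values())


# Hypothetical harvested records from two providers.
records = [
    {"title": "American Woven Coverlet", "date": "1820", "provider": "a"},
    {"title": "American woven coverlet.", "date": "1820", "provider": "b"},
    {"title": "Cotton coverlet", "date": "1912", "provider": "a"},
]
unique = deduplicate(records)
print(len(unique))
```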

15 What OAI-PMH Is Not
Not search & discovery on its own
Not a database management system
Not a single metadata schema
Not OAIS

16 How OAI Works
OAI “verbs”: Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord
[Diagram: the harvester (service provider) sends an HTTP request carrying an OAI verb to the repository (metadata provider), which returns an HTTP response containing valid XML.]
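The six verbs each accept a fixed set of arguments, which a repository checks before answering. The Python below is a rough sketch of that check under OAI-PMH 2.0 rules; the table omits the `resumptionToken` argument for brevity, and the function name `check_request` and the message wording are hypothetical (only the `badVerb` and `badArgument` codes come from the protocol).

```python
# Arguments per verb in OAI-PMH 2.0 (resumptionToken omitted for brevity).
VERB_ARGUMENTS = {
    "Identify": {"required": set(), "optional": set()},
    "ListMetadataFormats": {"required": set(), "optional": {"identifier"}},
    "ListSets": {"required": set(), "optional": set()},
    "ListIdentifiers": {
        "required": {"metadataPrefix"},
        "optional": {"from", "until", "set"},
    },
    "ListRecords": {
        "required": {"metadataPrefix"},
        "optional": {"from", "until", "set"},
    },
    "GetRecord": {"required": {"identifier", "metadataPrefix"}, "optional": set()},
}


def check_request(verb, arguments):
    """Return a list of problems with a request; an empty list means it is acceptable."""
    spec = VERB_ARGUMENTS.get(verb)
    if spec is None:
        return ["badVerb: " + verb]
    names = set(arguments)
    allowed = spec["required"] | spec["optional"]
    problems = ["badArgument: missing " + n for n in sorted(spec["required"] - names)]
    problems += ["badArgument: unexpected " + n for n in sorted(names - allowed)]
    return problems


print(check_request("GetRecord", {"identifier": "oai:example.org:coverlet-1"}))
```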

17 OAI Provider Architectures
[Diagram components: descriptive metadata (DBMS, XML, HTML); OAI administrative metadata (e.g., IDs, datestamps, sets, formats); OAI application (CGI, ASP, PHP, etc.); web server (HTTP); OAI harvesters.]

18 A few projects using OAI-PMH
Basic building block of the National Science Digital Library
Large-scale implementations in E-Prints, OLAC, NDLTD, …
Built into ENCompass, ContentDM, Michigan’s DLXS, DSpace, and other products
Open Archives Forum in Europe; will be part of federation activities in the UK and EU

19 Univ. of Illinois OAI Metadata Harvesting Project
Funded by Andrew W. Mellon Foundation (July 2001 – May 2003)
Primary objectives:
  Develop & make available OAI harvesting tools
  Build search services for aggregated metadata in the domain of cultural heritage
  Examine metadata aggregation issues, including use of EAD in OAI context
  Investigate utility of aggregated metadata, including preliminary testing with end-users

20 Type of resources
39 data providers: academic libraries, museums / cultural orgs, digital libraries, public library
1.1 million original DC records + 1.5 million derived from EAD

21 Variations in DC element usage
Records containing subject & description elements:
  Digital libraries (10 total, 122,719 records): SUBJECT 78%, DESCRIPTION 36%
  Museums, hist. societies, etc. (6 total, 255,800 records): 93%
  Academic libraries (7 total, 235,294 records): SUBJECT 15%, DESCRIPTION 13%
Many different controlled and local vocabularies in use
Granularity: a record may describe a collection of coins, or one coin

22 Excerpt of a metadata record describing a cotton coverlet
Description: Digital image of a single-sized cotton coverlet for a bed with embroidered butterfly design. Handmade by Anna F. Ginsberg Hayutin.
Source: Materials: cotton and embroidery floss. Dimensions: 71 in. x 86 in. Markings: top right hand corner has 1 1/2 in. x 1/2 in. label cut outs at upper left and right hand side for head board; fabric is woven in a variation of a rib weave; color each of yellow and gray; hand-embroidered cotton butterflies and flowers from two shades of each color of embroidery floss - blue, pink, green and purple and single top 20 in. bordered with blue and black cotton embroidery thread; stitches used for embroidery: running stitch, chain stitch, French knot and back stitches; selvage edges left unfinished; lower edges turned under and finished with large gray running stitches made with embroidery floss.
Format: Epson Expression 836 XL Scanner with Adobe Photoshop version 5.5; 300 dpi; 21-53K bytes. Available via the World Wide Web.
Coverage: -
Date Created: 2001-09-19 09:45:18; Updated: 20011107162451; Created: 2001-04-05; Created: 1912-1920?
Type: Image

23 Excerpt of a metadata record describing "American woven coverlet"
Description: Materials: Textile--Multi, Pigment--Dye; Manufacturing Process: Weaving--Hand, Spinning, Dyeing, Hand-loomed blue wool and white linen coverlet, worked in overshot weave in plain geometric variant of a checkerboard pattern. Coverlet is constructed from finely spun, indigo-dyed wool and undyed linen, woven with considerable skill. Although the pattern is simpler, the overall craftsmanship is higher than 1934.01.0094A. - D. Schrishuhn, 11/19/99 This coverlet is an example of early "overshot" weaving construction, probably dating to the 1820's and is not attributable to any particular weaver. -- Georgette Meredith, 10/9/1973
Source: -
Format: 228 x 169 x 1.2 cm (1,629 g)
Coverage: Euro-American; America, North; United States; Indiana? Illinois?
Date: Early 19th c. CE
Type: cultural; physical object; original

24 Implications
Service providers
  Automatically normalize metadata encoding where possible (e.g., dates)
  Normalize for and co-locate by type / format where possible
Metadata providers
  Create metadata for interoperability
  Consider more expressive schema, e.g., Qualified DC, MARC
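Date normalization by a service provider might look like the following Python sketch, which tries a few known encodings and leaves anything else alone. The list of formats and the helper name `normalize_date` are assumptions chosen to cover the date strings in the two coverlet records shown earlier; values such as "1912-1920?" or "Early 19th c. CE" are deliberately not guessed at.

```python
from datetime import datetime

# Encodings this sketch knows how to read; anything else is left unnormalized.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y%m%d%H%M%S", "%Y-%m-%d")


def normalize_date(value):
    """Return an ISO 8601 date (YYYY-MM-DD), or None if the value is not recognized."""
    text = value.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


samples = [
    "2001-09-19 09:45:18",
    "20011107162451",
    "2001-04-05",
    "1912-1920?",
    "Early 19th c. CE",
]
normalized = [normalize_date(s) for s in samples]
print(normalized)
```

Returning `None` rather than a best guess lets the service provider keep the original string for display while excluding it from date-based searching and sorting.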

25 Original interface
Portal had two search pages: simple (keyword) and advanced.

26

27 Pilot study with student teachers
23 users in honors-level C&I class
Assignment: Use the site in preparing a lesson plan (high school social studies)
Introduced to “aggregated metadata” concept
Focus group interviews conducted
Students’ papers examined
Transaction logs analyzed

28 Results of initial user testing
1. Users expected all links to point to digital objects
  Some records pointed to finding aids
  Some records pointed to collection’s web site
  Some records described analog objects
2. Users unable to make use of search results
  Simple searches produced 1000s of unranked results
  Advanced search (with limits) rarely used
3. Distinction between portal and data providers unimportant to users

29 What does “online access” mean?
To librarian & curator
To student teacher

30 Response to test results
EAD-derived records segregated
Analog-only collections excluded
Categories of resource types reduced to 3:
  Images and Video
  Text, Sheet Music, and Websites
  Museums and Archival Collections

31 Revised interface
Simple keyword & advanced search put on one page
Clarify “online access”
Natural language in Boolean operators

32 Revised search results
Link goes to finding aid or collection page? “Learn more.”
Link displays object? “View item.”
Subj/Desc expanded

33 IMLS Digital Collections & Content
Build a registry of all National Leadership Grant collections with digital content.
Assist and guide NLG projects in making item-level metadata sharable using OAI.
Build a repository and search & discovery tools for integrated access to the content of NLG collections (unique metadata schema?).
Research best practices for sharing metadata about diverse digital content and for supporting the interests of diverse user communities.

34 http://imlsdcc.grainger.uiuc.edu/

35 CIC OAI metadata harvesting
Univ. of Illinois at UC will host an OAI-PMH metadata harvesting service for 10 CIC libraries
Project Goals (3 year experimentation phase)
  Improve access to selected resources at CIC libraries
  Advertise these resources (internally & externally)
  Prepare member institutions for future grant-mandated OAI-based resource sharing
  Serve as a useful testbed for experimentation with OAI-PMH, development of metadata best practices, usability and user needs testing, etc.

36 Using OAI-PMH to Aggregate Metadata Describing Cultural Heritage Resources
http://dli.grainger.uiuc.edu/Publications/TWCole/ALA2003OAI/
Timothy W. Cole (t-cole3@uiuc.edu)
University of Illinois at Urbana-Champaign

