
1 OAI-PMH Projects in the University Library: Briefing for GSLIS Digital Library Fellows
Timothy W. Cole, Jenny Benevento, Muriel Foulonneau
University of Illinois at Urbana-Champaign
Grainger Library | 28 October 2005

2 Projects
IMLS Digital Collections & Content
  Funded by IMLS through September 2007
  Project Coordinator: Jenny Benevento
  Near-term: add LSTA-funded projects (incl. spring workshop); integrate collection- and item-level interfaces
OAI-CIC Collaborative Metadata Sharing Project
  Funded by CIC through June 2006
  Project Coordinator: Muriel Foulonneau
  Near-term: one-on-one usability testing; more research on enrichment of metadata; analysis of metadata

3 IMLS DCC – Context
Project began December 2002
Project objectives:
  Implement a collection registry of digital collections created or developed with funding from the IMLS NLG program
  Use OAI-PMH to implement an item-level metadata repository for content objects contained in NLG collections
  Carry out associated research related to:
    Utility and usability of the Registry & Repository
    Current metadata practices of IMLS NLG grantees
    Implications for interoperability (Framework of Guidance for Building Good Digital Collections)

4 Accomplishments to Date
Collection Registry & item-level repository publicly accessible:
Selected Research Publications to Date:
  Shreeves, S.L. & Cole, T.W. Developing a Collection Registry for IMLS NLG Digital Collections [Poster Abstract]. In DC-2003: Proceedings of the International DCMI Metadata Conference and Workshop p
  Cole, T.W. & Shreeves, S.L. Search and discovery across collections: The IMLS Digital Collections and Content Project. Library Hi Tech 22(3):
  Palmer, Carole L. and Ellen M. Knutson. Metadata practices and implications for federated collections. In Proceedings of the 67th Annual Meeting of the American Society for Information Science and Technology, edited by Linda Schamber & Carol L. Barry. Medford, NJ: Information Today, Inc:
  Shreeves, S.L., Knutson, E.M., Stvilia, B., Palmer, C.L., Twidale, M.B., & Cole, T.W. (2005). Is 'quality' metadata 'shareable' metadata? The implications of local metadata practice on federated collections. In H.A. Thompson (Ed.), Proceedings of the Twelfth National Conference of the Association of College and Research Libraries, April, Minneapolis, MN. Chicago, IL: Association of College and Research Libraries.

5 IMLS-DCC Collection Registry
108 primary NLG collection records
  Includes additional 40 records for sub-collections
  Also 29 associated / 61 physical collections
158 projects represented
Objects represented in registered collections
  Images: prints, posters, photographs
  Text: books, archival finding aids, newspapers
  Digital surrogates of physical objects: artifacts, specimens, textiles
  Sound: oral histories, sound files, wax cylinders
  Interactive resources
  Moving image
  Datasets

6 Collection-Level Description

10 IMLS-DCC Metadata Repository
30 collections represented
35 repositories harvested
~200,000 item-level records indexed
Harvest in Simple Dublin Core & MARC 21 XML
Testing with Qualified Dublin Core
Issues explored:
  How metadata best practices are being adopted
  Interface customization based on audience
  Local context versus global context
  Intentional creation and collection development of digital collections

11 Sample of Metadata Statistics http://imlsdcc. grainger. uiuc
Collection: Collection 1 | Collection 2 | Collection 3 | Collection 4
Number of records in collection: 27,444 | 14,425 | 1,599 | 35
Type of collection: Large collaborative digitization project | Large academic library | Small academic library and public library collaboration | Small academic library
% of records with DC element <title>: 100 99
% of records with DC element <creator>: 59 57 83
% of records with DC element <subject>: 97
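Per-element percentages like those in the statistics above can be computed directly from harvested records. The following is a minimal sketch, not the project's actual tooling: it assumes records are available as `oai_dc:dc` elements inside a single XML document, and `element_coverage` and the `SAMPLE` data are hypothetical names invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard namespaces for unqualified Dublin Core as carried by OAI-PMH.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def element_coverage(xml_text, elements=("title", "creator", "subject")):
    """Return {element: percent of records with at least one non-empty value}."""
    root = ET.fromstring(xml_text)
    records = root.findall(f".//{{{OAI_DC_NS}}}dc")
    counts = Counter()
    for record in records:
        for name in elements:
            values = record.findall(f"{{{DC_NS}}}{name}")
            # An element that is present but blank does not count as used.
            if any((v.text or "").strip() for v in values):
                counts[name] += 1
    total = len(records)
    return {n: (100.0 * counts[n] / total if total else 0.0) for n in elements}


# Hypothetical sample data: four records wrapped in an arbitrary root element.
SAMPLE = """<records xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <oai_dc:dc><dc:title>Map</dc:title><dc:creator>Unknown</dc:creator></oai_dc:dc>
  <oai_dc:dc><dc:title>Oral history</dc:title><dc:subject>Farming</dc:subject></oai_dc:dc>
  <oai_dc:dc><dc:title>Poster</dc:title><dc:creator> </dc:creator></oai_dc:dc>
  <oai_dc:dc><dc:title>Photograph</dc:title></oai_dc:dc>
</records>"""

coverage = element_coverage(SAMPLE)
```

Counting only non-empty values is a design choice of this sketch; a repository that emits empty `<creator>` elements would otherwise look better populated than it is.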

16 Plans for next 2 years
IMLS has extended grant through 2007
Will add a handful of Illinois LSTA-funded projects in 2006
Workshop spring 2006 for LSTA projects
More LSTA projects added in 2007
Other themes for extension phase
  Integration of item-level & collection-level services
  Metadata normalization, transformation, and enrichment
  Interface & metadata design for targeted audiences (GEM)
  Collection identity & metadata granularity
  Knowledge diffusion of metadata best practices

17 IMLS DCC project contact information
Tim Cole
  PI, IMLS Digital Collections and Content
  University of Illinois Library at Urbana-Champaign
Jenny Benevento
  Project Coordinator, IMLS Digital Collections and Content
  University of Illinois at Urbana-Champaign

18 OAI-CIC collaborative metadata sharing project
Sept – June 2006
Participants
  19 data providers in 10 CIC institutions; University of Iowa has 2 repositories in test
  From 350,000 to 520,000 resources
Objectives
  Reinforcing the regional collaboration
  Investigating metadata shareability
  OAI-PMH as technical infrastructure for metadata sharing
  A Web interface to CIC aggregated material

19 Repository behavior
By collection + regular increase
By collection only
Regular increase
No change
Dead

20 Configuration of harvests
Handle deleted records
Harvest set descriptions
3 metadata formats: UDC, QDC, MODS
Different degrees of XML validation
Harvest either by set or full repository
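Two of the harvest options above, harvesting by set and handling deleted records, correspond to standard OAI-PMH features: the `set` argument of a `ListRecords` request and the `status="deleted"` attribute on a record header. The sketch below uses only the Python standard library and is not the project's harvester. The helper names `list_records_url` and `split_deleted`, the base URL `http://example.org/oai`, and the sample response are assumptions made for illustration. `oai_dc` is the prefix every OAI-PMH repository must support for unqualified Dublin Core; prefixes for QDC and MODS vary by repository and are not given in the slides.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request for a whole repository or a single set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # omit to harvest the full repository
    return f"{base_url}?{urlencode(params)}"


def split_deleted(response_xml):
    """Return (live_ids, deleted_ids) found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    live, deleted = [], []
    for header in root.iter(f"{{{OAI_NS}}}header"):
        identifier = header.findtext(f"{{{OAI_NS}}}identifier")
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    return live, deleted


# Hypothetical response fragment with one live and one deleted record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header>
      <identifier>oai:example.org:1</identifier>
    </header></record>
    <record><header status="deleted">
      <identifier>oai:example.org:2</identifier>
    </header></record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai", "oai_dc", "maps")
live_ids, deleted_ids = split_deleted(SAMPLE_RESPONSE)
```

An aggregator would typically remove the identifiers in `deleted_ids` from its local index; whether a repository reports deletions at all depends on the deleted-record policy it declares.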

21 We have particularly encouraged
Use of rich metadata formats
Division of repositories into sets
Creation of set descriptions
The use of resumption tokens
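Resumption tokens let a repository return a large result list in pages: each partial response carries a `resumptionToken`, and the harvester repeats the request with that token until the token is empty or absent. A minimal sketch of that loop follows. To keep it runnable without a network, the HTTP call is replaced by an injected `fetch_page` function; `harvest_identifiers`, `fake_fetch`, and the two sample pages are hypothetical.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def harvest_identifiers(fetch_page):
    """Collect record identifiers across all pages of a ListRecords harvest.

    fetch_page(token) must return the response XML as a string; token is
    None for the initial request and the resumption token afterwards.
    """
    identifiers = []
    token = None
    while True:
        root = ET.fromstring(fetch_page(token))
        for header in root.iter(f"{{{OAI_NS}}}header"):
            identifiers.append(header.findtext(f"{{{OAI_NS}}}identifier"))
        token_el = root.find(f".//{{{OAI_NS}}}resumptionToken")
        # A missing or empty token means the list is complete.
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            return identifiers


def _page(identifier, token_xml):
    return (
        f'<OAI-PMH xmlns="{OAI_NS}"><ListRecords>'
        f"<record><header><identifier>{identifier}</identifier></header></record>"
        f"{token_xml}</ListRecords></OAI-PMH>"
    )


# Hypothetical two-page repository keyed by the token that requests each page.
PAGES = {
    None: _page("oai:example.org:1", "<resumptionToken>page-2</resumptionToken>"),
    "page-2": _page("oai:example.org:2", "<resumptionToken/>"),
}


def fake_fetch(token):
    return PAGES[token]


all_ids = harvest_identifiers(fake_fetch)
```

In a real harvester, `fetch_page` would issue an HTTP GET with `verb=ListRecords` plus either the initial arguments or `resumptionToken` alone, since the protocol treats the token as an exclusive argument.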

22 Areas of ongoing research
Clustering
Topicality?
Multiple levels of granularity
Multiple metadata formats
Data normalization and enrichment
Collection definition
The resources behind the URLs

23 Workflow: 3 streams
[Diagram labels: Rebuild tables and indexes; Normalize Data; Harvest; SQL database archive; Institution / repository; Rebuild DLXS Indexes; Classify for DLXS; Institution / type; Institution / collection; Digital only]

24 Additional workflow for collections
[Diagram labels: Item database; merge; Collections.xml; SQL db; enrich; Harvest set descriptions; DC Coll. records; DLXS; MARC records; Grab thumbshots of collection homepages]

25 Types of reprocessing
Selection
Cleaning
Normalization
Augmentation
Customization for ingest in applications
Looking at ways to reprocess metadata to support specific services / functionalities for end-users
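As one illustration of the normalization step, free-text type values from data providers can be mapped onto the DCMI Type Vocabulary while the original value is kept alongside the normalized one. This is a sketch under stated assumptions, not the rules the project applied: the mapping table `HYPOTHETICAL_MAPPING` and the function `normalize_type` are invented for this example; only the DCMI Type terms themselves are standard.

```python
# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# Hypothetical local mapping from provider values to DCMI Type terms.
HYPOTHETICAL_MAPPING = {
    "photograph": "StillImage",
    "photographs": "StillImage",
    "poster": "StillImage",
    "book": "Text",
    "newspaper": "Text",
    "oral history": "Sound",
    "dataset": "Dataset",
}


def normalize_type(raw_value):
    """Map a provider-supplied type string to a DCMI Type term, if known.

    The original value is always retained so the provider's wording is
    not lost; 'normalized' is None when no mapping applies.
    """
    key = " ".join(raw_value.lower().split())  # trim and collapse whitespace
    normalized = HYPOTHETICAL_MAPPING.get(key)
    return {"original": raw_value, "normalized": normalized}


result = normalize_type("  Oral   History ")
```

Returning both values rather than overwriting the original keeps the provenance of each value visible, at the cost of more verbose records.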

26 Metadata processing by DL function
FIND
CO-LOCATE

27 More DL functions
CO-LOCATE
SELECT
INTERPRET
OBTAIN / IDENTIFY

28 Faceted access points

29 Potential objectives & obstacles
Objectives / Benefits
  Experiment with a faceted interface
  Should maximize utility of metadata reprocessing
  Should spare data providers (DPs) from repeating values for multiple purposes
  Retain provenance of metadata elements
Problems
  Verbose enriched records with a lot of redundancy
  Not sure how to share this back out with others

30 Resources behind the URLs
[Diagram: repeated sample records reading <title>My resource</title> <date>04, one of them alongside a "404 Page not found" response]

31 Thumbnails
Metadata schema enrichment
  One element from the Picture Australia schema
The Thumbgrabber application
  Thumbnails and thumbshots – currently 35,000
How to convey information?
  Jump-off pages
  Additional metadata record?
Same problems trying to grab full content

32 Integrated Access to CIC Metadata
4 views / filtering

33 Geographic access to resources

34 Experiment to use collection level descriptions

35 Collection-enabled functions
Co-locating resources
Grouping results
Browsing collections
Filtering results
Selecting relevant search results
Interpret item description
Search granularity
  Search collections only
  Search items with collection information

36 Adding context

37 Filtering / selecting

38 Searching / selecting source / co-locating

39 Suggesting complementary resources

40 Match cases for multiple-term queries
Case | Item desc. | Collection desc.
Case A | Part of Query | Rest of Query
Case B | No match | All of Query
Case C, Case D, Case E, Case F, Case G, Case H, Case J
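The two cases spelled out above (part of the query matched by the item description with the rest matched by the collection description, and no item match with the whole query matched by the collection description) suggest a simple per-query classification. The following is a minimal sketch of that idea, not the method used in the study: it assumes plain word-level matching, and `tokens`, `match_level`, and `classify_query` are hypothetical names. It does not attempt to reproduce the definitions of the remaining cases, which are not recoverable from the slide.

```python
import re


def tokens(text):
    """Lower-case word set of a string (naive tokenization, an assumption)."""
    return set(re.findall(r"\w+", text.lower()))


def match_level(query_terms, description):
    """Classify how much of the query a description matches."""
    hits = query_terms & tokens(description)
    if not hits:
        return "no match"
    if hits == query_terms:
        return "full match"
    return "partial match"


def classify_query(query, item_desc, collection_desc):
    """Report item-level and collection-level match for a multi-term query."""
    terms = tokens(query)
    missing_from_item = terms - tokens(item_desc)
    return {
        "item": match_level(terms, item_desc),
        "collection": match_level(terms, collection_desc),
        # True when the collection description supplies every query term
        # that the item description lacks.
        "collection_covers_rest": bool(missing_from_item)
        and missing_from_item <= tokens(collection_desc),
    }


# Item matches part of the query; the collection matches the rest.
case_a_like = classify_query(
    "civil war letters", "Letters from a soldier", "Civil War collection"
)
# Item matches nothing; the collection matches the whole query.
case_b_like = classify_query(
    "quilts", "Map of Chicago", "Illinois quilts and textiles"
)
```

A production search system would likely add stemming and phrase handling; exact word matching is used here only to keep the classification easy to follow.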

41 Test with real-life queries
Case | # of queries with at least 1 item-level match of the case | % of queries with at least 1 item-level match of the case
Case A | 287 | 17.00%
Case B | 21 | 1.20%
Case C | 761 | 45.10%
Case D | 25 | 1.50%
Case E | 20 |
Case F | 222 | 13.10%
Case G | 1,639 | 97.00%
Case H | 940 | 55.70%
Case J | 945 | 56.00%
Legend: Partial match; Rest of match; Full match; No match

42 Outputs to date
Contributions to DLF-NSDL best practices
Articles
  JCDL 2005 – Denver: Using Collection Descriptions to Enhance an Aggregation of Harvested Item-Level Metadata
  ECDL Vienna: Strategies for reprocessing aggregated metadata
  ICDAT Taipei: Metadata aggregation for digital libraries
  D-Lib – Jan 2006: Automated capture of thumbnails and thumbshots for use by metadata aggregation services
  Sci Tech Lib: The CIC metadata portal: A collaborative effort in the area of digital libraries

43 DLXS interface
Integration of collection – item features in DLXS
Collection level descriptions
And classification by Type and / or Subject?
Faceted search
Additional access points
OAI provider repository
Federated search target

44 Metadata
Topicality
  At collection level
  Automatic classification (might work better with full content)
Granularity
  A format and guidelines for collection / set descriptions?
  Can collections live without items?
  What about Websites -> descriptions?
  Is there anything we can do with EAD files?
=> the DLF-NSDL best practices for shareable metadata

45 Usability testing
Quantitative information on discoverability
  Using OAIster logs
User testing
  Indiana University contributed to building a plan to test:
    Collections and context
    Thumbnails

46 CIC metadata project contact information
Tim Cole
  UIUC PI for OAI-CIC metadata sharing project
  University of Illinois Library at Urbana-Champaign
Muriel Foulonneau
  Project Coordinator, OAI-CIC metadata harvesting service
  University of Illinois at Urbana-Champaign

