An Inquiry and Analysis of Metadata Utilization: A Case Study of MARC

1 An Inquiry and Analysis of Metadata Utilization: A Case Study of MARC
2005 ASIS&T Annual Meeting, November 1, 2005, Charlotte, North Carolina
William E. Moen
School of Library and Information Sciences
Texas Center for Digital Knowledge
University of North Texas
Denton, TX 76203

2 Two quality criteria
Fullness/completeness
Usefulness
Moen ASIS&T 2005 -- Charlotte, NC -- November 1, 2005

3 Context for the initial analysis
Z39.50 Interoperability Testbed project
An Institute of Museum and Library Services National Leadership Grant
Goal: Improve Z39.50 semantic interoperability among libraries for information access and resource sharing
Interoperability across library online catalogs
Indexing of MARC records to support searching
Richness of MARC content designation available
Inform indexing guidelines and policies

4 Indexing & MARC
Indexing Guidelines to Support Z39.50 Profile Searches (available on Z-Interop website)
Identified all MARC 21 fields/subfields that can contain author, title, or subject data
Author-related fields/subfields: 119
AuthorTitle-related fields/subfields: 21
Title-related fields/subfields: 253
Subject-related fields/subfields: 144

5 Z-Interop test dataset
Approximately 1% sample of MARC records from OCLC’s WorldCat database
Weighted sampling based on number of libraries “holding” the object represented by the record
419,657 total MARC records
89% of records “full level” cataloging
Formats represented in test dataset:
Books: 91%
Cartographic Materials: < 1%
Electronic resources: < 1%
Archival/Mixed Materials: < 1%
Sound recordings: 4%
Visual Materials: 1%
Serials: 3%
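The slide does not say how the holdings-based weighting was carried out. As an illustration only, the following is a minimal Python sketch of one common technique, weighted sampling without replacement using Efraimidis-Spirakis keys. The function name `weighted_sample`, the record identifiers, and the holdings counts are all hypothetical and are not taken from the Z-Interop project.

```python
import random


def weighted_sample(records, holdings, k, seed=None):
    """Draw up to k distinct records, favoring those with more holdings.

    Assumption: this uses Efraimidis-Spirakis keys (u ** (1 / weight)),
    which is one standard method; the project's actual procedure is
    not described in the slides.
    """
    rng = random.Random(seed)
    keyed = []
    for record, weight in zip(records, holdings):
        if weight <= 0:
            continue  # records held by no library can never be drawn
        key = rng.random() ** (1.0 / weight)
        keyed.append((key, record))
    # The k largest keys form the weighted sample.
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in keyed[:k]]


# Hypothetical record identifiers and holdings counts.
record_ids = ["rec-a", "rec-b", "rec-c", "rec-d", "rec-e"]
holding_counts = [1, 5, 10, 50, 100]
sample = weighted_sample(record_ids, holding_counts, k=3, seed=1)
```

Records held by many libraries are more likely to appear in `sample`, which is the general effect a holdings-weighted sample is meant to have.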

6 MARC 21 content designation

MARC 21 Field Groups   Currently Defined   Obsolete   Total   MARC 1972 (Books Format Only)
00x                    6                   1          7       3
0xx                    238                 7          245     28
1xx                    66                  1          67      40
2xx                    137                 32         169     15
3xx                    109                 32         141     4
4xx                    69                  0          69      37
5xx                    323                 38         361     8
6xx                    184                 5          189     66
7xx                    452                 47         499     41
8xx                    141                 20         161     36
TOTAL                  1725                183        1908    278

7 Content designation in dataset

MARC 21 Field Groups   Currently Defined   Obsolete   Unlikely Used   Total
00x                    6                   0          0               6
0xx                    96                  1          33              130
1xx                    49                  0          2               51
2xx                    81                  0          19              100
3xx                    23                  6          0               29
4xx                    10                  0          30              40
5xx                    128                 1          3               132
6xx                    104                 1          7               112
7xx                    205                 0          5               210
8xx                    105                 3          8               116
TOTAL                  807                 12         107             926

8 Summary frequency results

Frequency            # of Fields/Subfields   % of All Occurrences
> 600,000            1                       4.4%
500,000 - 599,999    0                       0%
400,000 - 499,999    13                      39.9%
300,000 - 399,999    6                       14.3%
200,000 - 299,999    6                       10.6%
100,000 - 199,999    10                      10.3%
TOTAL                36                      79.5%

Total number of field/subfield occurrences in dataset = 13,849,499
Only 4% of all fields/subfields account for 80% of all occurrences; equivalently, the remaining 96% account for only 20% of all occurrences
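The "4% account for 80%" observation is a cumulative-coverage calculation over frequency counts sorted in descending order. The sketch below shows that calculation in Python under stated assumptions: the function name `fields_needed_for_coverage` is hypothetical, and the counts are made-up illustrative values, not figures from the Z-Interop dataset.

```python
def fields_needed_for_coverage(counts, target=0.80):
    """Return the most frequent fields whose combined occurrences
    reach at least `target` of all occurrences, plus the share reached."""
    total = sum(counts.values())
    running = 0
    selected = []
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for field, occurrences in ranked:
        if running >= target * total:
            break  # coverage target already met
        selected.append(field)
        running += occurrences
    return selected, running / total


# Hypothetical occurrence counts, for illustration only.
example_counts = {
    "650 $a": 500,
    "040 $d": 320,
    "245 $a": 100,
    "300 $c": 50,
    "490 $v": 20,
    "511 $a": 10,
}
selected, share = fields_needed_for_coverage(example_counts)
```

With these made-up counts, two of the six fields already cover more than 80% of occurrences. For the slide's own figures, 36 fields/subfields out of the 926 that occur in the dataset is roughly 4%.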

9 Characteristics of top 36
Top 36 fields/subfields:
Most frequently occurring: 650 $a [Subject data]
2nd most frequently occurring: 040 $d [Cataloging source]
3rd & 4th most frequently occurring: 260 $a & $b [Publication information]
5th most frequently occurring: 245 $a [Title]
Contain data useful to end users: 28
Contain control numbers, etc.: 5
Contain data useful to catalogers: 3

10 Implications for indexing
537 fields/subfields contain author, title, subject data
381 of these actually occur in Z-Interop dataset
Total occurrences of the 381 = 4,397,712
19 of the 381 (5%) account for 80% of all occurrences
9 of 19 are subject-related
5 of 19 are author-related
5 of 19 are title-related
Preliminary testing using only 19 indexed fields: 95% - 100% of correct records retrieved!
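One plausible reading of "95% - 100% of correct records retrieved" is recall: the fraction of records found with the full index that are still found with the reduced 19-field index. The slide does not describe the test procedure, so the Python sketch below is an assumption-laden illustration. The function name `recall` and the record identifiers are hypothetical.

```python
def recall(retrieved_ids, relevant_ids):
    """Fraction of the relevant (correct) records that were retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    return len(relevant & set(retrieved_ids)) / len(relevant)


# Hypothetical search results: records returned by a full index
# (treated here as the correct set) versus a reduced index.
full_index_hits = set(range(1, 21))      # 20 correct records
reduced_index_hits = set(range(1, 20))   # 19 of them still retrieved
score = recall(reduced_index_hits, full_index_hits)
```

In this made-up case the reduced index misses one of twenty correct records, giving a recall of 0.95.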

11 The MCDU Project
The MARC Content Designation Utilization Project
What is the extent of catalogers’ use of content designation available in MARC 21?
Develop and implement systematic methods, procedures, and software tools to produce reliable and valid analysis of MARC 21 content designation use
MARC record as artifact of cataloging enterprise
FOR MORE INFORMATION, VISIT THE PROJECT WEBSITE…

12 The MCDU dataset & analysis
56 million MARC records: all WorldCat bib records
Parsed and stored in MySQL
20 databases
LC and Non-LC created records
10 databases each based on type of record/format
Frequency counts of all fields/subfields
Non-LC Book Format field occurrence results
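The frequency counts mentioned above amount to tallying every field and subfield occurrence across the parsed records. The project stored its records in MySQL. The Python sketch below swaps in a plain in-memory tally to show the counting step only. The record layout (a list of `(tag, [(subfield_code, value), ...])` pairs) and the function name `count_content_designation` are assumptions made for illustration, not the project's schema or tools.

```python
from collections import Counter


def count_content_designation(records):
    """Tally field and field/subfield occurrences across parsed records."""
    field_counts = Counter()
    subfield_counts = Counter()
    for record in records:
        for tag, subfields in record:
            field_counts[tag] += 1
            for code, _value in subfields:
                subfield_counts[f"{tag} ${code}"] += 1
    return field_counts, subfield_counts


# Two hypothetical, heavily simplified records.
sample_records = [
    [
        ("245", [("a", "Title one"), ("c", "Author A")]),
        ("650", [("a", "Cataloging")]),
        ("650", [("a", "Metadata")]),
    ],
    [
        ("245", [("a", "Title two")]),
        ("260", [("a", "Denton"), ("b", "Publisher")]),
    ],
]
field_counts, subfield_counts = count_content_designation(sample_records)
```

Calling `subfield_counts.most_common()` then gives a ranked list of the kind that frequency tables are built from.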

13 Making sense of the numbers
The numbers don’t stand on their own: contextualizing, qualifying, exploring, understanding
Metadata quality: Fullness/completeness
Identify core elements of bibliographic records based on the analysis of format-specific samples and compare with existing recommendations for core records
Metadata quality: Usefulness
Comparing the FRBR conceptual framework’s user tasks, MARC content designation supporting those tasks, and utilization of that content designation in the records

14 References
MARC Content Designation Utilization Project
Z39.50 Interoperability Testbed
