A Library Science Perspective on Digitization Bryan Heidorn University of Arizona.


1 A Library Science Perspective on Digitization Bryan Heidorn University of Arizona

2 Library-Museum Parallels
– Intellectual Property Rights
– Physical/Digital Objects Sharing
– Descriptive Metadata Formats
– Preservation Metadata
– Transport Metadata Formats
– Communication Protocols (not so much)
– Similar Digitization Workflow
– OCR Challenges

3 Intellectual Property Rights
– Expanded to 75 years in US from 25
– Academic Publishing anomalies
– Attribution required (data not so much)
– Decoupling of Data from Text

4 Online Computer Library Center (OCLC)
– Collaborative Automation of libraries including copy cataloging
– Started 1967
– Catalog 271 million items/year
– 72,000 libraries in 170 countries and territories use OCLC services to locate, acquire, catalog, lend and preserve library materials.

5 Descriptive Metadata Formats
– MARC(XML) 21 Standard
– METS
– Dublin Core (Interchange Format only)

6 Biodiversity Heritage Library Workflow Courtesy: Martin Kalfatovic Program Director, Biodiversity Heritage Library, Smithsonian Institution Libraries

7

8 MARC 21 Standard
Formats: Bibliographic, Authority, Holdings, Classification, Community
Bibliographic Material Types:
– Books (BK)
– Continuing resources (CR)
– Computer files (CF)
– Maps (MP)
– Music (MU)
– Visual materials (VM)
– Mixed materials (MX)
http://www.loc.gov/marc/

9 MARC Fields
00X: Control Fields
01X-09X: Numbers and Code Fields
Heading Fields - General Information
1XX: Main Entry Fields
20X-24X: Title and Title-Related Fields
25X-28X: Edition, Imprint, Etc. Fields
3XX: Physical Description, Etc. Fields
4XX: Series Statement Fields
5XX: Note Fields
6XX: Subject Access Fields
70X-75X: Added Entry Fields
76X-78X: Linking Entry Fields
80X-83X: Series Added Entry Fields
841-88X: Holdings, Location, Alternate Graphics, Etc. Fields
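The tag ranges on this slide can be read as a simple lookup table. The sketch below is illustrative only: `FIELD_GROUPS` and `field_group` are hypothetical names, not part of any MARC library, and the numeric bounds are an assumed reading of the "X" wildcards in the slide (for example, 20X-24X taken as 200 through 249).

```python
from typing import Optional

# (low, high, label) triples transcribed from the slide's tag ranges.
# Assumption: "X" stands for any digit, so "25X-28X" covers 250..289.
FIELD_GROUPS = [
    (1, 9, "Control Fields"),
    (10, 99, "Numbers and Code Fields"),
    (100, 199, "Main Entry Fields"),
    (200, 249, "Title and Title-Related Fields"),
    (250, 289, "Edition, Imprint, Etc. Fields"),
    (300, 399, "Physical Description, Etc. Fields"),
    (400, 499, "Series Statement Fields"),
    (500, 599, "Note Fields"),
    (600, 699, "Subject Access Fields"),
    (700, 759, "Added Entry Fields"),
    (760, 789, "Linking Entry Fields"),
    (800, 839, "Series Added Entry Fields"),
    (841, 889, "Holdings, Location, Alternate Graphics, Etc. Fields"),
]


def field_group(tag: str) -> Optional[str]:
    """Return the slide's group label for a 3-digit MARC tag, or None."""
    if len(tag) != 3 or not tag.isdigit():
        return None
    number = int(tag)
    for low, high, label in FIELD_GROUPS:
        if low <= number <= high:
            return label
    # Tags outside the ranges listed on the slide (e.g. 840) are left unclassified.
    return None


if __name__ == "__main__":
    for tag in ("001", "020", "245", "650"):
        print(tag, "->", field_group(tag))
```

Returning `None` for unlisted tags keeps the table faithful to the slide rather than guessing at ranges it does not mention.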

10 MARC Book Example
Leader/00-23 *****nam##22*****#a#4500
001
003
005 19920331092212.7
007/00-01 ta
008/00-39 820305s1991####nyu###########001#0#eng##
020 ##$a0845348116 :$c$29.95 (£19.50 U.K.)
020 ##$a0845348205 (pbk.)
040 ##$a[organization code]$c[organization code]
050 14$aPN1992.8.S4$bT47 1991
082 04$a791.45/75/0973$219
100 1#$aTerrace, Vincent,$d1948-
245 10$aFifty years of television :$ba guide to series and pilots, 1937-1988 /$cVincent Terrace.
246 1#$a50 years of television
260 ##$aNew York :$bCornwall Books,$cc1991.
300 ##$a864 p. ;$c24 cm.
500 ##$aIncludes index.
650 #0$aTelevision pilot programs$zUnited States$vCatalogs.
650 #0$aTelevision serials$zUnited States$vCatalogs.
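To make the layout of a variable field in this example concrete, here is a minimal sketch that splits one display line into tag, indicators, and subfields. It assumes the display convention used on the slide (three-digit tag, two indicator characters with `#` for blank, `$` plus a one-character code before each subfield). `MarcField` and `parse_field` are hypothetical names, and this is not how a real MARC parser reads the binary exchange format.

```python
import re
from typing import List, NamedTuple, Tuple


class MarcField(NamedTuple):
    tag: str                          # e.g. "245"
    indicators: str                   # e.g. "10" ("#" marks a blank indicator)
    subfields: List[Tuple[str, str]]  # e.g. [("a", "Fifty years of television :"), ...]


def parse_field(line: str) -> MarcField:
    """Split one display-form variable field into its parts.

    Limitation: a literal "$" followed by a digit or lowercase letter inside
    the data (such as the price "$29.95" in field 020) is indistinguishable
    from a subfield delimiter in this notation and would be split wrongly.
    """
    tag = line[:3]
    rest = line[3:].lstrip()          # tolerate "245 10$a..." and "24510$a..."
    indicators, body = rest[:2], rest[2:]
    # re.split with a capturing group yields ['', code1, value1, code2, value2, ...]
    parts = re.split(r"\$([a-z0-9])", body)
    subfields = [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]
    return MarcField(tag, indicators, subfields)


if __name__ == "__main__":
    title = parse_field(
        "24510$aFifty years of television :"
        "$ba guide to series and pilots, 1937-1988 /$cVincent Terrace."
    )
    print(title.tag, title.indicators)
    for code, value in title.subfields:
        print(f"  ${code} {value}")
```

The control fields (00X) carry no indicators or subfields, so this sketch applies only to the variable data fields from 010 upward.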

11 Difference between Museum and Library
– Full Darwin Core has parallels in MARC
– Many more commercial and custom products
– Larger installed base
– Library Entries somewhat more detailed
– There is a MARC(XML) and MARC Lite
– MARC differentiates among material types

12 Digital Content Transport METS – Metadata Encoding and Transmission Standard The METS schema is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language.

13 Courtesy: Martin Kalfatovic Program Director, Biodiversity Heritage Library, Smithsonian Institution Libraries

14 METS Components
– METS Header
– Descriptive Metadata
– Administrative Metadata
– File Section - The file section lists all files containing content which comprise the electronic versions of the digital object. <file> elements may be grouped within <fileGrp> elements, to provide for subdividing the files by object version.
– Structural Map
– Structural Links
– Behavior
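As a rough illustration of how the file section and structural map relate, the following sketch builds a skeletal METS document with Python's standard library. The element and attribute names (`fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`) come from the METS schema. The file names, the `USE="master"` grouping, the TIFF MIME type, and the helper `build_mets` are assumptions made for the example, and the header, descriptive, and administrative sections are omitted.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(name: str) -> str:
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_mets(page_files):
    """Build a skeletal METS tree: one fileGrp plus a page-level structMap."""
    root = ET.Element(q("mets"))
    file_sec = ET.SubElement(root, q("fileSec"))
    # One <fileGrp> per object version; "master" is an assumed label.
    group = ET.SubElement(file_sec, q("fileGrp"), {"USE": "master"})
    struct_map = ET.SubElement(root, q("structMap"), {"TYPE": "physical"})
    book = ET.SubElement(struct_map, q("div"), {"TYPE": "book"})
    for order, href in enumerate(page_files, start=1):
        file_id = f"FILE{order:04d}"
        file_el = ET.SubElement(group, q("file"), {"ID": file_id, "MIMETYPE": "image/tiff"})
        ET.SubElement(file_el, q("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href})
        # Each page <div> points back at its file through <fptr FILEID="...">.
        page = ET.SubElement(book, q("div"), {"TYPE": "page", "ORDER": str(order)})
        ET.SubElement(page, q("fptr"), {"FILEID": file_id})
    return root


if __name__ == "__main__":
    # Hypothetical scan file names for two pages.
    tree = build_mets(["page0001.tif", "page0002.tif"])
    print(ET.tostring(tree, encoding="unicode"))
```

The point of the sketch is the indirection: content files are listed once in the file section, and the structural map refers to them by ID rather than repeating their locations.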

15 I/O
– Submission Information Package (SIP), which is sent from the information producer to the archive;
– Archival Information Package (AIP), which is the information package actually stored by the archive;
– Dissemination Information Package (DIP), which is the information package transferred from the archive in response to a request by a consumer.

16 Courtesy: Martin Kalfatovic Program Director, Biodiversity Heritage Library, Smithsonian Institution Libraries

17 Open Archives Initiative Protocol for Metadata Harvesting
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP.

18 OAI Verbs
– GetRecord
– Identify
– ListIdentifiers
– ListMetadataFormats
– ListRecords
– ListSets

19 Get
http://arXiv.org/oai2?verb=GetRecord&identifier=oai:arXiv.org:cs/0112017&metadataPrefix=oai_dc
http://arXiv.org/oai2
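Since every OAI-PMH request is an HTTP request whose query string carries a verb plus its arguments, the URL on this slide can be assembled mechanically. The sketch below uses only the standard library. `build_oai_request` is a hypothetical helper name, and it builds the URL without sending anything over the network.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs.
OAI_VERBS = {
    "GetRecord",
    "Identify",
    "ListIdentifiers",
    "ListMetadataFormats",
    "ListRecords",
    "ListSets",
}


def build_oai_request(base_url: str, verb: str, **arguments: str) -> str:
    """Return the request URL for one OAI-PMH verb and its arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb!r}")
    # ":" and "/" are left unescaped here so the result matches the URL shown
    # on the slide; a stricter client would percent-encode them instead.
    query = urlencode({"verb": verb, **arguments}, safe=":/")
    return f"{base_url}?{query}"


if __name__ == "__main__":
    print(
        build_oai_request(
            "http://arXiv.org/oai2",
            "GetRecord",
            identifier="oai:arXiv.org:cs/0112017",
            metadataPrefix="oai_dc",
        )
    )
```

Validating the verb up front mirrors the protocol's small, closed vocabulary: anything outside the six verbs is an error before a request is ever made.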

20
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request verb="GetRecord" identifier="oai:arXiv.org:cs/0112017" metadataPrefix="oai_dc">http://arXiv.org/oai2</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv.org:cs/0112017</identifier>
        <datestamp>2001-12-14</datestamp>
        <setSpec>cs</setSpec>
        <setSpec>math</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:title>Using Structural Metadata to Localize Experience of Digital Content</dc:title>
          <dc:creator>Dushay, Naomi</dc:creator>
          <dc:subject>Digital Libraries</dc:subject>
          <dc:description>With the increasing technical sophistication of both information consumers and providers, there is increasing demand for more meaningful experiences of digital information. We present a framework that separates digital object experience, or rendering, from digital object storage and manipulation, so the rendering can be tailored to particular communities of users.</dc:description>
          <dc:description>Comment: 23 pages including 2 appendices, 8 figures</dc:description>
          <dc:date>2001-12-14</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
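A harvester would typically pull a handful of values out of a response like this one. The sketch below parses a trimmed copy of the response with the standard library. The element names (`responseDate`, `record`, `header`, `setSpec`, `dc:title`, and so on) are taken from the OAI-PMH 2.0 and Dublin Core schemas rather than from the slide transcript, and `parse_get_record` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed copy of the GetRecord response (descriptions and schema hints dropped).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request verb="GetRecord" identifier="oai:arXiv.org:cs/0112017"
           metadataPrefix="oai_dc">http://arXiv.org/oai2</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv.org:cs/0112017</identifier>
        <datestamp>2001-12-14</datestamp>
        <setSpec>cs</setSpec>
        <setSpec>math</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Using Structural Metadata to Localize Experience of Digital Content</dc:title>
          <dc:creator>Dushay, Naomi</dc:creator>
          <dc:subject>Digital Libraries</dc:subject>
          <dc:date>2001-12-14</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_get_record(xml_text: str) -> dict:
    """Extract header fields and a few Dublin Core values from a GetRecord response."""
    root = ET.fromstring(xml_text)
    record = root.find("oai:GetRecord/oai:record", NS)
    header = record.find("oai:header", NS)
    dc = record.find("oai:metadata/oai_dc:dc", NS)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        "sets": [s.text for s in header.findall("oai:setSpec", NS)],
        "title": dc.findtext("dc:title", namespaces=NS),
        "creator": dc.findtext("dc:creator", namespaces=NS),
    }


if __name__ == "__main__":
    for key, value in parse_get_record(SAMPLE).items():
        print(f"{key}: {value}")
```

The header (identifier, datestamp, set membership) is protocol-level information, while everything under `metadata` is in whichever format the `metadataPrefix` requested, here unqualified Dublin Core.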

21 Metadata Collection and Workflow (Macaw)

22 Physical/Digital Objects Sharing
– Books both part of an Edition and Unique
– 20th century books have standard front matter
– LMS contained Metadata Only
– Journals indexed by article
– Most digital content is commercially owned and born digital
– 2011 author-publishing exceeded commercial
– Born analog digitization (Google Books and BHL)

23 Governance
– Libraries pay for OCLC
– OCLC is Participatory
– Close Collaboration with Library of Congress on Standards
– School System exists to train librarians
– Libraries are being cut in academic, public and school sectors

