Metadata for Digitization and Preservation. Introduction What is metadata and why it matters The key elements How metadata is created Where metadata is.

1 Metadata for Digitization and Preservation

2 Introduction What is metadata and why it matters The key elements How metadata is created Where metadata is stored Metadata standards How much will it cost?

3 What is metadata? Tony Gill – ARTstor Metadata refers to structured descriptions, stored as computer data, that attempt to describe the essential properties of other discrete computer data objects. Big picture definition: the sum total of what can be said about any information object at any level of aggregation

4 What is metadata for? The World Wide Web Consortium says metadata serves to provide a means to discover that the data set exists and how it might be obtained or accessed, and to document the content, quality, and features of a data set, indicating its fitness for use. Therefore we need to think about content, context and structure

5 Why Does Metadata Matter? "Doing research on the Web is like using a library assembled piecemeal by packrats and vandalized nightly." – R. Ebert, Internet Life Finding the needle in the haystack Managing 1000s of identical-looking needles Finding visual materials without viewing them Expanding use Preserving content and context

6 Key Elements Administrative Metadata – used in managing and administering information resources Descriptive Metadata – used to describe or identify information resources Preservation Metadata – related to the preservation management of information resources Technical Metadata – related to how a system functions or metadata behave Use Metadata – related to the level and type of use of information resources

7 Structure of metadata [Diagram: a hierarchy with three levels, Collection, Work and Item]

8 How metadata is created By software tools From resource content e.g. catalogues or databases From creation tool e.g. digital camera or file header By human intervention Description by resource creator/owner Description by third party provider e.g. technical metadata Creating and maintaining good metadata is time consuming and high cost

9 Where metadata is stored Embedded in the resource EXIF information with TIFF images – viewable in Photoshop File headers or invisible copyright watermarking Linked to resource Created as record in database format

10

11 Metadata Standards Dublin Core http://vads.ahds.ac.uk/guides/creating_guide/sect43.html DIG35 – for technical metadata www.i3a.org/I_dig35.html Categories for the Description of Works of Art (CDWA) www.getty.edu/research/institute/standards/cdwa/ Visual Resources Association Core Categories www.vraweb.org/ SEPIA working group www.knaw.nl/ecpa/sepia/workinggroups/wp5/cataloguing.html Resource Description Framework (RDF) Encoded Archival Description (EAD)

12 How much will it cost? How long is a piece of string? Depends upon the stop points There is no one-size-fits-all or one-cost framework Depends upon the description already in place and how well the collection is currently indexed In-house measurement Balance skill, time, and automation Photographs – descriptive metadata will not take 30 minutes

13 Traditional Functions Traditionally we applied these functions to: Paper based and microform based information resources Monographs, serials, photographs, etc. Access provided through local library services Including inter-library loan

14 New Functions Apply these functions to: Web documents, online serials, digital images, digital collections, web sites, digital audio and video, born digital material, etc. Access provided via the web and email

15 Why are these digital objects different? Information explosion Multiple versions Instant access Less physical control over collection Some are surrogates Increased user expectations Preservation is more complex

16 Why do we need metadata to do these things? Provides the necessary tools to manage, preserve and provide access to information in the digital environment Our jobs have not fundamentally changed; but our collections have and our users have

17 What is metadata? Metadata is data that facilitates the management, description, and preservation of a digital object or aggregation of digital objects. The creation of metadata is governed by a body of standards, best practices and schemas that, when appropriately applied, work together to facilitate the management, description, and preservation of digital objects.

18 Types of metadata Descriptive Technical Structural Administrative Preservation

19 About Metadata Sets Encoding standards/schema Metadata set = rules Encoding schema = representation

20 Metadata Sets AACR2 Dublin Core Visual Resources Association Metadata Object Description Schema (MODS) Text Encoding Initiative Encoded Archival Description

21 Encoding Standards/Schema HTML MARC Metadata Encoding and Transmission Standard (METS) Resource Description Framework (RDF) XML Z39.50

22 Choosing Sets and Schema: Interoperability Why is interoperability important? How is it achieved? Crosswalks/mapping Standardization Schema Controlled vocabulary Open Archives Initiative (OAI) Common elements harvested and made searchable from one interface Very basic level of description, working to develop it to make it better

23 Choosing an Encoding Schema The more digitized objects you have; the more complex they are; the more data sharing you do; the more important it will be to utilize an encoding schema XML is the most prevalent encoding schema All metadata schema have XML based encoding schema already available

24 Factors in Metadata Decisions for Digitization Projects Audience Workflow and Timelines Preservation Interoperability Number of and complexity of digitized objects

25 What Do You Want To Do? Digitize for access only? Descriptive Some administrative Digitize for preservation? Descriptive Administrative Technical Eventually preservation

26 What Materials Are You Digitizing? The more complex the material, the more complex your metadata Structural metadata becomes vital For example….

27 Complex Digital Objects Original = 150 page book with 7 chapters Digitization results in 4 versions of the same content 150 master TIFF images 150 JPEG access images 150 JPEG thumbnail images 7 ASCII text transcripts (one per chapter) Files to manage = 457
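
The file count on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the dictionary keys are hypothetical labels chosen for illustration, not names from the presentation.

```python
# Inventory for the 150-page, 7-chapter book described on the slide.
PAGES = 150
CHAPTERS = 7

# Hypothetical labels for the four digital versions of the same content.
versions = {
    "master_tiff": PAGES,          # one master TIFF per page
    "access_jpeg": PAGES,          # one access JPEG per page
    "thumbnail_jpeg": PAGES,       # one thumbnail JPEG per page
    "ascii_transcript": CHAPTERS,  # one text transcript per chapter
}

total_files = sum(versions.values())
print(total_files)  # 457
```

Three page-level versions plus seven chapter-level transcripts give 3 x 150 + 7 = 457 files, which is why structural metadata is needed to keep track of them.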

28 Complex Digital Objects and Structure Which images belong in which chapter? Which digital version is which? Where is chapter 3 in each version? There is technical metadata for each digital version AND each digital file. How do we relate the correct metadata to the correct version/file?

29 Digitization and Metadata Descriptive metadata for access and administration Technical metadata for preservation Structural metadata for control over complex digitized objects Preservation metadata for management within a digital archive

30 Descriptive Metadata Information users will have to gain access to the digitized material Should facilitate access to the original source material whenever possible Access via a web interface search engine User friendly Standardized Well written

31 Common Descriptive Metadata Sets for Digitization Projects Visual Resources Association Metadata Object Description Schema (MODS) Encoded Archival Description Text Encoding Initiative Dublin Core MARC

32 Choosing a Set Should we use MARC? Integrated into existing work Rules for creation already exist Less technical infrastructure necessary Complex – more training Time consuming Should we use something else? Collaborating? Interoperability concerns? Staff expertise Size of project Exhibit and web access

33 Choosing a Schema Can we use both? MARC for collection level Metadata for item level MARC for all Crosswalked to web accessible database Database for all Crosswalked to MARC

34 Implementation What informational elements do you need? List them, making sure to think through web design, audience and access issues What descriptive schema will you use? MARC Dublin Core VRA MODS

35 Implementation Build database or implement content management system for metadata storage Map the fields to the schema you have chosen Document the mapping Create Style Guide for your project Staff creates the metadata manually according to Style Manual and established work processes Metadata is reviewed for quality
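
The "map the fields to the schema" and "document the mapping" steps could be sketched as a small lookup table. Everything here is an assumption for illustration: the local field names (`photo_title`, `photographer`, and so on) are hypothetical, and Dublin Core is used only because it is one of the schemas the slides mention.

```python
# Hypothetical local database fields mapped to Dublin Core element names.
# Keeping the mapping in one table also serves as its documentation.
FIELD_TO_DC = {
    "photo_title": "title",
    "photographer": "creator",
    "caption": "description",
    "date_taken": "date",
    "copyright_note": "rights",
}

def to_dublin_core(record):
    """Split a local record into mapped Dublin Core values and unmapped field names."""
    mapped, unmapped = {}, []
    for field, value in record.items():
        element = FIELD_TO_DC.get(field)
        if element is None:
            unmapped.append(field)  # report the gap instead of dropping it silently
        else:
            mapped.setdefault(element, []).append(value)  # elements are repeatable
    return mapped, unmapped

# Illustrative record; the values are made up.
sample = {"photo_title": "Harbor at dawn", "photographer": "Unknown", "shelf_no": "B12"}
mapped, unmapped = to_dublin_core(sample)
```

Returning the unmapped field names gives the quality review step something concrete to check.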

36 Implementation Metadata is stored and made web accessible XML (if supported) Back-ups, master metadata record, and/or web access

37 Dublin Core Title Creator Subject/Keywords Description Publisher Contributor Date Audience Resource Type Format Resource Identifier Source Language Relation Coverage Rights Management
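
As a rough illustration of how a few of these elements look when encoded in XML, the sketch below uses Python's standard `xml.etree.ElementTree` and the Dublin Core element namespace. The `record` wrapper element and the two subject values are assumptions made for the example; the title, creator and type values are borrowed from the METS example later in the deck.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

def build_dc_record(fields):
    """Build an XML element holding one dc:* child per value."""
    root = ET.Element("record")  # hypothetical wrapper, not part of Dublin Core
    for element, values in fields.items():
        for value in values:  # every element may repeat
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return root

record = build_dc_record({
    "title": ["Alice's Adventures in Wonderland"],
    "creator": ["Lewis Carroll"],
    "subject": ["Fantasy fiction", "Children's stories"],  # illustrative values
    "type": ["text"],
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Storing values as lists keeps the example consistent with the Dublin Core convention that elements are optional and repeatable.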

38 Characteristics of the Dublin Core All elements optional All elements repeatable All elements displayable in any order Extensible (a starting place for richer description) International

39 Extensibility Refining mechanism for elements improve sharpness of description with qualifiers Means for extending element set complementary packages of other types of metadata (administrative, rights management, discipline-specific, etc)

40 Technical Metadata Information about a file that facilitates management and preservation of the file Technical information about: Master file (TIFF) Scanning specifications (resolution, bit depth, etc) Derivative Storage – compression

41 NISO Metadata Purpose: To define a standard set of metadata elements for digital images Facilitate interoperability Support long term management of and continuing access to digital images

42 Tagged Image File Format – Background and Metadata TIFF is a specification for a file format Spec includes a directory or header section which consists of several metadata fields A TIFF can consist of several images Directory/Header information is unique for each image
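
The directory structure described on this slide can be explored with a short parser. This is a simplified sketch that assumes the baseline TIFF layout (an 8-byte header followed by a chain of image file directories with 12-byte entries) and reads only single SHORT values for four well-known tags. The demo bytes are a hand-built stand-in with no pixel data, not a viewable image.

```python
import struct

# A few baseline TIFF tag numbers.
TAG_NAMES = {256: "ImageWidth", 257: "ImageLength", 258: "BitsPerSample", 259: "Compression"}
SHORT = 3  # TIFF field type for 16-bit unsigned integers

def read_tiff_directories(data):
    """Return one {tag_name: value} dict per image directory (IFD)."""
    order = {b"II": "<", b"MM": ">"}[data[:2]]  # little- or big-endian marker
    magic, offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    directories = []
    while offset:  # an offset of 0 ends the chain of directories
        (count,) = struct.unpack_from(order + "H", data, offset)
        tags = {}
        for i in range(count):
            tag, ftype, n, raw = struct.unpack_from(order + "HHI4s", data, offset + 2 + 12 * i)
            if tag in TAG_NAMES and ftype == SHORT and n == 1:
                tags[TAG_NAMES[tag]] = struct.unpack(order + "H", raw[:2])[0]
        directories.append(tags)
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * count)
    return directories

def _ifd(entries, next_offset):
    """Pack a little-endian directory of single SHORT values (demo helper)."""
    body = struct.pack("<H", len(entries))
    for tag, value in entries:
        body += struct.pack("<HHIH2x", tag, SHORT, 1, value)
    return body + struct.pack("<I", next_offset)

# Two directories: the first starts at byte 8 and is 2 + 2*12 + 4 = 30 bytes long,
# so the second starts at byte 38.
demo = (b"II" + struct.pack("<HI", 42, 8)
        + _ifd([(256, 600), (257, 800)], 38)
        + _ifd([(256, 150), (257, 200)], 0))
directories = read_tiff_directories(demo)
print(directories)
```

Each dictionary in the result corresponds to one image in the file, which mirrors the point that directory information is unique for each image.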

43 Tagged Image File Format – Background and Metadata The TIFF spec is implemented differently by different applications Scanning software Usually bundled with your scanner Controls the scanner or camera and passes information to computer storage or image editing software Outputs image files in specific image file formats Determines what flavor TIFF is produced Determines what metadata fields are utilized and how they are utilized

44 Tagged Image File Format – Background and Metadata Other software, such as Photoshop, may add to the TIFF metadata Tags can be added using particular software: TiffKit (no longer supported) Black Ice Software Development Kit Captiva's InputAccel Others

45 Technical Metadata -- Options Options? Use as much as you can; create manually using database and/or XML based on the NISO draft and the LC encoding schema or Use DC: Format element

46 Using DC Format for Technical Metadata Elements File size Quality (bit depth, resolution) Extent (pixel dimensions, play time, pagination) Compression Checksum value (error detection) Object producer (name of scanning technician, vendor who scanned) Creation Hardware (digital camera, flatbed scanner, etc.) Creation Software (name and version)
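
Some of these elements, such as file size and checksum, can be gathered automatically. The following is a minimal sketch using only the Python standard library; the dictionary keys, the choice of SHA-256 and the placeholder producer and software strings are assumptions for illustration, since the slides do not prescribe field names or a checksum algorithm.

```python
import hashlib
import os
import tempfile

def technical_metadata(path, producer, software):
    """Collect a few technical metadata elements for one file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large master files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "file_size_bytes": os.path.getsize(path),
        "checksum_sha256": digest.hexdigest(),  # used later for error detection
        "object_producer": producer,
        "creation_software": software,
    }

# Demo on a throwaway file standing in for a scanned image.
payload = b"example image bytes"
with tempfile.NamedTemporaryFile(delete=False, suffix=".tif") as tmp:
    tmp.write(payload)
meta = technical_metadata(tmp.name, "Scanning technician (placeholder)", "Scanner software 1.0 (placeholder)")
os.remove(tmp.name)
```

Recomputing the checksum at a later date and comparing it with the stored value is one way to detect that a file has changed or become corrupted.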

47 Encoding: METS Metadata Encoding and Transmission Standard Product of Making of America project Digital Library Federation Initiative Provides an XML schema for encoding metadata necessary for: management of digital library objects exchange of those objects (OAIS) Brings all the metadata together

48 Encoding: METS Five Sections of a METS document Descriptive Administrative File Group Structural Map Behavior
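
To make the five sections concrete, here is a bare skeleton built with Python's standard `xml.etree.ElementTree`. It is a sketch only: the IDs, the file name `page001.tif` and the chapter label are hypothetical, the metadata sections are left empty, and the result has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

def m(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"

mets = ET.Element(m("mets"))

# 1. Descriptive and 2. administrative metadata (empty placeholders here).
ET.SubElement(mets, m("dmdSec"), ID="DMD1")
ET.SubElement(mets, m("amdSec"), ID="AMD1")

# 3. File group: one group per digital version of the object.
file_sec = ET.SubElement(mets, m("fileSec"))
masters = ET.SubElement(file_sec, m("fileGrp"), USE="master")
page = ET.SubElement(masters, m("file"), ID="FILE001", MIMETYPE="image/tiff")
ET.SubElement(page, m("FLocat"), {f"{{{XLINK_NS}}}href": "page001.tif", "LOCTYPE": "URL"})

# 4. Structural map: ties intellectual units (chapters) to files by ID.
struct_map = ET.SubElement(mets, m("structMap"))
chapter = ET.SubElement(struct_map, m("div"), TYPE="chapter", LABEL="Chapter 1")
pointer = ET.SubElement(chapter, m("fptr"), FILEID="FILE001")

# 5. Behavior section (empty placeholder).
ET.SubElement(mets, m("behaviorSec"))

section_names = [child.tag.split("}")[1] for child in mets]
print(section_names)
```

The `fptr` element refers to the file by its `ID`, which is how a structural map answers questions such as which images belong to which chapter.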

49 Encoding: METS Five Sections Descriptive Metadata may point to descriptive metadata external to the METS document (e.g. a MARC record) may embed the descriptive metadata within the METS document

50 Encoding: METS Alice's Adventures in Wonderland Lewis Carroll between 1872 and 1890 McCloughlin Brothers text MDI0ODdjam0gIDIyMDA1ODkgYSA0NU0wMDAxMDA...(etc.)

51 Encoding: METS Five Sections Administrative Metadata information regarding file creation and storage; intellectual property metadata; metadata regarding the original; information regarding provenance of the digital object (technical metadata); may be external or internally encoded

52 Encoding: METS image/tiff LZW 8 1 NYU Press http://dlib.nyu.edu/press/testimg.tif

53 Encoding: METS Five Sections File Groups used to group together related files One file group lists all of the files which comprise a single electronic version of the digital library object Master document (TIFF) Access copy or copies Perhaps a transcript

54 Encoding: METS http://dlib.nyu.edu/tamwag/beame.xml

55 Encoding: METS http://dlib.nyu.edu/tamwag/beame.wav

56 Encoding: METS http://dlib.nyu.edu/tamwag/beame.mp3

57 Encoding: METS Five Sections Structural Map outlines the intellectual structure of the content of the digital resource

58 Encoding: METS

59 Encoding: METS

60 Encoding: METS Five Sections Behavior used to associate executable behaviors with content defines the behaviors can contain executable code to run the behaviors

61 METS: Encoding

62 Preservation Metadata If you are digitizing with preservation in mind, ALL metadata is preservation oriented Metadata must be of the highest quality that is possible Incorporate the creation and management of metadata into your project at the planning stage

63 Preservation Metadata Designed to facilitate the process of preservation and management in a digital repository Generally implemented at the time a digital resource is moved to a digital archive Several schemas under development for particular operating environments and/or programs

64 Preservation Metadata Sets CEDARS – Consortium of University Research Libraries, Exemplars in Digital Archives project www.leeds.ac.uk/cedars/guideto/metadata/ NLA -- National Library of Australia www.nla.gov.au/preserve/pmeta.html NEDLIB – Networked European Deposit Library www.kb.nl/coop/nedlib/results/D4.2/D4.2.htm OCLC Digital Archive www.oclc.org/digitalarchive/about/works/metadata/

65 Preservation Metadata Inference that there is a core of metadata necessary for preservation regardless of the preservation strategy More work needs to be done to identify the particular elements necessary for particular preservation strategies

66 Metadata Wrap up New tools for new resources Metadata schema = rules Encoding schema = mark up and storage

67 Descriptive Metadata Use an established metadata schema Create a project style guide to facilitate standardized, high quality creation Store in content management software or database to provide web access Document the database design and map fields to DC (or other schema) within the documentation Encode and back up using XML, if technically feasible

68 Technical and Structural Use TIFF Document the scanning software used, as TIFF has many different flavors Use as much of the NISO draft standard as possible – watch for implementation developments, or… Use descriptive schema to collect technical information Structural metadata (METS) to manage numerous, complex digital objects, or… Documented file naming and structures

69 Planning Plan for the costs associated with good metadata Creation and research Technical resources (staff, hardware, software, backups) Get a team of appropriate people together Identify goals, elements, and research appropriate schema and encoding Style Guide for descriptive metadata Create the highest quality, most thorough metadata possible in your situation Document mappings

70 Some Conclusions Metadata is a work in progress at both the community level and the project level Use standards Technical metadata will be easier to implement in time Structural metadata is vital for large projects with complex digital objects Preservation metadata isn't standardized yet

