Oral History, METS and Fedora: Building a Standards-Compliant Audio Preservation Infrastructure.

Outcomes of Columbia University Libraries’ Mellon-funded Audio Preservation Project
Stephen Davis, Director, Libraries Digital Program Division
Janet Gertz, Director, Preservation & Digital Conversion Division

What we’ll cover
Janet
–Project background
–Identifying and describing content and versions
–Physical organization vs. intellectual organization
Stephen
–Metadata
–Digital asset management, preservation, access
–Conclusions

The Problem
A Mellon-funded survey found 35,000 pieces of unique analog audio aging rapidly in Columbia special collections

Audio preservation standards as of 2008
Standards for digitization of audio well established
–96 kHz, 24 bit, Broadcast Wave format
Standards for structural, technical, and preservation metadata still evolving
–No clear model for METS for digitized audio
–Audio Engineering Society draft standard
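As a rough illustration of what the 96 kHz, 24 bit capture specification implies for storage, the following sketch computes the uncompressed PCM payload for one hour of audio. The channel counts are assumptions (the slides do not say whether files were mono or stereo), and Broadcast Wave header and metadata chunks are ignored.

```python
def pcm_bytes_per_hour(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed PCM payload for one hour of audio, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 3600


# Channel counts below are assumptions for illustration only.
mono = pcm_bytes_per_hour(96_000, 24, 1)
stereo = pcm_bytes_per_hour(96_000, 24, 2)
print(f"mono:   {mono / 1e9:.2f} GB per hour")
print(f"stereo: {stereo / 1e9:.2f} GB per hour")
```

At these rates an hour of sound occupies on the order of one to two gigabytes per file, which is why a project measured in thousands of hours ends up in the terabyte range.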

AES-X098B
Superseded by:
–AES f (2011) AES standard for audio metadata - Audio object structures for preservation and restoration
–AES f (2011) AES standard for audio metadata - Core audio metadata

Purpose of Columbia’s project
Build a sustainable program for audio preservation at Columbia
Reduce need for time-consuming custom metadata and ingest work by Libraries Digital Program Division staff
Improve efficiency and consistency

Programmatic goals
Aim for same quality product as achieved by Harvard & Indiana in Sound Directions
But: digitization and metadata creation by external vendors
Establish CUL infrastructure
–Quality control procedures
–Metadata requirements
–Ingest into Fedora

Preservation priorities
[Figure: matrix plotting intellectual value (high to low) against risk of loss (high to low)]

Oral History collections
Our highest preservation priority
Unique recordings held only at Columbia
More than 8,000 interviews since 1948
Over 15,000 physical objects
Strong demand for access to the sound
Many in poor condition

Project team
Preservation & Digital Conversion Division
–Project management, digitization, quality control
Libraries Digital Program Division
–R&D, METS, other metadata, Fedora ingest
Columbia Center for Oral History
–Selection, preparation, physical handling
Bibliographic Control Division
–Descriptive metadata, MARC records

Preservation results
555 interviews preserved
–1,346 original audio objects
–2,100+ hours of sound
1,841 digital audio files created
555 MARC records created
555 sets of METS records created

Infrastructure results
System for METS records to describe
–Files that represent the original objects
–Files that represent the intellectual objects
Incorporation of draft Audio Engineering Society metadata into METS records
Procedures for ingest into Fedora

Project challenges
Describing versions and formats
Identifying the content
Coping with the disconnect between physical organization and intellectual organization

Oral histories are complicated
Original audio recording on a series of tapes or cassettes
Digitized audio: arranged to put all parts in chronological sequence
Transcript: edited to suit the interviewee; doesn’t perfectly match the original audio
Digitized transcript

MARC records
One record for analog versions
–Paper transcript
–Audio tapes, cassettes, etc.
One record for digital versions
–Digitized or born-digital audio
–Transcript in Word or other format

Identifying the content
All we know is what someone has written on the container

Other information sources
Transcripts
Card file of interviewees
Paper files of correspondence with interviewees and interviewers
Staff memories
Listening to the audio after digitization

If content identification is inaccurate, projects are more difficult and more expensive
Quality control must be slow and cover 100% of the material
Metadata requires significant revisions
Vendor has to make many changes
Version control is essential

Disconnect between physical organization and intellectual organization

Oral histories are complicated
Session: basic unit of an oral history
Single recorded sitting of ca hours
Sessions can number a few or more than 20 in one oral history
Recorded over a period ranging from days to years

Mind / body disconnect
One session = one tape
One session = several tapes
One tape = one session
One tape = several sessions from one oral history
One tape = sessions from several oral histories
Several tapes = several sessions from several oral histories

Preservation practice: make a copy that accurately represents the original object
Patron needs: a coherent sequence of files that contains all of one oral history and nothing else

Solution
Preservation master file
–Accurately captured from the physical object
–96 kHz, 24 bit, Broadcast Wave format
Rendered file
–Concatenates all parts of an interview regardless of which master files they’re on
–96 kHz, 24 bit, Broadcast Wave format
ADL (Audio Decision List)
–Metadata that tracks which minutes from which master files make up the rendered file
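The relationship between master files, the rendered file, and the decision list can be sketched as a small data structure. This is only an illustration of the idea, not the AES31-3 ADL syntax and not the project's actual tooling; the `Segment` class, the helper functions, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of a preservation master used in the rendered file."""
    master_file: str  # hypothetical master file name
    start_s: float    # start offset within the master, in seconds
    end_s: float      # end offset within the master, in seconds

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def rendered_timeline(segments):
    """Return (rendered_start, rendered_end, segment) tuples in playback order."""
    timeline, cursor = [], 0.0
    for seg in segments:
        if seg.end_s <= seg.start_s:
            raise ValueError(f"empty or reversed segment: {seg}")
        timeline.append((cursor, cursor + seg.duration_s, seg))
        cursor += seg.duration_s
    return timeline


def locate(timeline, rendered_s):
    """Map a time in the rendered file back to (master file, offset in master)."""
    for start, end, seg in timeline:
        if start <= rendered_s < end:
            return seg.master_file, seg.start_s + (rendered_s - start)
    raise ValueError("time is outside the rendered file")


# Hypothetical interview: fills tape 1, then continues partway into tape 2.
interview = [
    Segment("tape1_master.wav", 0.0, 1800.0),
    Segment("tape2_master.wav", 600.0, 2400.0),
]
timeline = rendered_timeline(interview)
print(locate(timeline, 2000.0))
```

Because the list records only offsets into untouched masters, the rendered file can be regenerated or corrected later without going back to the physical tapes.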

Over to Stephen ….

Metadata
MARC, AES, MODS, Dublin Core, PREMIS
ADL (AES Audio Decision List)
METS (Metadata Encoding and Transmission Standard)
RDF (Resource Description Framework)
ORE (OAI Object Reuse and Exchange)

ADL (Audio Decision List)
ADL specified in AES 31-3 standard
Records edit decisions
Designed to be imported into audio editing software to recreate those decisions
Imperfectly supported by commercial software platforms
Migratable, human readable (sort of)

METS
Relate master and rendered files to each other and to the ADL
Include AES draft metadata for
–Technical details of physical object
–Technical details of digital object
–Technical details of capture process
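How a METS record can tie the three kinds of file together might be sketched as follows. This is a minimal illustration using standard METS elements (`fileSec`, `fileGrp`, `file`, `FLocat`, `structMap`, `div`, `fptr`); it is not Columbia's METS profile, it omits the administrative metadata sections where the AES technical metadata would go, and the `USE` and `TYPE` values, IDs, and file names are assumptions.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS}}}{tag}"


def build_mets(masters, rendered, adl):
    """Build a skeletal METS tree linking masters, rendered file, and ADL."""
    root = ET.Element(q("mets"))
    file_sec = ET.SubElement(root, q("fileSec"))
    ids = {}
    # One file group per role; the USE values are illustrative, not a profile.
    for use, names in (("master", masters), ("rendered", [rendered]), ("adl", [adl])):
        grp = ET.SubElement(file_sec, q("fileGrp"), {"USE": use})
        for i, name in enumerate(names, start=1):
            file_id = f"{use}-{i}"
            f = ET.SubElement(grp, q("file"), {"ID": file_id})
            ET.SubElement(f, q("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK}}}href": name})
            ids.setdefault(use, []).append(file_id)
    # The structural map groups every file under one intellectual object.
    struct = ET.SubElement(root, q("structMap"), {"TYPE": "logical"})
    interview = ET.SubElement(struct, q("div"), {"TYPE": "interview"})
    for use in ("rendered", "adl", "master"):
        div = ET.SubElement(interview, q("div"), {"TYPE": use})
        for file_id in ids[use]:
            ET.SubElement(div, q("fptr"), {"FILEID": file_id})
    return root


root = build_mets(["tape1_master.wav", "tape2_master.wav"], "interview.wav", "interview.adl")
print(ET.tostring(root, encoding="unicode"))
```

The file section inventories the files by role, while the structural map expresses that they all belong to one interview, which is the relationship the slide describes.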

RDF (Resource Description Framework) Version of METS for Complete Interview

RDF Graph of Single Set of Interviews (Visual)

Oral History Interviews Originally Targeted for Project [beginning of list]

Content Displayed From Columbia’s Fedora/Blacklight-based Staff Collection Viewer

Frank Capra (1960), Transcript vs. Audio
(Noticed when preparing this presentation)
Audio version: “and treating people as numbers” (!)

Issues & Considerations
What did we achieve?
Were there other, simpler ways to do it?
Do we really need all that metadata?

Specific Outcomes #1
The content is reliably preserved for the future
The content is preserved as an original “content artifact”
The content has been reorganized to provide a coherent “interview narrative”
The process for rendering access files is replicable and correctable

Specific Outcomes #2
Technical ‘provenance’ has been documented
Content can be validated as “authentic”
Content is structured so that it can be managed bibliographically
Content (10 TB) has been ingested into our long-term preservation repository (Fedora)
Content can be made publicly accessible to the extent permitted by author agreements

Strategic Outcomes #1
Well-developed structural model for digitization of oral history audio
Fully developed support by a vendor who can now produce standards-based output
Solid procedures for local cataloging of analog and digital oral histories

Strategic Outcomes #2
Built out tools and workflows for ingesting complex content into Fedora repository
Developed new features for Fedora/Blacklight Staff Collection Viewer to accommodate complex content
Columbia can share approach with other institutions starting similar projects

Were there other, simpler ways to do it?
Preserve only “content artifacts”
–Leave providing good access to the future
Preserve and provide access only to “interview narrative”
–Do not store, describe or map master files

Right Decisions for This Project? Our choices for this project did in fact fully meet our goals for preservation and access. It was also a good choice to build out our existing Fedora / Blacklight and metadata environment rather than develop new, ad hoc approaches for this project.

Did we really need all that metadata? Only time will tell ….

Questions?