The world's libraries. Connected. Preservation Health Check: work in progress. Workshop: PREMIS Implementation Fair 2013, IPRES 2013, Lisbon, September 5th.

Presentation transcript:

Preservation Health Check: work in progress
Workshop: PREMIS Implementation Fair 2013, IPRES 2013, Lisbon, September 5th, 2013
Titia van der Werf, Senior Program Officer, OCLC

What is the Preservation Health Check Pilot?
A joint initiative of:
- Open Planets Foundation (OPF): a community hub for digital preservation whose main goal is to jointly manage and improve tools and research outcomes for practical use.
- OCLC Research: a community resource for shared R&D that addresses challenges facing libraries and archives in a rapidly changing information technology environment.

The proposition
As part of their preservation management task, repository managers need to be able to monitor the preservation status of the content of their repository. We are looking at regular, routine check-ups that can support this monitoring task.
- Monitoring should be made easy (automatically generated reports or a dashboard).
- Monitoring should be based on objective data generated by the repository (e.g. preservation metadata).

The analogy

The research question
If a Preservation Health Check (PHC) is a monitoring activity to be performed on a repository with digital content:
1. What are empirical indicators (i.e. measures) for PHCs?
2. Are preservation metadata recorded by repositories useful as health indicators for PHCs?
Monitoring is about tracking change: intentional and unintentional change.

The analogy
Analogy with a car dashboard, involving sensors, thresholds, and triggers.

The research methodology
The design of the Preservation Health Check is based on both top-down and bottom-up approaches:
- Top-down: work with existing models (PREMIS and SPOT) that define properties of successful preservation and indicators (threats) of what could theoretically go wrong.
- Bottom-up: work with real metadata and assess their applicability for sensing what needs attention and for triggering preventive actions.

The research methodology
Goal: to develop an implementable logic (or protocol) to support PHCs, and to test this logic against the store of preservation metadata maintained by an operational preservation repository.

The pilot site
The BnF runs a fully operational trusted digital repository (SPAR) and volunteered to become a PHC pilot site.
The empirical data consists of:
1. A sample (200 GB) of the PREMIS data (AIP-METS files), covering the following collections:
   - Gallica: digitised periodicals, monographs, still images and manuscripts (TIFF + OCR files)
   - Legal deposit Web harvests (WARC files)
   - 3rd-party collection (Centre Pompidou)

The pilot site
The empirical data consists of (continued):
2. All the Reference Information packages in SPAR that contain reference information, code, or specifications of (external) tools used during INGEST (e.g. JHOVE) and of the formats ingested.
3. Per collection: SLAs defining policy agreements with SIP suppliers concerning the preservation regime to be applied at the INGEST and ARCHIVAL STORAGE stages.

PHC-pilot stage 1: Top-down approach
Mapping PREMIS semantic units to SPOT properties: which semantic units address each of the 6 basic properties of successful preservation defined in SPOT?
Example:
SPOT property: Persistence
Associated threats: e.g. storage medium deterioration
PREMIS semantic units:
  storageMedium = magneticTape
  eventType = mediaRefreshment
  eventDateTime =
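A mapping of this kind could be held in a simple lookup structure. The sketch below is only an illustration: the dictionary layout and the function name `units_for_property` are assumptions, and only the Persistence example from the slide is filled in.

```python
# Hypothetical lookup table: SPOT property -> threats and related PREMIS semantic units.
# Only the Persistence example from the slide is filled in; the layout is an assumption.
SPOT_TO_PREMIS = {
    "Persistence": {
        "threats": ["storage medium deterioration"],
        "semantic_units": ["storageMedium", "eventType", "eventDateTime"],
    },
}


def units_for_property(spot_property):
    """Return the PREMIS semantic units mapped to a SPOT property (empty list if unmapped)."""
    entry = SPOT_TO_PREMIS.get(spot_property)
    return list(entry["semantic_units"]) if entry else []


print(units_for_property("Persistence"))
```

A property with no entry in the table returns an empty list, which is one way a coverage gap could surface during the top-down mapping.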

Mapping PREMIS onto SPOT
[Diagram: the PREMIS Data Model (Intellectual Entities, Objects, Agents, Rights, Events) and its semantic units, mapped onto the SPOT Model properties (Availability, Identity, Persistence, Renderability, Understandability, Authenticity) and their threats.]


Findings: coverage
SPOT property        # of PREMIS semantic units*
Availability         16
Identity             19
Persistence          10
Renderability        15
Understandability    14
Authenticity         16
*Container level only; Agents, Events, Rights considered one semantic unit

Findings: coverage
What does the question "Is this SPOT property well covered by PREMIS?" mean? A more meaningful question: do the PREMIS semantic units address the threats associated with a SPOT property?
Example of a gap between SPOT and PREMIS:
SPOT property: Understandability
We found no PREMIS semantic units that provide information that aids in the understanding or interpretation of the content of the archived digital object.

Findings: preservation policies
A repository usually implements a large number of explicit and implicit policy decisions; however, PREMIS currently makes few provisions for recording these in preservation metadata (the semantic unit preservationLevel being a notable exception). The issue is exacerbated if numerous policies are applied at the collection level rather than repository-wide.

Findings: autonomy of the repository
The PREMIS Data Dictionary seems to be designed around an implicit assumption that the repository is a self-contained system, and that all digital preservation processes are controlled in-house.
Example:
SPOT property: Identity
"Recommended practice is for repositories to use identifiers automatically created by the repository as the primary identifier in order to ensure that identifiers are unique and usable by the repository. Externally assigned identifiers can be used as secondary identifiers in order to link an object to information held outside the repository." [PREMIS DD 2.2 p.28]

Findings: explicit encoding
PREMIS conformance does not require explicit encoding of metadata if the information applies to all objects in the repository. This impedes the provision of automated PHC services (by a third-party provider), because efficient provision of such a service would likely require the information in semantic units to be explicitly recorded and implemented in a standard way.
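To illustrate why explicit encoding matters for automation, the sketch below checks whether a record explicitly contains the semantic units a PHC service might rely on. Everything here is an assumption for illustration: the list of required units, the function name `missing_units`, the PREMIS 2.x namespace URI, and the sample fragment, which is made up and not schema-valid PREMIS.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for PREMIS 2.x XML; adjust for the schema version actually in use.
NS = {"premis": "info:lc/xmlschema/premis-v2"}

# Hypothetical list of semantic units a PHC service would need explicitly recorded.
REQUIRED_UNITS = ("storageMedium", "eventType", "eventDateTime")


def missing_units(premis_xml, required=REQUIRED_UNITS):
    """Return the required semantic units that do not appear anywhere in the record."""
    root = ET.fromstring(premis_xml)
    return [unit for unit in required if root.find(f".//premis:{unit}", NS) is None]


# Made-up fragment for illustration only; it records the storage medium but no events.
SAMPLE = """<premis:premis xmlns:premis="info:lc/xmlschema/premis-v2">
  <premis:object>
    <premis:storage><premis:storageMedium>magneticTape</premis:storageMedium></premis:storage>
  </premis:object>
</premis:premis>"""

print(missing_units(SAMPLE))
```

A repository that treats media refreshment as an implicit repository-wide fact would show up here as missing `eventType` and `eventDateTime`, so a third-party service would have nothing to compute on.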

Findings: assessment level
What is the entity we are monitoring during a PHC? Digital object? Collection? Repository?
We observe that the threat assessment level depends on the nature of the specific threat. Examples:
- Identity => repository-wide
- Renderability => collection of objects sharing the same HW/SW environment
The SPOT model does not explicitly specify this granularity of analysis for the properties and threats it covers.

Conclusion
Despite some gaps in both models used, there is indeed an opportunity to use PREMIS preservation metadata as an evidence base to support a threat assessment exercise based on the SPOT model. We will continue this work as a basis for the PHC design.

Next steps
- Choose a SPOT property that is well addressed by PREMIS (Persistence?).
- Develop a generalized logic that makes threat assessment statements.
Example of logic:
  Compute the elapsed time between the last media refreshment event and the current date.
  If (MTtoF - elapsed time) > critical period, return Green.
  If (MTtoF - elapsed time) < critical period, return Red.
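The example logic could be sketched as follows. This is a minimal illustration, not the pilot's implementation: the function name `persistence_status`, the use of days as the unit, and the sample values are assumptions, and MTtoF is read here as a mean-time-to-failure figure for the storage medium. The slide does not say what to return when the remaining time equals the critical period; this sketch treats that case as Red.

```python
from datetime import date


def persistence_status(last_refresh, today, mttf_days, critical_days):
    """Return 'Green' or 'Red' following the slide's rule.

    remaining = MTtoF - elapsed time since the last media refreshment event.
    The equality case is not defined on the slide; it is treated as 'Red'
    here (an assumption).
    """
    elapsed_days = (today - last_refresh).days
    remaining = mttf_days - elapsed_days
    return "Green" if remaining > critical_days else "Red"


# Illustrative values only, not taken from any real repository.
print(persistence_status(date(2012, 1, 1), date(2013, 9, 5),
                         mttf_days=3650, critical_days=365))
```

In practice `last_refresh` would come from the eventDateTime of the most recent mediaRefreshment event, which is why that event has to be recorded explicitly.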

Next steps
- Test the implementability of this logic on a set of real-world preservation metadata.
- Construct a decision-tree-based PHC dashboard.

The analogy
Wikipedia entry for PHC: Primary health care, often abbreviated as "PHC", has been defined as "essential health care based on practical, scientifically sound and socially acceptable methods and technology, made universally accessible to individuals and families in the community. It is through their full participation and at a cost that the community and the country can afford to maintain at every stage of their development in the spirit of self-reliance and self-determination".
Source: World Health Organization. Declaration of Alma-Ata. Adopted at the International Conference on Primary Health Care, Alma-Ata, USSR, 6–12 September 1978.

Q&A
Titia van der Werf