Digital Preservation Tools for Repository Managers A practical course in five parts Revision with Steve Hitchcock By Chris Blakeley A rapid recap of tools.

Presentation transcript:

Digital Preservation Tools for Repository Managers A practical course in five parts Revision with Steve Hitchcock By Chris Blakeley A rapid recap of tools from the KeepIt course: what they do, what they look like, what we did with them

Tools Module 1
- The Data Asset Framework (DAF), Sarah Jones, University of Glasgow, and Harry Gibbs, University of Southampton
- The AIDA toolkit: Assessing Institutional Digital Assets, Ed Pinsent, University of London Computer Centre

… because good research needs good data DAF at KeepIt Digital preservation tools for repositories, 19/01/10, Southampton
Themes addressed in DAF surveys
- Data: type / format, volume, description, creator, funder
- Creation: policy, naming, versioning, metadata & documentation
- Management: storage, backup, roles and responsibilities, planning
- Access: restrictions, rights, security, frequency, ease of retrieval, publish
- Sharing: collaborators, requirements to share, methods, concerns
- Preservation: selection / retention, repository services, obsolescence
- Gaps / needs: services, advice, support, infrastructure

The methodology

How would you scope:
1) the range of data being created at your institution?
2) user expectations / requirements on the repository to help manage and preserve those data?
What would you want to find out?
- What would your key questions be?
How would you go about collecting information?
How would you ensure participation?

Relevance to this Course
AIDA can…
- Measure your ability to manage digital content effectively
- Show how good you are at sustaining continued access
- Be directly relevant to managing a repository (access, sharing, and usage)
- Help you find out where you are
- Help you decide what to do next

Exercise
- Divide into four teams
- One element from each leg, relating to one activity
- Agree on the scope of what you will assess: work on a single institution (real or imaginary)
- Assess the capacity for this activity
Expected results:
- A score for the element in each leg and at each level (6 scores in all)
- Explain why you arrived at that decision
- Roles / job titles of people consulted
- Outline evidentiary sources that might help

Tools Module 2
- Keeping Research Data Safe (KRDS), Costs, Policy, and Benefits in Long-term Digital Preservation, Neil Beagrie, Charles Beagrie Ltd consultancy
- LIFE 3: Predicting Long Term Preservation Costs, Brian Hole, The British Library

What was Produced?
A cost framework consisting of:
- Activity model in 3 parts: pre-archive, archive, support services
- Key cost variables divided into economic adjustments and service adjustments
- Resources template for Transparent Costing (TRAC)
4 detailed case studies (ADS, Cambridge, KCL, Southampton)
Data from other services.

Benefits Framework: KRDS2 Benefits Taxonomy
- Dimension 1 (Type of Outcome): Direct | Indirect (costs avoided)
- Dimension 2 (When): Near-term Benefits | Long-term Benefits
- Dimension 3 (Who): Private | Public

Group Exercise
Agree a spokesperson and “recorder”
Using KRDS2 Benefits Taxonomy:
- Q1 Identify which benefits can be costed?
- Q2 Select 3 Key benefits (include costed and uncosted)
- Q3 Identify the information you might need for measuring them
Report back at !

LIFE 3: Estimating preservation costs
The LIFE 3 Project
Aim: To develop the ability to estimate preservation costs across the digital lifecycle
The Project is developing:
- A series of costing models for each stage and element of the digital lifecycle
- An easy to use costing tool
- Support to enable easy input of data
- Integration to facilitate use of the results
Diagram labels: Content Profile, Organisational Profile, Context, Cost Estimation Tool, Predicted Lifecycle Cost

LIFE 3 costing tool outputs: estimated costs
Lifecycle stages and their lifecycle elements:
- Creation or Purchase: ....
- Acquisition: Selection, Submission Agreement, IPR & Licensing, Ordering & Invoicing, Obtaining, Check-in
- Ingest: Quality Assurance, Metadata, Deposit, Holdings Update, Reference Linking
- Bit-stream Preservation: Repository Admin, Storage Provision, Refreshment, Backup, Inspection
- Content Preservation: Preservation Watch, Preservation Planning, Preservation Action, Re-ingest, Disposal
- Access: Access Provision, Access Control, User Support

Exercise
- Excel model
- The Content Profile
- Refining the calculations
Feedback
- Do you feel that this approach is sound?
- Have we included all relevant factors?
- Is the model suitable for the kind of content your repository deals with?
- Are we making correct assumptions, and is it clear what these are?
- How could we improve it?

Tools Module 3
- Significant characteristics, Stephen Grace and Gareth Knight, King’s College London
- PREMIS, Open Provenance Model

Preservation workflow
- Check: format identification, versioning; file validation; virus check; bit checking and checksum calculation. Tools e.g. DROID, JHOVE, FITS
- Analyse: preservation planning; characterisation (significant properties and technical characteristics, provenance, format, risk factors); risk analysis. Tools: Plato (Planets), PRONOM (TNA), P2 risk registry (KeepIt), INFORM (U Illinois)
- Action: migration, emulation, storage selection
- KB
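The "bit checking and checksum calculation" step in the workflow above can be illustrated with a minimal Python sketch. It assumes SHA-256 as the checksum algorithm and uses hypothetical function names (`file_checksum`, `verify_fixity`); it is not taken from DROID, JHOVE, FITS or EPrints, which have their own mechanisms.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest, algorithm="sha256"):
    """Return True if the file still matches the checksum recorded at an earlier check."""
    return file_checksum(path, algorithm) == recorded_digest.lower()
```

A repository would typically record the digest at ingest and call something like `verify_fixity` on a schedule; a `False` result signals that the bit-stream has changed and needs investigation.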

A group task on format risks
1. Choose two formats to compare (e.g. Word vs PDF, Word vs ODF, PDF vs XML, TIFF vs JPEG)
2. By working through the (surviving) list of format risks, select a winner (or a draw) between your chosen formats for each risk category (1 point for a win)
3. Total the scores to find an overall winning format
4. Suggest one reason why the winning format using this method may not be the one you would choose for your repository
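The scoring in steps 2 and 3 can be sketched in a few lines of Python. The function name `tally_format_contest` and the risk category labels are hypothetical placeholders, since the course's list of format risks is not reproduced here; the judgement of who wins each category still comes from the group.

```python
def tally_format_contest(format_a, format_b, winners_by_risk):
    """Total one point per risk category won; None marks a draw in that category.

    winners_by_risk maps a risk category label to the winning format name or None.
    Returns (totals, overall_winner), where overall_winner is None for a tie.
    """
    totals = {format_a: 0, format_b: 0}
    for risk, winner in winners_by_risk.items():
        if winner is None:
            continue  # a draw scores no points for either format
        if winner not in totals:
            raise ValueError(f"{winner!r} is not one of the compared formats ({risk})")
        totals[winner] += 1
    if totals[format_a] == totals[format_b]:
        return totals, None
    return totals, max(totals, key=totals.get)


# Hypothetical judgements for illustration only.
judgements = {
    "risk category 1": "TIFF",
    "risk category 2": None,
    "risk category 3": "TIFF",
    "risk category 4": "JPEG",
}
totals, overall = tally_format_contest("TIFF", "JPEG", judgements)
print(totals, overall)
```

Because every category carries the same single point, the method ignores how much each risk matters to a particular repository, which is one way to approach step 4.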

Determine expected behaviours
What activities would a user (any type of stakeholder) perform when using an email?
Draw upon the list of property descriptions produced in the previous step, formal standards and specifications, or other information sources.
Task 2: Identify the types of action that a user would be able to perform using the email (groups, 15 mins).
- E.g. establish the name of the person who sent the email
- E.g. may want to confirm that the email originated from the stated source

Exercise overview
- Analyse the content of an email
- Analyse structure of message
- Determine purpose that each technical property performs
- Consider how the email will be used by stakeholders
- Identify set of expected behaviours
- Classify set of behaviours into functions for recording

JHOVE Demo

Define Sample Objects

Some revision from KeepIt Module 3
Preservation workflow
- We recognised that we have digital objects with formats and other characteristics we need to identify and record. These can change over time, or may need to be changed pre-emptively, depending on a risk assessment, using a preservation action. Risk is subjective.
Significant properties
- We considered which characteristics might be significant using the function-behaviour-structure (FBS) framework, and classifying the functions of formatted emails
- We recognised that assessment of behaviour, and so of significance, can vary according to the viewpoint of the stakeholder, e.g. creator, user, archivist
Documentation
- We looked at two means to document these characteristics, and the changes over time:
  1. Broad and established (PREMIS)
  2. Focussed, and work-in-progress (Open Provenance Model)
Provenance in action: transmission and recording
- Through a simple game we learned that if we don’t recognise the necessary properties at the outset, and maintain a record through all stages of transmission, the information at the end of the chain will likely not be the same as the information we started with

Tools Module 4
- EPrints preservation apps, including the storage controller, Dave Tarrant and Adam Field, University of Southampton
- Plato, preservation planning tool from the Planets project, Andreas Rauber and Hannes Kulovits, TU Wien

Hybrid Storage Policies

EPrints Storage Manager

Preservation - Analyse
EPrints File Classification + Risk Analysis
Risk Analysis in EPrints

Preservation - Action
Mock-up transformation interface
Diagram labels: Risk Analysis in EPrints; Migration?; Transformation?; Tool; Preservation Level; Migration Tools (PPT -> PPTX, PPT -> PDF)

Viewing high-risk objects

Exercise: EPrints Adding ‘at risk’ image collection

Preservation Planning

Preservation Planning with Plato
Plato:
- Assists in analyzing the collection: profiling, analysis of sample objects via PRONOM and other services
- Allows creation of objective tree: within the application or via import of mind maps
- Allows the selection of preservation action tools

Preservation Planning with Plato
Plato:
- Runs experiments and documents results
- Allows definition of transformation rules, weightings
- Performs evaluation, sensitivity analysis
- Provides recommendation (ranks solutions)
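How weightings turn into a ranked recommendation can be illustrated with a plain weighted sum. This is a simplified sketch and not necessarily how Plato aggregates values: the function name `rank_alternatives`, the criteria, the candidate tools and the 0 to 5 utility scale are all assumptions made for the example.

```python
def rank_alternatives(weights, utilities):
    """Rank candidate preservation actions by weighted sum of their utility values.

    weights:   criterion name -> relative weight (normalised here, so any positive scale works)
    utilities: alternative name -> {criterion name -> utility value on a common scale}
    Returns a list of (alternative, score) pairs, best first.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    scores = {}
    for alternative, values in utilities.items():
        # A missing criterion raises KeyError: every alternative must be fully evaluated.
        weighted = sum(weights[criterion] * values[criterion] for criterion in weights)
        scores[alternative] = weighted / total_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical criteria, tools and values, for illustration only.
weights = {"image quality": 0.5, "file size": 0.2, "tool stability": 0.3}
utilities = {
    "hypothetical tool A": {"image quality": 5, "file size": 2, "tool stability": 4},
    "hypothetical tool B": {"image quality": 4, "file size": 4, "tool stability": 4},
    "hypothetical tool C": {"image quality": 5, "file size": 5, "tool stability": 1},
}
for alternative, score in rank_alternatives(weights, utilities):
    print(f"{alternative}: {score:.2f}")
```

Re-running the ranking with slightly different weights is a crude way to see the point of a sensitivity analysis: if small changes reorder the alternatives, the recommendation is fragile.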

Exercise Time! The Scenario
- National library
- Scanned yearbooks archive
- GIF images
- The purpose of this plan is to find a strategy on how to preserve this collection for the future, i.e. choose a tool to handle our collection with.
- The tool must be compatible with our existing hardware and software infrastructure, to install it within our server and network environment.
- The files haven't been touched for several years now and no detailed description exists. However, we have to ensure their accessibility for the next years.
- Re-scanning is not an option because of costs and some pages from the original newspapers do not exist anymore.

Exercise: EPrints Adding ‘at risk’ image collection

Exercise: Plato-EPrints Plan-migrate-review

Tools Module 5
- TRAC, Trusted Repository Audit and Certification: criteria and checklist
- DRAMBORA, Digital Repository Audit Method Based On Risk Assessment, Martin Donnelly, Digital Curation Centre, University of Edinburgh

… because good research needs good data DRAMBORA and DAF, EDINA, 27th October
Trustworthy Repositories Audit & Certification (TRAC) Criteria and Checklist
- RLG/NARA assembled an International Task Force to address the issue of repository certification
- TRAC is a set of criteria applicable to a range of digital repositories and archives, from academic institutional preservation repositories to large data archives and from national libraries to third-party digital archiving services
- Provides tools for the audit, assessment, and potential certification of digital repositories
- Establishes the documentation requirements for audit
- Delineates a process for certification
- Establishes appropriate methodologies for determining the soundness and sustainability of digital repositories

TRAC Criteria Checklist Within TRAC, there are 84 individual criteria Only 82 criteria to go!

To certify or not to certify? That is the question
1. Take a spreadsheet with all 84 TRAC criteria.
2. Select one.
3. Decide whether you could certify your repository for this, based on where your repository is now or where you think it might be after participating in this course.
Image credits: Cayusa, fabiux

… because good research needs good data KeepIt #5: University of Northampton, 30 March
DRAMBORA Method
- Discrete phases of (self-)assessment, reflecting the realities of audit
- Preservation is fundamentally a risk management process:
  - Define Scope
  - Document Context and Classifiers
  - Formalise Organisation
  - Identify and Assess Risks
- Builds audit into internal repository management procedures

Repository Administration

Part I: Identify a risk (30 minutes)
Each group should identify one risk (based on your own experiences wherever possible), and complete the DRAMBORA worksheet.
Groups should complete:
- name and description of the risk;
- example manifestations of the risk;
- nature of the risk;
- risk owner(s);
- stakeholders who would be affected;
- if possible, relationships with other risks.

Part II: Mitigate the risk (30 minutes)
Now identify what steps your archive might take to manage and mitigate the identified risk over time…
Each group should complete:
- Risk management strategy/-ies;
- Risk management activities;
- Risk management activity owner(s).
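The worksheet items from Parts I and II can be captured as a simple record, which makes it easy to see what a group has still to fill in. The sketch below is an assumption-laden illustration: the class name `RiskWorksheet`, its field names and the `missing_fields` helper are hypothetical and do not reproduce the official DRAMBORA worksheet or toolkit schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RiskWorksheet:
    """One risk, with the items the two exercise parts ask each group to complete."""
    # Part I: identify a risk
    name: str = ""
    description: str = ""
    example_manifestations: list = field(default_factory=list)
    nature: str = ""
    owners: list = field(default_factory=list)
    stakeholders: list = field(default_factory=list)
    related_risks: list = field(default_factory=list)  # "if possible", so optional
    # Part II: mitigate the risk
    management_strategies: list = field(default_factory=list)
    management_activities: list = field(default_factory=list)
    activity_owners: list = field(default_factory=list)


OPTIONAL_FIELDS = {"related_risks"}


def missing_fields(worksheet):
    """Return the names of required worksheet items that are still empty."""
    return [
        f.name
        for f in fields(worksheet)
        if f.name not in OPTIONAL_FIELDS and not getattr(worksheet, f.name)
    ]


# Hypothetical example entry after Part I only.
draft = RiskWorksheet(
    name="Loss of key staff",
    description="Repository knowledge is held by one person",
    example_manifestations=["Manager leaves without handover"],
    nature="Organisational",
    owners=["Head of library services"],
    stakeholders=["Depositors", "Repository users"],
)
print(missing_fields(draft))
```

Keeping Part II fields on the same record means every identified risk carries its own mitigation plan and named owner, which is the link the two-part exercise is designed to make.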