Where are we with Digital Preservation? Andrew Waugh Public Record Office Victoria.

1 Where are we with Digital Preservation? Andrew Waugh Public Record Office Victoria

2 Where are we? It is not the end. It may not even be the beginning of the end. But it is undoubtedly the end of the beginning –Winston Churchill This talk will cover –Consensus views on digital preservation –Open questions and future challenges

3 What this presentation will cover Understanding (building systems) Storage (preserving the bit strings) Access (preserving the meaning) Metadata (preserving the context & authenticity) Transfer (overcoming system senescence)

4 Understanding Communication requires shared terminology and concepts Open Archival Information System (OAIS) reference model (ISO 14721:2003) –High level terminology very widely used, but few use the detail in the model –Does not cover preservation –Pre web and detail does not reflect actual implementations –Currently under review

5 Trusted digital repositories How can you be sure if an organisation (& its system) is up to holding your digital objects? Trustworthy Repositories Audit and Certification – CRL/NARA (2007) –Administrative focus rather than technical –high level (cannot be tested) –Based on OAIS, basis for audit checklists

6 Audit checklists Provide tests to see if a repository can be trusted –DRAMBORA: DCC/DPE (2007) Risk based, self certification

7 Public domain digital repositories Public domain digital repository code –D-Space –Fedora Both came out of the academic community and primarily support institutional repositories

8 Storage – preserving the bit string Fundamental task of digital preservation is ensuring that the bits that make up the digital objects are preserved “Solved” problem – large scale data repositories have existed for decades and there is lots of operational experience Archival twist: actively monitor health of stored objects using hashes
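The "archival twist" on this slide, monitoring stored objects with hashes, can be sketched as follows. This is a minimal illustration and not the method of any particular repository: the function names, the choice of SHA-256 and the in-memory manifest are all assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Record a digest for every file under root (done once, at ingest)."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: Dict[str, str]) -> Dict[str, str]:
    """Re-hash the stored files and report those that are missing or changed."""
    problems = {}
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems[name] = "missing"
        elif sha256_of(target) != expected:
            problems[name] = "changed"
    return problems
```

A repository would run something like audit() on a schedule and repair any reported object from another copy. In practice the manifest itself must be stored somewhere at least as safe as the objects it describes.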

9 Storage - future challenges Reducing storage cost (and chance for error) –Swedish National Archives estimated in 2005 between 4 and 8 Euro per digitised page, mostly in system and support costs Reducing risks –Administrator risk vs packaged risk Ideal storage system –Packaged (i.e. built-in administration such as the Centera) –Open so that you can trust it and replace components CLOCKSS –Uses redundant copies at participating institutions to ensure preservation (LOCKSS)
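The idea of using redundant copies to detect damage can be illustrated with a simple majority vote over digests. This is only a sketch of the general principle and is not the actual LOCKSS polling protocol; the function name and the site labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def find_damaged_copies(digests: Dict[str, str]) -> Tuple[Optional[str], List[str]]:
    """Compare digests reported by each site holding a copy.

    Returns (consensus digest, sites whose copy disagrees with it).
    If no digest is held by a strict majority, returns (None, all sites),
    meaning that no copy can be trusted on the basis of the vote alone.
    """
    if not digests:
        return None, []
    counts = Counter(digests.values())
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(digests):
        return None, sorted(digests)
    return winner, sorted(site for site, d in digests.items() if d != winner)
```

For example, with three sites of which two report the same digest, the third is flagged as damaged and can be repaired from either of the others.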

10 Access – preserving the meaning What do you do when you no longer have an application to open the data files? Current approach is either –Do nothing now with eventual migration –Normalisation upon accession Future approach might be emulation

11 Migration Save what you capture now and convert to new formats as required –Web harvesting (studies show web sites are mostly safe formats – HTML, XML, jpeg, gif, etc) –Formats (and software) proving surprisingly resilient

12 Normalisation Convert upon accession to small number of long term preservation formats –E.g. PDF/A (PROV), ODF (NAA) –Immediate cost upon accession, but expected lower long term management cost –Criteria for good LTPF (Library of Congress)

13 Challenges What is it? Tools to determine file formats –PRONOM – repository of format descriptions and DROID (format classifier) –JHOVE (Harvard) classifier and simple validation How accurate is the conversion? Is it a valid file according to the standard?
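The "What is it?" question is usually answered by looking at the leading bytes of a file rather than trusting its extension. The sketch below shows the principle with a handful of well-known signatures; it is a toy table written for this example and is not how DROID or JHOVE are implemented, nor is it drawn from the PRONOM registry.

```python
# (leading bytes, format name) pairs; a real registry holds far more
# signatures, including ones at offsets other than the start of the file.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(data: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"
```

Identification of this kind says nothing about validity: a file can start with the right bytes and still break the format standard, which is why the slide lists validation as a separate question.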

14 Metadata is better data Metadata is information about the bit string –What it is (semantic) –What it is (technical) –How it relates to other digital objects –What is its history? –How is it to be managed? Unfortunately, lots and lots of large metadata standards
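The five kinds of metadata listed on this slide can be pictured as one record per digital object. The sketch below is purely illustrative: the class and field names are invented for this example and do not come from PREMIS, ISO 23081 or any other standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    when: str  # date of the event, e.g. "2008-05-01"
    what: str  # what happened, e.g. "converted to PDF/A"


@dataclass
class ObjectMetadata:
    title: str        # what it is (semantic)
    media_type: str   # what it is (technical)
    related_ids: List[str] = field(default_factory=list)  # relations to other objects
    history: List[Event] = field(default_factory=list)    # what has happened to it
    retention: str = "permanent"                           # how it is to be managed

    def record(self, when: str, what: str) -> None:
        """Append an event to the object's history."""
        self.history.append(Event(when, what))
```

Even this small record hints at the cost problem raised later in the talk: every field has to be captured by someone or something at the time the object is created or accessioned.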

15 Metadata standards For an excellent summary of metadata standards see the Metadata chapter in the DCC Digital Curation Manual – manual/chapters/metadata/metadata.pdf

16 Digital preservation metadata Data Dictionary for Preservation Metadata (PREMIS) –little descriptive information and nothing format specific ISO 23081 (Metadata for records) National Archives Australia Recordkeeping Metadata Standard

17 Future challenges Too many competing standards –Which do I implement? Too many elements –Increases cost of standard development and software implementation Few elements ever used –Too expensive and too hard to capture metadata

18 Transfer Overcoming system senescence Digital objects have a much longer life than the systems that hold them –Move objects to digital repositories where they can be properly managed –Move them from one digital repository to its replacement Storage is so cheap that holders may be tempted to keep digital objects (until it is too late)

19 Future challenges Current systems are not designed around the assumption that digital objects must be relocated –AIHT, Conceptual Issues from Practical Tests, Clay Shirky, D-Lib Magazine, Vol 11 No 12, December 2005 ADRI-UN/CEFACT work on a standard to transfer custody of digital records

20 More information If I have whetted your appetite... –PADI Annotated bibliography of digital preservation –D-Lib Magazine

21 Final thoughts We know about compasses, and we have some charts, but there are a lot of rocks out there… We are a long way from satellite navigation What about small/medium archives… personal archives? Are photographs better digital or as negatives?
