"Keeping alert: issues to know today for long-term digital preservation with repositories" Neil Beagrie Fedora Users Group Open Repositories Southampton.

1 "Keeping alert: issues to know today for long-term digital preservation with repositories" Neil Beagrie Fedora Users Group Open Repositories Southampton April 2008

2 Focus of this lecture
Research Data
JISC Research Data Digital Preservation Costs Study
Long-term costs and sustainability of Repositories

3 Trends

4 Computer Processing Power and Storage

5 Growth of Scientific Data and Data Curation
In the next 5 years e-Science will produce more data than has been collected in the whole of human history
[Figure: Data growth, Protein Data Bank (1972 to 07/2005)]

6 e-Research and preservation (UK Science and Innovation Investment Framework 2004 – 2014)

7 Information Infrastructure
2.23 The growing UK research base must have ready and efficient access to information of all kinds – such as experimental data sets, journals, theses, conference proceedings and patents….
2.24 It is clear that the research community needs access to information mechanisms which: systematically collect, preserve and make available digital information;….
2.25 The Government [via DTI] will therefore work with interested funders and stakeholders to consider the national e-infrastructure (hardware, networks, communications technology) necessary to deliver an effective system.

8 e-research data and repositories
EU Studies:
– Driver2
– E-SCIDR (e-science repositories)
Current UK Studies:
– Data Scientist careers/skills
– Data Audit Framework and institutional pilots
– UK Research Data (shared) Service Feasibility Study
– Costs for long-term preservation of research data

9 Keeping Research Data Safe JISC Research Data Digital Preservation Costs Study

10 Overview
Aim – investigate costs, develop model and recommendations
Project team – Me, Julia Chruszcz, Brian Lavoie (OCLC), Cambridge, KCL, Southampton
Method – detailed analysis of 2 cost models (LIFE & NASA CET) in combination with OAIS and TRAC; literature review; 12 interviews; 4 case studies
4-month study
Draft final report in peer review

11 What have we Produced?
A cost framework consisting of:
– Activity model in 3 parts: pre-archive, archive, support services
– Key cost variables divided into economic adjustments and service adjustments
– Resources template for TRAC
– Used in combination to generate cost/charging models
4 detailed case studies (ADS, Cambridge, KCL, Southampton)
Spreadsheet supplement
Data from other services

12 Some Tentative Findings

13 Findings
Institutional Repository (e-publications), annual recurrent costs:
– Staff: 1 FTE
– Equipment (capital depreciated over 3 years): £1,300 pa
Federated Institutional Repository (data), annual recurrent costs:
– Cambridge: staff 4 FTE; equipment (capital depreciated over 3 years) £58,764 pa
– KCL: staff 2.5 FTE; equipment (capital depreciated over 3 years) £27,546 pa
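
The equipment figures on this slide are annual charges for capital depreciated over 3 years. The slide does not say which depreciation method was used; the sketch below assumes simple straight-line depreciation, and the function names and the £3,900 purchase price are hypothetical, chosen only so that the result lines up with the £1,300 pa figure.

```python
def annual_equipment_charge(capital_cost: float, years: int = 3) -> float:
    """Straight-line depreciation (an assumption, not stated on the slide):
    spread the capital cost evenly over the depreciation period."""
    if years <= 0:
        raise ValueError("years must be positive")
    return capital_cost / years


def implied_capital(annual_charge: float, years: int = 3) -> float:
    """Invert the calculation: the capital outlay implied by an annual charge."""
    return annual_charge * years


# Hypothetical purchase price that would yield the slide's 1,300 pa figure.
charge = annual_equipment_charge(3900.0, years=3)
capital = implied_capital(1300.0, years=3)
```

Under that assumption, the Cambridge and KCL equipment figures can be turned into implied capital outlays in the same way by calling `implied_capital` with their annual charges.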

14 Findings
Timing: creating a batch of 1000 records costs c. 333 euros. Once 10 years have passed since creation, it may cost 10,000 euros to ‘repair’ a batch of 1000 records with badly created metadata (Digitale Bewaring Project)
Efficiency curve effects: start-up to operational
Economy of scale effects: accession rates of 10 or 60 collections; a 600% increase in accessions will only increase costs by 325% (ULCC)
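
The two numeric findings on this slide can be restated as ratios. This is only an illustrative sketch: the function names are hypothetical, and it assumes the "600%" and "325%" figures are multipliers (6 times the accessions, 3.25 times the cost), which matches the move from 10 to 60 collections but is one reading of the slide rather than something it states.

```python
def cost_multiple(later_cost: float, initial_cost: float) -> float:
    """How many times more expensive the later action is than the initial one."""
    return later_cost / initial_cost


def relative_unit_cost(volume_factor: float, cost_factor: float) -> float:
    """Cost per collection after scaling up, relative to the cost per
    collection before (values below 1.0 indicate economies of scale)."""
    return cost_factor / volume_factor


# Timing: c. 333 euros to create vs. 10,000 euros to repair a batch of 1000 records.
repair_vs_create = cost_multiple(10_000, 333)

# Economy of scale: 10 -> 60 collections, read as 6x accessions and 3.25x cost.
unit_cost_ratio = relative_unit_cost(6.0, 3.25)
```

On these assumptions, repairing a batch costs roughly thirty times what creating it correctly would have, and each collection at the higher accession rate costs a little over half of what it does at the lower rate.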

15 Findings
National subject repositories costs:
– Acquisition and Ingest: c. 42%
– Archival Storage and Preservation: c. 23%
– Access: c. 35%
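
As a rough illustration of how these approximate shares could be applied, the sketch below splits a total budget across the three activity groups. The budget figure and the names `COST_SHARES_PCT` and `split_budget` are hypothetical; only the percentages come from the slide, and they are approximate ("c.") values.

```python
# Approximate cost shares from the slide, in whole percent.
COST_SHARES_PCT = {
    "Acquisition and Ingest": 42,
    "Archival Storage and Preservation": 23,
    "Access": 35,
}


def split_budget(total: float) -> dict[str, float]:
    """Allocate a total budget across activities in proportion to the shares."""
    return {activity: total * pct / 100 for activity, pct in COST_SHARES_PCT.items()}


# Hypothetical annual budget, used only to show the arithmetic.
allocation = split_budget(100_000)
```

The shares sum to 100%, so the allocated amounts add back up to the total budget.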

16 Findings
ADS project of long-term preservation costs
Implications for sustainability via project charges
Preservation interventions (file format migrations)
Long-term storage costs
Assumptions of archive growth (economies of scale)
Assumptions on “first mover innovation”

17 What’s New?
FEC based (absent or only partial in other models), but:
– Requirement for HEIs
– Absence of FEC (a) distorts business cases, e.g. for automation; (b) means in-house and outsourced costs cannot be accurately compared
Not just DIY: application neutral; can cost for in-house archive, full or partial shared service(s), national/subject data centre archive charges
Preservation: archival storage, preservation planning, data management, “first mover innovation”
Tailored for research data: different collection levels, documentation + metadata, products from data, etc.

18 Conclusions

19 Cost Observations for Repositories
Not just formula of function costs
Can illustrate effect of some choices on costs
Sustainable project archive funding model?
Start-up v running costs
Bleeding-edge costs: “first mover innovation”
Audit/capacity planning
Not last word on costs....

20 Questions?

