Can a Consortium Build a Viable Preservation Repository? Presentation at CNI March 31, 2014 Bradley Daigle (APTrust – University of Virginia) Stephen Davis.


1 Can a Consortium Build a Viable Preservation Repository? Presentation at CNI, March 31, 2014. Bradley Daigle (APTrust – University of Virginia), Stephen Davis (Columbia University), Linda Newman (University of Cincinnati), Suzanne Thorin (APTrust – University of Virginia), Scott Turnbull (APTrust – University of Virginia). www.aptrust.org

2 Academic Preservation Trust Academic Preservation Trust, a consortium of 17 institutions, is taking a community approach in building and managing a repository infrastructure that will provide long-term preservation of the scholarly record. APTrust will also be a DPN first node.

3 APTrust Institutions Columbia University; Johns Hopkins University; Indiana University; North Carolina State University; Penn State University; Stanford University; Syracuse University; University of Chicago; University of Cincinnati; University of Connecticut; University of Maryland; University of Miami; University of Michigan; University of North Carolina; University of Notre Dame; University of Virginia; Virginia Tech

4 APTrust is hosted by the University of Virginia, which fully supports 5 ½ staff, including space and equipment. Program Director; Lead Engineer; Junior Engineer; Systems Engineer; Content Lead (1/2 time)

5 Membership Dues Member dues: $20,000 annually. Supports partner meetings, conference travel, contract and cloud services, marketing, and the web site.

6 What is the problem we are trying to solve? Columbia University; University of Cincinnati; University of Virginia

7 Columbia University – Use Case 1 Columbia University Libraries / Information Services has made commitments … to granting agencies to provide long-term digital archiving for digital content created with grant funds; to third-party content creators to provide permanent access to born-digital content acquired from them; to continuing to collect and preserve archival collections, now partly or wholly born-digital content; to permanently preserve University-generated archival and research content

8 Columbia University – Use Case 2 We must preserve the content of … Local Digitization Projects; Preservation-Related Digitization; Institutional Repository / Data Sets; Born Digital Archival Content; Archived Web Sites; Super Dark Archives – highly secure

9

10

11 Columbia University – Questions Why create our own single-institution long-term preservation repository? Why divert scarce existing CUL/IS internal equipment funds to storage on a permanent basis? Why divert scarce existing CUL/IS staff time to creation, enhancement and maintenance of our own local preservation repository, permanently? Why undergo the costs and staff investment in obtaining local TRAC certification?

12 University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: We have digital collections where the original source material has deteriorated or is about to be intentionally destroyed. (Magnetic tapes, nitrate negatives considered flammable). The digital object is THE ONLY object. Magnetic tape image by Daniel P. B. Smith. Released under the GNU Free Documentation License. http://en.wikipedia.org/wiki/File:Magtape1.jpg Nitrate negative from Cincinnati Subway and Street Improvements (digital collection) http://drc.libraries.uc.edu/handle/2374.UC/702759

13 University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: We just moved a repository system from Columbus, Ohio to our Cincinnati campus. 10 TB of data, in 16 different VMDKs (virtual machine disk images), was transferred over the internet. Checksums were created for each VMDK and verified upon receipt, some taking 24 hours to calculate. Checksums were also created for more than one million files, compared with the information in the repository database, and re-compared after the storage format was changed (from VMDK to NFS).

14 University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: (continued) We decided to test a full backup and restore. This took over a week, and we discovered that 16 of our digital assets were corrupt. We diagnosed the cause, adjusted, and repeated without error, but if we had not been comparing before and after checksums of all files we would not have known about the corruption. This process took 1.5 months and offered a striking example of the care that must be taken to avoid losing data when moving large amounts of it.
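The before-and-after checksum comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the University of Cincinnati's actual tooling: the function names (`sha256_of`, `build_manifest`, `compare_manifests`) are hypothetical, and SHA-256 is an assumption, since the slides do not say which checksum algorithm was used.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before: dict, after: dict) -> dict:
    """Report files that vanished, appeared, or changed between two manifests."""
    shared = before.keys() & after.keys()
    return {
        "missing": sorted(before.keys() - after.keys()),
        "unexpected": sorted(after.keys() - before.keys()),
        "corrupt": sorted(name for name in shared if before[name] != after[name]),
    }


if __name__ == "__main__":
    # Simulate a restore in which one file comes back altered.
    with tempfile.TemporaryDirectory() as tmp:
        source, restored = Path(tmp, "source"), Path(tmp, "restored")
        source.mkdir()
        restored.mkdir()
        (source / "a.txt").write_bytes(b"intact")
        (source / "b.txt").write_bytes(b"original bytes")
        (restored / "a.txt").write_bytes(b"intact")
        (restored / "b.txt").write_bytes(b"damaged bytes")
        print(compare_manifests(build_manifest(source), build_manifest(restored)))
```

Building the manifest before the move and again after the restore is what makes silent corruption visible; without the first manifest there is nothing to compare against.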

15 University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: Our credibility is at stake. We want to be believed. Photograph; President Nixon with Elvis Presley; 20 Dec 1970; Richard Nixon Presidential Library and Museum, Yorba Linda, California. http://www.nixonlibrary.gov/forresearchers/find/av/photo/images/12_20_70_3.gif

16 University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: (continued) We are promoting a new digital repository to our faculty. Its raison d'être – why researchers should deposit their digital assets in this repository rather than or in addition to several short-term delivery systems on our campus – is long term persistence. We have promised that their assets will also be preserved in a dark archive such as the Academic Preservation Trust. We have stated that preservation means bit-level integrity and format migration. We have asserted that the Libraries’ traditional mission of preservation of the cultural record now applies to the digital scholarly record.

17 University of Virginia Use Case Integral part of our preservation and curatorial landscape Soup to nuts process for analogue materials ◦ Selection ◦ Digitization ◦ Management ◦ Stewardship

18 UVa - continued Born Digital ◦ It is all about transfer ◦ Disk images awaiting arrangement ◦ Need an I/O space ◦ Digital Scholarship: Wish we had this years ago

19 UVa Landscape Local disk (please only temporary) / scratch disk; Spinning disk – still only backup; Local HSM – local tape backup; APTrust – more robust preservation actions; DPN – dark archive

20 Basic Technology Goals Simple submission packaging – BagIt; Strong Chain of Custody – Logging; Format agnostic basic preservation – Fixity; Strong auditing and reporting – PREMIS; Easily reference items between systems – Identifiers; Simple distribution package for restoration – BagIt
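To make the BagIt and fixity goals concrete, here is a minimal sketch of packaging a directory as a bag and later checking its fixity, using only the Python standard library. It follows the basic BagIt layout (a `bagit.txt` declaration, a `data/` payload directory, and a `manifest-sha256.txt` payload manifest). The function names `make_bag` and `validate_bag` are hypothetical, the BagIt version string and the choice of SHA-256 are assumptions, and none of this reflects APTrust's specific bag profile, which the slides do not describe.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_bag(source_dir: Path, bag_dir: Path) -> None:
    """Copy source_dir into bag_dir/data and write the bag declaration and manifest."""
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # creates bag_dir if it does not exist
    # Version string is an assumption; use whatever the receiving repository expects.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )
    lines = [
        f"{sha256_of(p)}  {p.relative_to(bag_dir).as_posix()}\n"
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    ]
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")


def validate_bag(bag_dir: Path) -> list:
    """Recompute payload checksums and list every file that is missing or altered."""
    problems = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha256_of(target) != expected:
            problems.append(f"fixity mismatch: {rel_path}")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp, "collection")
        source.mkdir()
        (source / "File1").write_bytes(b"first preservation file")
        bag = Path(tmp, "bag")
        make_bag(source, bag)
        print(validate_bag(bag))  # an empty list means every payload file verified
```

Because the same package format serves both submission and restoration, the same validation routine can run on the way in and on the way out.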

21 Flow of Content in APTrust [Diagram] Submission Bag: Metadata (TagFiles) and Preservation Files (data/File1, data/File2, data/File3). Ingest: Break apart bag and manage as separate Fedora objects. Related Fedora Objects: Intellectual Object, Generic File1, Generic File2, Generic File3. Restore: Repackage to same bag format. DPN Bag: Bagged separately in DPN to support versioning.

22 Challenges Abstracting away from specific repository software; Identifying content across distributed systems; Scaling solutions are still a mixed bag; Managing dependencies in a consortium; Deleting content requires some more work

23 Sustainability of Service Common development frameworks – Hydra; Use available cloud services – AWS; Align with evolving preservation ecosystem – OAIS & DDP ◦ Fedora 4 ◦ Standards like OAIS and DDP

24 APTrust and TRAC Certification APTrust is committed to working toward TRAC certification. APTrust is the first ever repository to be built from the ground up taking TRAC into account. A Certification Working Group has been established and will be advising and consulting with the APTrust staff and partners on TRAC objectives. Initial development work is proceeding at the level of Digital Object Management and Infrastructure.

25 Examples of TRAC Requirements “The repository shall have an appropriate succession plan, contingency plans, and/or escrow arrangements in place in case the repository ceases to operate or the governing or funding institution substantially changes its scope.” “The repository shall have short- and long-term business planning processes in place to sustain the repository over time.” “The repository shall have contracts or deposit agreements which specify and transfer all necessary preservation rights, and those rights transferred shall be documented.” “The repository shall have the appropriate number of staff to support all functions and services.” “The repository shall have and use a convention that generates persistent, unique identifiers.”

26 Academic Preservation Trust – part of the evolving national digital preservation infrastructure “The Task Force envisions the development of a national system of digital archives, which it defines as repositories of digital information that are collectively responsible for the long-term accessibility of the nation’s social, economic, cultural and intellectual heritage instantiated in digital form.” Preserving Digital Information. Report of the Task Force on Archiving of Digital Information, commissioned by The Commission on Preservation and Access and the Research Libraries Group. May 1, 1996. Executive Summary, iii.

