Can a Consortium Build a Viable Preservation Repository? Presentation at CNI March 31, 2014 Bradley Daigle (APTrust – University of Virginia) Stephen Davis.

Presentation transcript:

Can a Consortium Build a Viable Preservation Repository? Presentation at CNI, March 31, 2014. Bradley Daigle (APTrust – University of Virginia); Stephen Davis (Columbia University); Linda Newman (University of Cincinnati); Suzanne Thorin (APTrust – University of Virginia); Scott Turnbull (APTrust – University of Virginia)

Academic Preservation Trust Academic Preservation Trust, a consortium of 17 institutions, is taking a community approach in building and managing a repository infrastructure that will provide long-term preservation of the scholarly record. APTrust will also be a DPN first node.

APTrust Institutions: Columbia University; Johns Hopkins University; Indiana University; North Carolina State University; Penn State University; Stanford University; Syracuse University; University of Chicago; University of Cincinnati; University of Connecticut; University of Maryland; University of Miami; University of Michigan; University of North Carolina; University of Notre Dame; University of Virginia; Virginia Tech

APTrust is hosted by the University of Virginia, which fully supports 5 ½ staff, including space and equipment. Program Director; Lead Engineer; Junior Engineer; Systems Engineer; Content Lead (1/2 time)

Membership Dues. Member dues: $20,000 annually. Dues support partner meetings, conference travel, contract and cloud services, marketing, and the web site.

What is the problem we are trying to solve? Columbia University; University of Cincinnati; University of Virginia

Columbia University – Use Case 1. Columbia University Libraries / Information Services has made commitments: to granting agencies, to provide long-term digital archiving for digital content created with grant funds; to third-party content creators, to provide permanent access to born-digital content acquired from them; to continue collecting and preserving archival collections, now partly or wholly born-digital; to permanently preserve University-generated archival and research content.

Columbia University – Use Case 2. We must preserve the content of: Local Digitization Projects; Preservation-Related Digitization; Institutional Repository / Data Sets; Born Digital Archival Content; Archived Web Sites; Super Dark Archives – highly secure

Columbia University – Questions Why create our own single-institution long-term preservation repository? Why divert scarce existing CUL/IS internal equipment funds to storage on a permanent basis? Why divert scarce existing CUL/IS staff time to creation, enhancement and maintenance of our own local preservation repository, permanently? Why undergo the costs and staff investment in obtaining local TRAC certification?

University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: We have digital collections where the original source material has deteriorated or is about to be intentionally destroyed (magnetic tapes; nitrate negatives, which are considered flammable). The digital object is THE ONLY object. Image credits: Magnetic tape image by Daniel P. B. Smith, released under the GNU Free Documentation License. Nitrate negative from Cincinnati Subway and Street Improvements (digital collection).

University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: We just moved a repository system from Columbus, Ohio to our Cincinnati campus. 10 TB of data, in 16 different VMDKs (virtual machine disk images), were transferred over the internet. Checksums were created for each VMDK and verified upon receipt, some taking 24 hours to calculate. Checksums were also created for one-million+ files, compared with information in the repository database, and re-compared after the storage format was changed (from VMDK to NFS).

University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: (continued) We decided to test a full backup and restore. This took over a week, and we discovered that 16 of our digital assets were corrupt. We diagnosed the cause, adjusted, and repeated without error, but if we had not been comparing before-and-after checksums of all files we would not have known about the corruption. This process took 1.5 months and offered a striking example of the care that must be taken to avoid losing data when moving large amounts of it.
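The before-and-after checksum comparison described in this use case can be sketched in a few lines of Python. This is a minimal illustration only, not the tooling the University of Cincinnati used: the function names (`file_checksum`, `build_manifest`, `compare_manifests`) and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_checksum(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before, after):
    """Report files that disappeared, appeared, or changed content."""
    return {
        "missing": sorted(set(before) - set(after)),
        "added": sorted(set(after) - set(before)),
        "changed": sorted(
            name for name in set(before) & set(after)
            if before[name] != after[name]
        ),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "asset.tif").write_bytes(b"original bits")
        before = build_manifest(root)
        # Simulate silent corruption during a move or restore.
        (root / "asset.tif").write_bytes(b"original bitz")
        print(compare_manifests(before, build_manifest(root)))
```

A manifest taken before a migration or restore and another taken afterwards are all the comparison needs; any entry under "changed" is a file whose bits no longer match.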

University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: Our credibility is at stake. We want to be believed. Photograph; President Nixon with Elvis Presley; 20 Dec 1970; Richard Nixon Presidential Library and Museum, Yorba Linda, California.

University of Cincinnati – Use Case Question: Why is digital preservation important to us? Answer: (continued) We are promoting a new digital repository to our faculty. Its raison d'être, the reason researchers should deposit their digital assets in this repository rather than (or in addition to) several short-term delivery systems on our campus, is long-term persistence. We have promised that their assets will also be preserved in a dark archive such as the Academic Preservation Trust. We have stated that preservation means bit-level integrity and format migration. We have asserted that the Libraries’ traditional mission of preservation of the cultural record now applies to the digital scholarly record.

University of Virginia Use Case Integral part of our preservation and curatorial landscape Soup to nuts process for analogue materials ◦ Selection ◦ Digitization ◦ Management ◦ Stewardship

UVa - continued Born Digital ◦ It is all about transfer ◦ Disk images awaiting arrangement ◦ Need and I/O space ◦ Digital Scholarship  Wish we had this years ago

UVa Landscape: Local disk (please only temporary) / scratch disk; Spinning disk – still only backup; Local HSM – local tape backup; APTrust – more robust preservation actions; DPN – dark archive

Basic Technology Goals: Simple submission packaging – BagIt; Strong chain of custody – Logging; Format-agnostic basic preservation – Fixity; Strong auditing and reporting – PREMIS; Easily reference items between systems – Identifiers; Simple distribution package for restoration – BagIt
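As a rough illustration of how BagIt packaging and fixity checking fit together, the sketch below writes a minimal bag (a `bagit.txt` declaration, a `data/` payload directory, and a checksum manifest) and then re-verifies it. It is not APTrust's implementation: `make_bag` and `validate_bag` are hypothetical helper names, the declared BagIt version and the SHA-256 manifest are assumptions, and only the Python standard library is used.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir, payload):
    """Write a minimal bag. payload maps relative file names to bytes."""
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in sorted(payload.items()):
        target = data_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        checksum = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{checksum}  data/{name}")
    # Bag declaration; the version string is an assumption for this sketch.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


def validate_bag(bag_dir):
    """Return payload paths that are missing or fail their fixity check."""
    bag_dir = Path(bag_dir)
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    failures = []
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(None, 1)
        target = bag_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)
        elif hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            failures.append(rel_path)
    return failures


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = make_bag(Path(tmp) / "example_bag", {"File1": b"one", "File2": b"two"})
        print(validate_bag(bag))  # an empty list means every payload file verified
```

Because the manifest travels with the payload, the same check can run at submission, after ingest, and again when a bag is restored.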

Flow of Content in APTrust (diagram). Submission Bag: Metadata (TagFiles) and Preservation Files (data/File1, data/File2, data/File3). Ingest: Break apart bag and manage as separate Fedora objects. Related Fedora Objects: Intellectual Object; Generic File1; Generic File2; Generic File3. Restore: Repackage to same bag format. DPN Bag: Bagged separately in DPN to support versioning.

Challenges: Abstracting away from specific repository software; Identifying content across distributed systems; Scaling solutions are still a mixed bag; Managing dependencies in a consortium; Deleting content requires some more work

Sustainability of Service: Common development frameworks – Hydra; Use available cloud services – AWS; Align with evolving preservation ecosystem ◦ Fedora 4 ◦ Standards like OAIS and DDP

APTrust and TRAC Certification. APTrust is committed to working toward TRAC certification. APTrust is the first ever repository to be built from the ground up taking TRAC into account. A Certification Working Group has been established and will be advising and consulting with the APTrust staff and partners on TRAC objectives. Initial development work is proceeding at the level of Digital Object Management and Infrastructure.

Examples of TRAC Requirements “The repository shall have an appropriate succession plan, contingency plans, and/or escrow arrangements in place in case the repository ceases to operate or the governing or funding institution substantially changes its scope.” “The repository shall have short- and long-term business planning processes in place to sustain the repository over time.” “The repository shall have contracts or deposit agreements which specify and transfer all necessary preservation rights, and those rights transferred shall be documented.” “The repository shall have the appropriate number of staff to support all functions and services.” “The repository shall have and use a convention that generates persistent, unique identifiers.”

Academic Preservation Trust – part of the evolving national digital preservation infrastructure “The Task Force envisions the development of a national system of digital archives, which it defines as repositories of digital information that are collectively responsible for the long-term accessibility of the nation’s social, economic, cultural and intellectual heritage instantiated in digital form.” Preserving Digital Information. Report of the Task Force on Archiving of Digital Information, commissioned by The Commission on Preservation and Access and the Research Libraries Group. May 1, Executive Summary, iii.