Chronopolis – MetaArchive Improving and Strengthening Inter-Institutional Preservation.

1 Chronopolis – MetaArchive Improving and Strengthening Inter-Institutional Preservation

2 From Silos to Interoperability
Digital preservation is still an emerging field
Two successful approaches:
  – Integrated Rule-Oriented Data System (iRODS)
  – Lots of Copies Keep Stuff Safe (LOCKSS)
Powerful technologies, currently isolated
Seeking to bridge the gap and foster interoperability

3 Presentation sections
Chronopolis Program overview
MetaArchive Cooperative overview
Current and proposed work to automate the exchange of data between the systems

4 Chronopolis Basic Facts
Three-node federated data grid at UCSD/SDSC, NCAR and UMIACS, with capacity for up to 50 TB of data per node (150 TB total)
Using the Storage Resource Broker (SRB) for data management (moving to iRODS)
Using the BagIt file packaging format and SRB tools to ingest and transfer data
Using the Audit Control Environment (ACE) for integrity checking

5 Current Chronopolis collections
Spring 2010 Data Providers:
  – Inter-university Consortium for Political and Social Research: preservation copy of collections including 40 years of social science data and Census
  – California Digital Library: political and government web crawls, Web-at-risk collection
  – SIO Explorer: data from 50 years of research voyages
  – NCSU Libraries: state and local geospatial data
http://chronopolis.sdsc.edu

6 MetaArchive Basic Facts
Established in 2004, preserving content for 15 member institutions
Uses LOCKSS software to provide long-term care for materials in a distributed digital preservation network
Sustainable organizational framework: membership organization with a 501(c)(3) host (Educopia Institute)
254 TB network capacity (adding more as new members join)
Compliant as a Trustworthy Digital Repository (2009 TRAC audit available on our site)

7 MetaArchive Collections
Current Members/Contributors, Spring 2010:
  – Auburn University
  – Boston College
  – Clemson University
  – Florida State University
  – Folger Shakespeare Library
  – Georgia Tech
  – Library of Congress
  – Penn State University
  – PUC Rio de Janeiro
  – Rice University
  – University of Hull
  – University of Louisville
  – University of North Texas
  – University of South Carolina
  – Virginia Tech
Current Affiliates:
  – Library of Congress
  – NDLTD
  – SDSC Chronopolis
We welcome new members!

8 Collaboration Roadmap
Chronopolis and MetaArchive both see the value of inter-institutional preservation
Have been pursuing it informally
Looking at ways of formalizing this process for long-term preservation goals

9 The Plan
Develop tools and methods to automate exchange of data between MetaArchive Cooperative (LOCKSS-based) and Chronopolis (iRODS-based)
Examine data transfer tools/protocols from:
  – California Digital Library micro-services
  – iRODS protocols for data transfer
  – LOCKSS “plug-in” approach for data transfer
Goal: a highly robust, easy-to-use preservation “system,” allowing digital objects to be shared between several major preservation networks in the U.S.

10 Focus Issues
What does it mean to unite systems?
Ability to export data between systems
  – Verify appropriate fixity
  – Transparency for system administrators
Ability to track collections between systems
  – Verify collections are retrievable
  – Verify collections retain original characteristics
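The fixity and retrievability checks above can be illustrated with a small sketch. It is not the tooling used by either project: it assumes, purely for illustration, that both copies of a collection are visible as local directory trees, and the function names and the choice of SHA-256 are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def checksum(path: Path, algorithm: str = "sha256") -> str:
    """Hash a file in 1 MiB chunks so large objects do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): checksum(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(source: Path, replica: Path) -> Dict[str, List[str]]:
    """Report files that are missing, unexpected, or changed in the replica."""
    src, dst = inventory(source), inventory(replica)
    return {
        "missing": sorted(set(src) - set(dst)),      # not retrievable from replica
        "unexpected": sorted(set(dst) - set(src)),   # present only in replica
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

An empty report for all three keys would indicate that the replica is retrievable and bit-identical to the source under these assumptions.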

11 Technical Issues
What are the best ways to have an SRB/iRODS datagrid and a LOCKSS PLN interact?
What does it mean to have an active system (MetaArchive) and an archival system (Chronopolis) work together?
What are the appropriate transfer technologies?
  – iRODS and LOCKSS native tools
  – CDL Micro-services, e.g. BagIt

12 The Process
Identify the atomic units in our process
  – E.g. ingest, verification, data transfer, fixity checking
Identify commonalities and differences
Resolve needed issues

13 Transfer technology: BagIt
Hierarchical file packaging format for exchanging digital content
  – There is no software to install
  – Consists of a base directory with a manifest file and a subdirectory with the content
  – The manifest file has a row for each content file, giving the full path in the content directory and a checksum for the file
“Holey” Bags
  – Have an additional ‘fetch.txt’ file in the base directory and an empty content directory
  – URLs for each content file are listed in the fetch.txt file
  – Can reduce transfer time by fetching content in parallel
http://www.digitalpreservation.gov/library/resources/tools/docs/bagitspec.pdf
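As a rough illustration of the bag layout described above, the following sketch writes and checks a manifest and reads a fetch.txt file. It assumes the conventions of the BagIt specification (payload under `data/`, a `manifest-md5.txt` with one "checksum path" row per file, and fetch.txt rows of "URL length path"); the helper names are hypothetical, and the `bagit.txt` declaration file the specification also requires is omitted for brevity.

```python
import hashlib
from pathlib import Path
from typing import List, Optional, Tuple


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bag_dir: Path) -> Path:
    """Write manifest-md5.txt with one row per file under bag_dir/data."""
    rows = [
        f"{md5_of(p)}  {p.relative_to(bag_dir).as_posix()}"
        for p in sorted((bag_dir / "data").rglob("*"))
        if p.is_file()
    ]
    manifest = bag_dir / "manifest-md5.txt"
    manifest.write_text("\n".join(rows) + "\n", encoding="utf-8")
    return manifest


def verify_bag(bag_dir: Path) -> List[str]:
    """Return the manifest paths that are missing or fail their checksum."""
    failures = []
    text = (bag_dir / "manifest-md5.txt").read_text(encoding="utf-8")
    for row in text.splitlines():
        if not row.strip():
            continue
        expected, rel_path = row.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file() or md5_of(target) != expected:
            failures.append(rel_path)
    return failures


def parse_fetch(bag_dir: Path) -> List[Tuple[str, Optional[int], str]]:
    """Read fetch.txt of a holey bag as (url, length or None, path) entries."""
    fetch_file = bag_dir / "fetch.txt"
    if not fetch_file.is_file():
        return []
    entries = []
    for row in fetch_file.read_text(encoding="utf-8").splitlines():
        if not row.strip():
            continue
        url, length, rel_path = row.split(maxsplit=2)
        entries.append((url, None if length == "-" else int(length), rel_path))
    return entries
```

Because each fetch.txt entry is independent, a receiver could hand the parsed list to a pool of download workers, which is where the parallel-transfer saving would come from.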

14 Initial development goals
XML-standardized representation of common technical data that needs to be tracked for exchange and preservation of data and metadata
Ingestion reference model and framework to enable automated and interoperable capture of metadata from files in MetaArchive and Chronopolis
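The slides do not define the XML representation, so the following is only a guess at what a minimal technical-metadata record might look like. Every element and attribute name here (`technicalRecord`, `fileName`, `sizeBytes`, `fixity`, `collection`) is hypothetical and not taken from either project.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def technical_record(path: Path, collection: str) -> ET.Element:
    """Build a hypothetical XML record of the technical data for one file."""
    payload = path.read_bytes()
    record = ET.Element("technicalRecord", {"collection": collection})
    ET.SubElement(record, "fileName").text = path.name
    ET.SubElement(record, "sizeBytes").text = str(len(payload))
    # The checksum is recorded with its algorithm so either system can re-verify it.
    fixity = ET.SubElement(record, "fixity", {"algorithm": "sha256"})
    fixity.text = hashlib.sha256(payload).hexdigest()
    return record


def to_xml(record: ET.Element) -> str:
    """Serialize the record as a Unicode XML string."""
    return ET.tostring(record, encoding="unicode")
```

A record like this could be captured at ingest in one system and compared against the values recomputed after transfer to the other.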

15 Procedural Issues
What exactly are the inter-institutional ties?
  – “Just” backup?
  – Added service for our customers/members?
  – Will all customers want this?
Legal issues with data owners
MetaArchive and Chronopolis have very different management approaches. How do cross-institutional decisions get made?

16 Organizational Issues
Having a “seat at the table” at meetings and planning processes
Working together on staffing and hiring
Working together to identify customers and new opportunities

17 The Big Win
Important data preservation demonstration
  – No single system can solve all problems
  – No single system appeals to all user needs
Practical, useful process for our organizations
  – Makes us individually stronger
  – Provides LOCKSS and iRODS systems with exit strategies if they ever prove necessary
  – Enables tools built for one system to be used by both

18 Contacts
MetaArchive: http://www.metaarchive.org/
Chronopolis: http://chronopolis.sdsc.edu/
Katherine Skinner: katherine.skinner@metaarchive.org
David Minor: minor@sdsc.edu

