Towards smart storage for repository preservation services
Steve Hitchcock, David Tarrant, Adrian Brown, Ben O’Steen, Neil Jefferies and Leslie Carr


Towards smart storage for repository preservation services
Steve Hitchcock, David Tarrant, Adrian Brown (1), Ben O’Steen (2), Neil Jefferies (2) and Leslie Carr
Preserv 2 Project, School of Electronics and Computer Science, University of Southampton
(1) The National Archives, Kew
(2) Oxford University Library
iPRES 2008: The Fifth International Conference on Preservation of Digital Objects, London, September 2008

Three-stage strategy for keeping your data safe
Ability to move data freely, easily and instantly
– OAI, ORE, Atom
Reliable, trusted large-scale storage
– Open Storage
Risk profiling: invoke a range of selectable services
– Smart storage

About institutional repositories
Institutional repositories (IRs) are set up by institutions of higher education and research to manage and disseminate their digital intellectual outputs.
IRs are a special type of Web site, typically based on repository software that presents a database of records pointing to the objects deposited.
The Preserv 2 project is investigating the provision of preservation services for IRs.

IRs in flux
Uncertainty in terms of target content (published papers, theses, research data, teaching materials), policy, rights, even locus of content and responsibility for long-term management.
OAI-ORE (Object Reuse and Exchange) effectively frees the data from being captive to repository software.
Commercial repository services range from software-specific services to digital library services or more general 'cloud' or network storage services.
Photo: Flickr/cpikas

IRs are:
– Open source repository software
– Open access content
– Open archives, using OAI-PMH to share data with e.g. discovery services
– Open repositories, using OAI-ORE to enable the easy movement of data between different types of repository software
Photo: Flickr/Rightee

A new ‘open’: how open storage supports preservation services
Open storage: large-scale storage devices based on open source software.
Open storage averts the need for a repository layer to access first-class objects, that is, objects that can be addressed directly.
– In turn, these digital objects can be distributed and/or replicated over many open storage platforms.
– In turn, it becomes possible to select storage with built-in preservation support.
– Resilient storage platforms may be viable for preservation services aimed at multiple repositories.
E.g. Sun Microsystems STK5800 (codenamed Honeycomb)
Google Repository

Smart storage
Smart storage combines an underlying passive storage approach with the intelligence provided through services.
The key to realising smart storage is to enable the services to communicate and share information with the digital content sources they may be acting on.
This is done through machine-level application programming interfaces (APIs) and protocols.

APIs, interfaces and the Web architecture
Major services on the Web deploy their own simple, but different, APIs, e.g.
– Google Maps
– Within the repository community, SWORD (Simple Web-service Offering Repository Deposit)
– Open storage platforms such as Sun's STK5800 and the Amazon Simple Storage Service (S3)
To take advantage of open storage, repositories have to be able to talk to these services through their APIs.

Smart storage example: format services
Preservation methods affecting formats can be classified in three stages (‘seamless flow’):
– Format identification and characterization (which format?)
– Preservation planning and technology watch (format risk and implications)
– Preservation action, migration, etc. (what to do with the format)
Format-based services tend to be ad hoc processes for which some tools are available:
– E.g. PRONOM-DROID from The National Archives (UK)
– PRONOM is an online registry of technical information, such as file format signatures
– DROID is a downloadable file format identification tool that applies these signatures
These and other tools could be used in a more coordinated manner.
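The first stage above, format identification by signature, can be illustrated with a minimal sketch. It assumes a small hand-written table of well-known file headers; it is not DROID's algorithm and the table is not taken from PRONOM, whose signatures are richer than simple prefix matches. The names SIGNATURES and identify_format are hypothetical.

```python
# Illustrative signature table: (format name, byte offset, magic bytes).
# These are widely known file headers, NOT entries copied from PRONOM.
SIGNATURES = [
    ("PDF", 0, b"%PDF-"),
    ("PNG", 0, b"\x89PNG\r\n\x1a\n"),
    ("GIF", 0, b"GIF8"),
    ("ZIP container", 0, b"PK\x03\x04"),
]


def identify_format(data: bytes) -> str:
    """Return the first format whose signature matches, else 'unknown'."""
    for name, offset, magic in SIGNATURES:
        # Compare the bytes at the expected offset with the signature.
        if data[offset:offset + len(magic)] == magic:
            return name
    return "unknown"


if __name__ == "__main__":
    print(identify_format(b"%PDF-1.4 example content"))
```

An object that matches no signature is reported as 'unknown', which is itself useful input for the later planning stage.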

Smart storage DROID: concept

Smart storage DROID: scheduling/history
A scheduling interface controls when a DROID classification needs to be performed.
Preserv 2 has developed a scheduling service that uses the Darwin Calendar Server and the iCalendar format.
This provides a powerful scheduling service with many clients already available (Apple iCal, Mozilla Sunbird, and others) that can read and interpret the files so that past and future events can be reviewed.
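As a rough illustration of what a scheduled classification might look like in the iCalendar format, the sketch below builds a single VEVENT by hand. The function name, UID, PRODID and summary text are assumptions made for this example; the slides do not describe the events Preserv 2 actually stores, and the sketch does not escape special characters in text values.

```python
from datetime import datetime, timezone


def make_classification_event(uid: str, start: datetime, summary: str) -> str:
    """Build a minimal iCalendar (RFC 5545) document holding one VEVENT.

    `start` should be timezone-aware; it is converted to UTC.
    """
    fmt = "%Y%m%dT%H%M%SZ"
    start_utc = start.astimezone(timezone.utc)
    stamp = datetime.now(timezone.utc)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//format-scheduler sketch//EN",  # hypothetical id
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp.strftime(fmt)}",
        f"DTSTART:{start_utc.strftime(fmt)}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    when = datetime(2008, 9, 29, 9, 0, tzinfo=timezone.utc)
    print(make_classification_event("scan-1@example.org", when,
                                    "DROID classification"))
```

Because the result is plain iCalendar text, any calendar client that reads the format could display the past and future events.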

Smart storage DROID: OAI-PMH interface
An OAI-PMH interface to open storage discovers the latest objects to have been deposited and which are ready for format classification.
This could also be performed by simpler RSS or Atom-based methods.
The interface has since been expanded to allow export of OAI-ORE resource maps in both RDF and Atom formats.
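A minimal sketch of how a harvester could discover recently deposited objects over OAI-PMH: it builds a ListIdentifiers request with a from date and reads identifiers and datestamps out of the response. The base URL and the sample response are invented for illustration, and the helper names are hypothetical; this is not the Preserv 2 DROID-OAI harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_identifiers_url(base_url: str, from_date: str,
                         prefix: str = "oai_dc") -> str:
    """Build a ListIdentifiers request for records changed since from_date."""
    query = urlencode({"verb": "ListIdentifiers",
                       "metadataPrefix": prefix,
                       "from": from_date})
    return f"{base_url}?{query}"


def parse_headers(xml_text: str) -> list[tuple[str, str]]:
    """Return (identifier, datestamp) pairs, skipping deleted records."""
    root = ET.fromstring(xml_text)
    found = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        if header.get("status") == "deleted":
            continue
        identifier = header.findtext("oai:identifier", default="",
                                     namespaces=OAI_NS)
        datestamp = header.findtext("oai:datestamp", default="",
                                    namespaces=OAI_NS)
        found.append((identifier, datestamp))
    return found


# Invented sample response, trimmed to the elements the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2008-09-01</datestamp>
    </header>
    <header status="deleted">
      <identifier>oai:example.org:2</identifier>
      <datestamp>2008-09-02</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_identifiers_url("http://example.org/oai", "2008-09-01"))
    print(parse_headers(SAMPLE_RESPONSE))
```

A real harvester would also need to follow resumption tokens for long result lists, which this sketch leaves out.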

Smart storage DROID: implementation
[Architecture diagram. Components: user interface (e.g. iCal, Outlook, Sunbird); machine interface, API; scheduler (calendar server); DROID; DROID-OAI harvester; open storage (OAI-PMH); Web server (HTTP), which stores results of DROID events; history; messaging; repository (Atom?). Interactions: schedule event; is event done?; get results of event (url, date). Legend: implemented / to be implemented.]

Risk profiling
The scheduler will invoke actions based on the results of scanning by DROID, allied to decision-making tools that use intelligence from planning and technology watch tools, such as
– PRONOM,
– the Plato preservation planning tool from the EC-funded Planets project,
– and others.
Photo: Flickr/yourbartender
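The decision step could be sketched roughly as follows: identified formats are looked up in a risk table and mapped to an action. The risk scores, thresholds and action names here are entirely hypothetical placeholders; in the approach described they would come from planning and technology watch tools rather than a hard-coded table.

```python
from dataclasses import dataclass

# Hypothetical risk scores (0 = low, 1 = high); NOT from PRONOM or Plato.
FORMAT_RISK = {"PDF": 0.2, "PNG": 0.1, "legacy word processor": 0.8,
               "unknown": 0.9}


@dataclass
class Action:
    object_id: str
    action: str


def profile(object_formats: dict[str, str],
            migrate_above: float = 0.7,
            watch_above: float = 0.3) -> list[Action]:
    """Map each object's identified format to a suggested action."""
    actions = []
    for object_id, fmt in sorted(object_formats.items()):
        # Formats missing from the table are treated like 'unknown'.
        risk = FORMAT_RISK.get(fmt, FORMAT_RISK["unknown"])
        if risk > migrate_above:
            action = "review-for-migration"
        elif risk > watch_above:
            action = "watch"
        else:
            action = "none"
        actions.append(Action(object_id, action))
    return actions


if __name__ == "__main__":
    for item in profile({"obj-1": "PDF", "obj-2": "unknown"}):
        print(item)
```

Keeping the thresholds as parameters reflects the idea of selectable services: each repository could set its own tolerance for risk.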

Summary: smart storage in the storage scheme
Binary stream
File system: need to store multiple streams with permissions
Content addressable: adds content validation and object identifiers, metadata required to locate an object
Open: adds error correction and recovery, places processing close to storage, solves some bandwidth problems
Smart: opens up the close-to-storage approach for application development, transition to 'cloud' storage
How smart storage addresses current storage issues: see full paper
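To make the content-addressable step concrete, here is a minimal in-memory sketch in which the object identifier is derived from the content itself, so that every read can validate what was stored. The class and its use of SHA-256 are assumptions for illustration only and do not describe how the STK5800 or any other platform works.

```python
import hashlib


class ContentAddressableStore:
    """Toy store in which an object's identifier is the hash of its bytes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data and return its content-derived identifier."""
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        """Return the data, checking that it still matches its identifier."""
        data = self._objects[object_id]
        if hashlib.sha256(data).hexdigest() != object_id:
            raise ValueError("stored content no longer matches its identifier")
        return data


if __name__ == "__main__":
    store = ContentAddressableStore()
    oid = store.put(b"deposited paper")
    print(oid, store.get(oid) == b"deposited paper")
```

Because the identifier says nothing about what the object is, separate metadata is still needed to locate it, as the summary notes.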

Storage can become smarter
Openness in its various forms, including the ability to move data freely and easily, needs to be supplemented by decision-making that can be automated based on the supplied intelligence and information.
In this way, open storage can become ‘smarter’.
Thanks to