Hydra, research data and Archivematica


Julie Allinson (York) and Richard Green (Hull)
Hydra Connect 2016, Boston, MA, 5th October 2016

Research data management
Why do we need digital preservation for research data?
- Often unique and irreplaceable
- May be needed to validate conclusions reached in publications
- May have a high-level mandate to do so
- May have potential for re-use
Doesn’t our repository do that?
- Many (most?) repositories are designed for medium-term curation and access rather than long-term preservation
- Preservation implies actively taking steps to increase the chances of enabling meaningful re-use in the future

Enter Archivematica
- Archivematica is “a web- and standards-based, open-source application which allows your institution to preserve long-term access to trustworthy, authentic and reliable digital content.”
- “…in compliance with the ISO-OAIS functional model.”
- York and Hull were both keen to see how Archivematica might fit into our preservation workflows

Jisc “Research Data Spring” funding
- In late 2014 Jisc made grant funding available for universities to investigate ways of managing research data (to satisfy a Government mandate)
- Hull and York were jointly awarded money to look at the role Archivematica might play in an RDM workflow: “Filling the Digital Preservation Gap”

Jisc “Research Data Spring” funding
This led to a three-phase project, 2015-2016 (each phase bid for separately!):
- Phase 1: Desk research
- Phase 2: Work with the Archivematica and PRONOM teams to make Archivematica a better fit to the need (the project actively funded development work) and to develop local implementation plans for integrated systems
- Phase 3: Further improvements (particularly to file format identification via DROID) and building proof-of-concept (p-o-c) systems at Hull and York

Jisc Research Data Shared Service
- Almost in parallel with the Research Data Spring projects, Jisc were planning a Research Data Shared Service
- The resulting system will be managed and hosted, and will offer three core modules: repository, preservation and reporting
- The Phase 1 and 2 reports from Hull and York were very influential for the preservation module
- Commercial and open source offerings for each module, including Archivematica (for preservation) and Hydra (for the repo)
- Over 20 pilot institutions recruited (including York), all of which identified preservation as a priority

York p-o-c implementation
York wanted to provide:
- an easy way of depositing data
- a way for RDM staff to monitor datasets
- a way of requesting access to data
with:
- data sent to Archivematica
- dataset metadata pulled from PURE

York p-o-c implementation
- Metadata from PURE pulled in nightly or on-demand
- Visual representation of workflow status
- Fedora objects created for the dataset to store local admin info and help connect the PURE and Archivematica records

York PCDM modelling
- Dataset = Dataset record from PURE
- A Dataset can be made up of multiple ‘Packages’ of data, e.g. a newer version
- Individual data files are stored, but the folder structure is not
- The folder structure is available in the Archivematica METS
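The Dataset / Package relationship described on this slide can be sketched as plain Python data classes. This is an illustration only, not the York Hydra/PCDM model itself: the class and field names (pure_id, aip_uuid, file_names) and both helper methods are hypothetical. The sketch keeps file names but drops folder paths, mirroring the point that folder structure lives in the Archivematica METS rather than in the repository.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List, Optional


@dataclass
class Package:
    """One deposited package of data belonging to a dataset (hypothetical shape)."""
    aip_uuid: str  # assumed link to the Archivematica record
    file_names: List[str] = field(default_factory=list)

    @classmethod
    def from_paths(cls, aip_uuid: str, paths: List[str]) -> "Package":
        # Keep only the file names: the folder structure is deliberately
        # not stored here.
        return cls(aip_uuid, [PurePosixPath(p).name for p in paths])


@dataclass
class Dataset:
    """A dataset record, keyed on an assumed PURE identifier."""
    pure_id: str
    title: str
    packages: List[Package] = field(default_factory=list)

    def add_package(self, package: Package) -> None:
        # A newer version of the data is simply appended as another package.
        self.packages.append(package)

    def latest_package(self) -> Optional[Package]:
        return self.packages[-1] if self.packages else None


if __name__ == "__main__":
    dataset = Dataset(pure_id="pure-123", title="Example dataset")
    dataset.add_package(Package.from_paths("aip-1", ["raw/run1/data.csv", "raw/readme.txt"]))
    print(dataset.latest_package().file_names)
```

Treating the most recently added package as the latest is a simplification made for the sketch; the slide does not say how versions are ordered.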

York p-o-c outputs
Code:
- https://github.com/digital-york/researchdatayork
- https://github.com/digital-york/dlibhydra
Data model (draft):
- https://docs.google.com/document/d/1QPw9kLqRFzI5aStRr3nlBqqj5BzkI3eFZY4zNjOmo8w/edit?usp=sharing

What next in York?
- Our RDM staff love the p-o-c and we have agreement to turn it into a production system over the autumn/winter
- This has been a helpful exercise for broader data modelling / Hydra implementation at York

Hull p-o-c implementation
- Hull is keen to make Archivematica part of a workflow for any type of repository content, not just research data
- You may have seen a poster on this at Hydra Connect last year
- Hull’s p-o-c implements most of the automated bulk ingest route, creates AIP(s) and builds repository objects from the DIP(s)

Hull p-o-c implementation
- User assembles files and simple descriptive file(s) in a Box folder and shares the folder with Archivematica
- System checks the folder contents and, if OK, creates a bag (BagIt standard) for each object, which is passed to Archivematica
- Archivematica processes the bag to create an AIP, which goes to a preservation store…
- …and also a DIP, which is passed to the DIP processor
- DIP processor creates Hydra objects from the DIP contents and injects them into the repository QA queue…
- …matched to the AIP by UUID
Thanks to Cottage Labs for all the new development work!
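The bagging step can be illustrated with a minimal sketch using only the Python standard library. It is not the Hull or Cottage Labs code: make_bag and verify_bag are hypothetical names, the bag follows the BagIt 1.0 layout (a bagit.txt declaration, a data/ payload directory and a SHA-256 manifest), and the version of BagIt used in the p-o-c is not stated in the slides.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def make_bag(source_dir: Path, bag_dir: Path) -> Path:
    """Copy every file under source_dir into a minimal BagIt bag at bag_dir."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dest = data_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        digest = hashlib.sha256(dest.read_bytes()).hexdigest()
        # Manifest format: checksum, whitespace, path relative to the bag root.
        manifest_lines.append(f"{digest}  data/{rel.as_posix()}\n")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "".join(manifest_lines), encoding="utf-8"
    )
    return bag_dir


def verify_bag(bag_dir: Path) -> bool:
    """Return True if every payload file listed in the manifest matches its checksum."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel = line.split(maxsplit=1)
        payload = bag_dir / rel
        if not payload.is_file():
            return False
        if hashlib.sha256(payload.read_bytes()).hexdigest() != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "deposit"
        source.mkdir()
        (source / "results.txt").write_text("example data", encoding="utf-8")
        bag = make_bag(source, Path(tmp) / "bag")
        print(verify_bag(bag))
```

The sketch reads each file fully into memory to hash it, which is fine for illustration but would want chunked reads for large research datasets.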

Hull p-o-c options
Depositors have several options:
- A folder containing multiple data files and one descriptive file → a single AIP and a single repository object with (optionally) one or more surrogate files for download (so can be a “metadata-only” record)
- A folder containing multiple files and a csv file (one row per file) → multiple AIPs with multiple repository objects, each with (optionally) a surrogate for download
- A folder containing the top-level folder of a structure → a zipped structure in a single AIP and a single repository object (optionally) containing the zipped file for download
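The three deposit routes amount to a decision made from the folder's contents, which can be sketched roughly as below. This is an assumption-laden illustration rather than the p-o-c's actual check: the function name, the returned labels and the two file names (description.txt and files.csv) are all hypothetical, since the slides do not say how the descriptive file or the csv file is recognised.

```python
import tempfile
from pathlib import Path

# Hypothetical file names; the slides do not name the real ones.
DESCRIPTION_FILE = "description.txt"
CSV_FILE = "files.csv"


def classify_deposit(folder: Path) -> str:
    """Guess which of the three deposit routes a shared folder represents."""
    entries = [p for p in folder.iterdir() if not p.name.startswith(".")]
    dirs = [p for p in entries if p.is_dir()]
    names = {p.name for p in entries if p.is_file()}

    # Route 3: the folder holds only the top-level folder of a structure.
    if len(dirs) == 1 and not names:
        return "zipped-structure"
    if dirs:
        return "unrecognised"

    data_files = names - {DESCRIPTION_FILE, CSV_FILE}
    if not data_files:
        return "unrecognised"
    # Route 2: data files plus a csv with one row per file.
    if CSV_FILE in names and DESCRIPTION_FILE not in names:
        return "multiple-objects"
    # Route 1: data files plus one descriptive file.
    if DESCRIPTION_FILE in names and CSV_FILE not in names:
        return "single-object"
    return "unrecognised"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "survey.dat").write_text("1,2,3", encoding="utf-8")
        (folder / DESCRIPTION_FILE).write_text("Survey data", encoding="utf-8")
        print(classify_deposit(folder))
```

Anything that does not fit one of the three shapes is reported as unrecognised rather than guessed at, in the spirit of the "checks folder contents and if OK" step in the workflow.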

What’s next in Hull?
- We hope to be able to take the p-o-c work and turn it into a production system
- Hull is the UK’s “City of Culture” next year and there will be a great deal of digital material that the University Archives want to capture for posterity

Digital Preservation Awards 2016
We’re flattered that the project has been named one of the three finalists in the “Research and Innovation” category of the Digital Preservation Awards this year

Project reports
- The project reports for “Filling the Digital Preservation Gap” are available through Hull’s repository, hydra.hull.ac.uk (search for the project title)
- The phase one and two reports are there now; the phase three report will follow by the end of the month

Thanks! Questions?
julie.allinson@york.ac.uk
r.green@hull.ac.uk
Hull github: https://github.com/orgs/uohull
Colleagues:
- Chris Awre (Interim Librarian, University of Hull) c.awre@hull.ac.uk
- Jenny Mitcham (Digital Archivist, University of York) jenny.mitcham@york.ac.uk
- Simon Wilson (University Archivist, University of Hull) s.wilson@hull.ac.uk
Archivematica: www.archivematica.org