Research Data Management, 26th April 2016. Federica Fina, Data Scientist, University of St Andrews Library.

Presentation transcript:

Research Data Management, 26th April 2016
Federica Fina, Data Scientist, University of St Andrews Library

Contents
o Where do we fit in?
o Funders’ requirements
o St Andrews RDM Policy
o Research Lifecycle
o Our datasets since June 2015
o The workflow and its gaps

Where do we fit in: Digital Research
o Research Computing
o Research Data
o Open Access

Funders’ requirements
Funders are: RCUK, EC (Horizon 2020), the Royal Society, and others. Applies to published results.
Data Management Plan (DMP):
o in this document, outline how you will comply; what you will share; where you will archive; agree with collaborators
Publications:
o acknowledge funder (grant number) and include a statement on how to access supporting data (where? under which conditions?)
Data:
o should be made publicly available
o should be retained - at least from last date of access
Costs:
o can be included in grant applications

St Andrews RDM policy
o […] raw and processed data, much of which is valuable and needs to be retained over the long term.
o […] data is managed according to good practices […] appropriate for the data and discipline concerned, and thereby to ensure compliance with the requirements of funding agencies and other stakeholders.
o Principal Investigators should, in the course of formulating their research proposals, develop a management plan for the data to be collected.
o Data will be deposited into an appropriate long-term archive.
o The University will maintain a central catalogue of metadata descriptions.
o The University expects that all data will over time be made publicly available wherever possible.

The Research Lifecycle
[Diagram labels: Research Activity; Active Storage; Archival Storage; Public Storage; Dispose]
Research Activity – the researchers’ interest is mainly focused here

Data management plan
[Lifecycle diagram: Plan, Create, Process, Preserve, Share, Reuse; Active Storage, Archival Storage, Public Storage]
o Funders require researchers to provide a range of information about their data:
   o In what file format will the data be created?
   o What data volume is expected?
   o How are they going to be stored, backed up and preserved?
   o When and how will the data be shared?
The DMP offers the chance to estimate the storage requirements early on.
Not yet common practice.
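The DMP questions above map naturally onto a small record per planned dataset, which makes the early storage estimate a simple sum. The following is a minimal sketch only: the class, the field names and the two example datasets are hypothetical and do not come from any funder template or from DMPonline.

```python
from dataclasses import dataclass


@dataclass
class PlannedDataset:
    """One dataset described in a DMP (all field names are illustrative)."""
    name: str
    file_format: str           # in what file format will the data be created?
    expected_volume_gb: float  # what data volume is expected?
    backup_copies: int         # how many extra back-up copies will be kept?
    shared_publicly: bool      # will the data be shared on publication?


def estimate_storage_gb(plan: list[PlannedDataset]) -> float:
    """Total storage: the working copy plus every back-up copy of each dataset."""
    return sum(d.expected_volume_gb * (1 + d.backup_copies) for d in plan)


# Hypothetical example plan, invented for illustration.
plan = [
    PlannedDataset("survey responses", "csv", 2.0, 1, True),
    PlannedDataset("microscopy images", "tiff", 150.0, 2, False),
]
print(estimate_storage_gb(plan))  # 2.0 * 2 + 150.0 * 3 = 454.0
```

Counting back-up copies in the total is a design choice for this sketch; a plan that relies on centrally backed-up storage would set backup_copies to 0.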

DMAOnline
o Online tool developed by Lancaster University
o Will allow estimate of storage and cost
o Dashboard view

How does the Library help
o Advice
o Advocacy
o Help writing DMPs
o DMPonline – online tool with templates developed by the Digital Curation Centre (DCC)
o Research Computing – sustainable software

Create and process: active storage
Centralised storage
o University’s policy: 0.5 TB for each member of staff on an academic contract; each additional 1 TB costs £1000
o SAN, backed up daily. Some researchers have started using this storage as a back-up
De-centralised storage (the reality)
o USB devices, external hard disks, laptops, PCs, instrument servers, cloud storage
o Data fragmented on different devices/servers
o Often 1 copy in 1 location
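The centralised storage policy above reduces to a short calculation. This sketch makes three assumptions the slide does not state: that 1 TB means 1000 GB, that additional storage is charged in whole 1 TB blocks, and that the £1000 is a single charge rather than an annual one. The function name is hypothetical.

```python
FREE_ALLOWANCE_GB = 500   # 0.5 TB per member of staff on an academic contract
BLOCK_SIZE_GB = 1000      # additional storage priced per 1 TB (assumed 1000 GB)
BLOCK_PRICE_GBP = 1000    # £1000 per additional 1 TB


def extra_storage_cost_gbp(required_gb: int) -> int:
    """Cost of storage beyond the free allowance, charged in whole 1 TB blocks."""
    extra_gb = max(0, required_gb - FREE_ALLOWANCE_GB)
    blocks = -(-extra_gb // BLOCK_SIZE_GB)  # ceiling division on integers
    return blocks * BLOCK_PRICE_GBP


print(extra_storage_cost_gbp(500))   # within the allowance: 0
print(extra_storage_cost_gbp(1500))  # one additional block: 1000
print(extra_storage_cost_gbp(1501))  # spills into a second block: 2000
```

A figure like this is what could be carried into a grant application when additional storage is costed.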

Issues
o No standard approach, data loss
o No realisation that the data are the real asset of research
o “The student left, I don’t know where the data are”; “Our data is saved on many different instruments’ servers, difficult to estimate the current total volume”
Cultural change required
Additional storage can be costed in grant applications (funders say)

Solutions
o Library and IT Services are working together and looking at:
   o Storage solutions (ITS expertise)
   o Data management systems, e.g. Electronic Laboratory Notebook (ELN)
ELN in St Andrews (project in progress)
o LabArchives - flexible and not discipline-specific
o Pilot in discussion – Chemistry, Physics, Biology, Computer Science
o Cloud storage – currently in the US but, if there is enough demand, a UK/EU server
o What about sensitive data? (Psychology raised concerns at this stage)
o Local storage option

Preserve: archival storage
Data can be deposited in Pure
Not a proper preservation process
o Long-term storage
o Back-up
o No file format migration
Not a systematic process
At the moment only for data underpinning published results – if funder has a data policy
What about the other data?

Pure
o Current Research Information System (CRIS)
o Research Outputs catalogue, Research Data catalogue and repository
o Research data stored on local servers

What do we keep?
o Data underpinning publication
o Models, big volume of processed data?
o Codes and parameters only?
o Who should decide?
What formats?
o Open or proprietary formats? Some data have no use in an open format
o EPSRC: “researchers are clearly responsible for the quality of data and supporting documentation” and “those requesting access to data responsible for: access to proprietary third party software”

Share and reuse: public storage
Once in Pure, the data can be made publicly available on the University Research Portal.
We will issue a DOI.


Our datasets since June 2015
Dataset records in Pure:
o 71 dataset records with files and active St Andrews’ DOIs = 18 GB
o 32 datasets in progress
o 32 datasets deposited elsewhere but with metadata in Pure

The workflow and its gaps
[Workflow diagram labels: Research activity; Active; Archival Private (105 GB); Archival Public (528 GB); St Andrews Research Portal; User; flows: Metadata, Files, Move]

Gap 1: data management and active storage

Issues - gaps
o Researchers save data on their personal devices
o Lack of back-up
o Lack of proper data management
o Data are often distributed in different locations (fragmented)
o Difficult to estimate active storage required
o Cost
Questions
o Could ELNs help?
o Maybe a shared, affordable, on-demand active storage? Feasible?
o Large files?

Gap 2: data deposit

Issues - gaps
o Still a manual process
o Researchers find it challenging to group all the data for deposit: “The student left, I do not know where the data are”
o Large files?
o Software? Research Computing offer assistance
Solutions?
o Simplify the deposit process for researchers
o Provide digital tools during the research activity phase to help track data
o Integrate existing digital solutions to automate the deposit → NOMAD-Pure integration
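One lightweight way to help track data during the research activity phase is to keep a manifest of a project folder, so the files can be grouped and checked at deposit time. The sketch below is an illustration only, not a tool used at St Andrews and not part of Pure or NOMAD; the function names and the manifest fields are hypothetical, and it uses only the Python standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> list[dict]:
    """List every file under root with its relative path, size and checksum."""
    manifest = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return manifest
```

Summing the "bytes" field gives the total volume to deposit, and the checksums let a later copy be compared against the original.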

NOMAD – NMR Online Management And Datastore
Dr Tomas Lebl, Senior Scientific Officer for Solution State NMR

NOMAD-Pure
o Ongoing project: Chemistry, Computer Science, Library and IT
o NMR is heavily used in Chemistry
o Aim: streamline the deposit of NMR data for funders’ compliance
[Diagram labels: NOMAD server; File 1, File 2, File 3, …; Copy; NOMAD public server; NMR public portal; Link added to Pure record]

Gap 3: preservation

Preservation requires:
o An archival and an access copy
o Format migration
Solutions?
o Have an additional layer for preservation? For example Arkivum?
o Jisc RDM Shared Service Pilot
[Diagram labels: Files; St Andrews Research Portal; Preservation Layer; Archival Private (105 GB); Archival Public (528 GB); Move; Copy]

Jisc RDM Shared Service Pilot

o Looks at the preservation layer, not at the ‘live’ active storage
o “Will create a user-friendly research data ingest and publication platform […] and enable long-term archiving of research data”

Thank you for listening
o How do we deal with large files?
o How can we fill the Active Storage gap?
o A Scottish shared service?
Federica Fina, Anna Clements, CC BY