An Introduction to Data Management. University of the Arts London, 19 March 2014. Jonathan Rans, Digital Curation Centre, University of Edinburgh.
Overview: 1. Definitions; 2. National drivers; 3. Institutions’ responses; 4. What support is available?
The Digital Curation Centre: The DCC (est. 2004) is a UK centre of expertise in digital preservation, with a particular focus on research data management (RDM). It is based across three sites: the Universities of Edinburgh, Glasgow and Bath. It works with a number of UK universities to identify gaps in RDM provision and raise capabilities across the sector. It is also involved in a variety of national and international collaborations…
What is research data management? “the active management and appraisal of data over the lifecycle of scholarly and scientific interest”. The DataONE lifecycle model: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze; SHARE …and RE-USE. Data management is a part of good research practice (RCUK).
Why is it a growing concern? Developments in sensor technology, networking and digital storage enable new research and scientific paradigms. As costs also fall, possibilities for data sharing, citation and re-use become much more widespread. Research funders and publishers recognise the value of this and now tend to have greater expectations of the research that they support…
What are the benefits? PRESERVATION: Much data is unique and can only be captured once. If lost, it is irreplaceable. EFFICIENCY: Data collection can be funded once, and the data used many times for a variety of purposes. TRANSPARENCY: The data that underpins research can be made open for anyone to scrutinise and to attempt to replicate the findings. RISK MANAGEMENT: A proactive approach to data management reduces the risk of inappropriate disclosure of sensitive data, whether commercial or personal.
So what is ‘data’ exactly? Definitions vary from discipline to discipline, and from funder to funder… Here’s a science-centric definition: “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” (US Office of Management and Budget, Circular A-110). And another from the visual arts: “Evidence which is used or created to generate new knowledge and interpretations. ‘Evidence’ may be intersubjective or subjective; physical or emotional; persistent or ephemeral; personal or public; explicit or tacit; and is consciously or unconsciously referenced by the researcher at some point during the course of their research.” (Leigh Garrett, KAPTUR project: see http://kaptur.wordpress.com/2013/01/23/what-is-visual-arts-research-data-revisited/)
Data in the Arts and Humanities: Some characteristics of Arts and Humanities data are likely to require a different kind of handling from that given to data in other disciplines. They are often personal and may not be factual in nature. Furthermore, they may be quite valuable or precious to their creator. Digital data in the Arts is as likely to be an outcome of the creative research process as an input to a workflow. Event resources: http://www.dcc.ac.uk/events/research-data-management-forum-rdmf/rdmf10-research-data-management-arts-and-humanities and http://www.digital.hss.ed.ac.uk/archive-events/201314-events/managing-humanities-research-data/
Five years of front pages… Nature, 09/08; Economist, 02/10; Popular Science, 11/11; Science, 02/11; Nature, 09/09; ACM, 12/08; InformationWeek, 08/10; Computerworld, 11/12.
Open Data: Open Data is a philosophy, underpinned by pragmatism… transparency + utility. “Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control.” (Wikipedia) Governments, cities and others are all getting on board. The Open Knowledge Foundation is basically the political / activist wing: http://okfn.org/ From the government / industry side, we have the Open Data Institute: http://theodi.org/
What do funders have to say? (i) Seven “Common Principles on Data Policy”: Data as a public good; Preservation; Discovery; Confidentiality; Right of first use; Recognition; Public funding for RDM. Six of the seven RCUK councils require data management plans, or equivalent, at the application stage. The seventh (EPSRC) requires nothing short of an institutional data infrastructure.
What do funders have to say? (ii) AHRC requires that significant electronic resources or datasets are made available in an accessible repository for at least three years after the end of the grant. Applicants submit a statement on data sharing in the relevant section of the Je-S form, and provide a two-page data management and sharing plan addressing nine distinct themes. Datasets must be offered to the UK Data Archive on conclusion of the project.
Components of an RDM service: http://www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services
Data management planning support: Requirements are written into institutional policy. Research office, IT and Library provide support. DCC’s DMPonline is free for researchers; institutions can use the tool to deliver local support; UAL has an institutional template. http://www.dcc.ac.uk/resources/developing-rdm-services/dmps-arts-and-humanities
Active storage: Institutions investing in managed storage for active data are making substantial amounts available free (figures shown on slide: 5 TB, 1 TB, 0.5 TB). Institutional collaborative platforms. http://www.dcc.ac.uk/blog/defining-institutional-data-storage-requirements
Selection and deposit
1. Relevance to Mission – including any legal/funder requirement to retain the data beyond its immediate use.
2. Scientific or Historical Value – significance and relationship to publications etc.
3. Uniqueness – can it be found elsewhere / if we don’t preserve it, who will?
4. Potential for Redistribution – quality / IP / ethical concerns are addressed.
5. Non-Replicability – either impossible to replicate (e.g. atmospheric or social science data) or not financially viable.
6. Economic Case – costs of managing and preserving the resource stack up well against potential future benefits.
7. Full Documentation – surrounding / contextual information necessary to facilitate future discovery, access, and reuse is adequate.
Source: How to Appraise & Select Research Data for Curation, Angus Whyte, Digital Curation Centre, and Andrew Wilson, Australian National Data Service (2010).
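The seven criteria above can be tracked as a simple checklist when appraising a dataset. The following is only a minimal sketch in Python: the `Appraisal` class, its method names and the yes/no recording are assumptions made for illustration, and are not part of the DCC guide or of any DCC tool. The guide itself does not prescribe a scoring scheme.

```python
from dataclasses import dataclass, field

# The seven criteria named on the slide (Whyte and Wilson, 2010).
CRITERIA = (
    "Relevance to Mission",
    "Scientific or Historical Value",
    "Uniqueness",
    "Potential for Redistribution",
    "Non-Replicability",
    "Economic Case",
    "Full Documentation",
)


@dataclass
class Appraisal:
    """Hypothetical record of one dataset's appraisal (illustrative only)."""

    dataset: str
    # Maps a criterion name to True (met) or False (not met).
    answers: dict = field(default_factory=dict)

    def record(self, criterion: str, met: bool) -> None:
        """Store the outcome for one criterion, rejecting unknown names."""
        if criterion not in CRITERIA:
            raise ValueError(f"Unknown criterion: {criterion}")
        self.answers[criterion] = met

    def outstanding(self) -> list:
        """Criteria not met or not yet assessed, in slide order."""
        return [c for c in CRITERIA if not self.answers.get(c, False)]


# Example use with a made-up dataset name.
appraisal = Appraisal("interview-recordings")
appraisal.record("Uniqueness", True)
appraisal.record("Full Documentation", False)
print(appraisal.outstanding())
```

Listing the outstanding criteria, rather than computing a score, keeps the decision with the person doing the appraisal; the checklist only shows what still needs attention.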
Data repositories: http://datashare.is.ed.ac.uk; www.dspace.cam.ac.uk/; Essex-RDR and DataPool at Southampton. Not intended to replace national, subject or other established data collections. Acknowledge hybrid environment. http://www.researchdata.arts.ac.uk/
Data Catalogues: DataFinder (Oxford); Researcher Dashboard (Lincoln); UK Research Data Registry (DCC and Jisc).
Disciplinary support services: There may be scope for centres with a specific disciplinary focus to provide tailored support.
i. DCC resources. Publications: Briefing Papers and How-To Guides. Training: e.g. DC101 events and Curation Reference Manual. Advice: e.g. disciplinary metadata, www.dcc.ac.uk/resources/metadata-standards. Events: International Digital Curation Conference (next one in London, February 2015); Research Data Management Forum (next one TBC, but always held in UK). Tools: DMPonline, CARDIO, Data Asset Framework, DRAMBORA.
ii. UAL resources. DCC and UAL ran an institutional engagement between 2011 and 2013, which developed… A data management guidance web area: http://www.arts.ac.uk/research/research-environment/research-management/data-management/ An institutional policy: http://www.arts.ac.uk/media/research/documents/UAL-Research-Data-Management-Policy.pdf A UAL data management planning template: http://dmponline.dcc.ac.uk UAL was involved in the KAPTUR project: http://kaptur.wordpress.com
Thank you. Any questions? This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License. For more about DCC services see www.dcc.ac.uk or follow us on Twitter @digitalcuration and #ukdcc. Jonathan Rans, Digital Curation Centre, University of Edinburgh. J.Rans@ed.ac.uk @JNRans