Starting from the end: what to do when restricted data is released


1 Starting from the end: what to do when restricted data is released
Dr Marta Teperek
Office of Scholarly Communication, University of Cambridge
@martateperek
SciDataCon 2016, Monday 12 September 2016

2 Slides are available at:

3 Content
This session will cover:
Background to the Cambridge research repository
The incident: the release of restricted data
What we did
Workflow development
Lessons learnt

4 About the Cambridge research repository
Created in 2005 – joint project with MIT
Hosting ALL research outputs (problems!)
Over 200,000 research outputs!!!
Articles, Theses, Datasets, Software, Videos, Book Chapters, Presentations…

5 About the Cambridge research repository
Created in 2005 – joint project with MIT
Hosting ALL research outputs (problems!)
Quite popular: 20,772 visits between 12 August and 11 September

6 About the Cambridge research repository
Created in 2005 – joint project with MIT
Hosting ALL research outputs (problems!)
Quite popular
Mints DOIs for datasets

7 Advocacy + easy to share process => lots of data shared
2015 to 2016: in a little over a year, 10 times more data submissions than in the preceding decade

8 About the Cambridge research repository
Created in 2005 – joint project with MIT
Hosting ALL research outputs (problems!)
Quite popular
Mints DOIs for datasets
But: managed access to data is currently not provided
More and more requests for managed access to data
Currently scoping how to provide managed access to data

9 We tell researchers to go somewhere else…

10 The incident
Restricted dataset shared by a Cambridge researcher was released

11 What was released
Externally-held dataset
Dataset protected by:
  Pre-publication embargo
  License agreement specifying the re-use conditions
The dataset was not yet complete
The researcher was informed weeks after the repository noticed the error
The dataset had been downloaded several times

12 Time to act
We had to act:
  Provide the researcher with appropriate support and advice on how to proceed
  Document the steps:
    Develop workflows for dealing with this type of situation in the future
    Community resource
Never blame the repository:
  Hosting personal/sensitive data will always present an element of risk

13 Risk assessment
Three types of risks:
Risks to study participants
Risks to the researcher
Reputational risks

14 Risk to study participants
Can participants be re-identified?...
The risk can never be eliminated; it can only be managed
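One common way to put a number on the re-identification question is a k-anonymity check: group records by their quasi-identifiers and look at the size of the smallest group. The sketch below is an illustration only and is not taken from the talk; the function name `smallest_group_size`, the field names and the toy records are all hypothetical.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return k for k-anonymity: the size of the smallest group of
    records that share the same values for all quasi-identifiers.
    Returns 0 for an empty dataset."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical toy data; field names are assumptions for illustration.
participants = [
    {"age_band": "30-39", "postcode_area": "CB1", "diagnosis": "A"},
    {"age_band": "30-39", "postcode_area": "CB1", "diagnosis": "B"},
    {"age_band": "40-49", "postcode_area": "CB2", "diagnosis": "A"},
]

k = smallest_group_size(participants, ["age_band", "postcode_area"])
print(f"k = {k}")  # k = 1: one participant is unique on these fields
```

A result of k = 1 means at least one participant is unique on the chosen fields and could be singled out by anyone who knows those values. A larger k lowers the risk but does not remove it, which matches the point that the risk can only be managed.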

15 Risk to the researcher
Being scooped
Publishers might refuse to publish this work
Re-users might be misled by incomplete data

16 Reputational risks
What if participants are re-identified and the information is released into the public domain?
Threat to future research funding

17 Risk mitigation
Contact those who downloaded the data…
…impossible…
…only IP addresses available

18 Mitigating the risk to study participants
Low risk of re-identification
Informed the study administrator at the Research Office
Inform the ethics committee

19 Mitigating the risk to the researcher
Letters to publishers

20 Mitigating reputational risks
Contacting the funder of research

21 Workflow establishment

22 Lessons learnt
Transparency and open communication are necessary to build trust and understanding

23 Lessons learnt – better guidance needed
We already offer:
  Workshops on Research Integrity and Ethics
  Workshops on Research Data Management
  Online training
  Guidance on creating consent forms
Missing: data anonymisation guidance
Risks constantly evolve:
  new datasets become available
  new computational tools to link data

24 Thank you
Questions: @martateperek @CamOpenData
Watch out for our paper in the Data Science Journal
Slides:

