Open data in the social sciences, conundrum or feasible?


1 Open data in the social sciences, conundrum or feasible?
Veerle Van den Eynden
UK Data Service, University of Essex
Digital Research Conversation, University of Birmingham
7 November 2018

2 UK Data Service
- Curate, preserve, provide access to social science data for reuse
- Funded by ESRC
- Data management advice for data creators
- Support for users of the service
- Information about the use to which data are put
- ukdataservice.ac.uk

We put together a collection of the most valuable data and enhance it over time. We preserve data for the long term to make them available for reuse. We also provide data management advice, because we recognise its importance for high-quality research data and research excellence, and because research funders generally require data management and sharing plans at the start of research projects, so we like to be able to assist researchers with that. We provide user support, and we encourage users to send queries via our helpdesk as it is the most efficient way of dealing with them.

3 Some statistics about the UK Data Service
- 7,277 datasets in the collection
- 1,034 qualitative and mixed methods collections
- 400 new datasets added each year
- 219 case studies of data reuse
- 25,000 registered users
- 60,000 downloads worldwide per year
- 4,000+ user support queries per year

4 Research data services team
Supporting researchers to make research data shareable
- UK Data Service helps put into practice the data policy of the Economic and Social Research Council (ESRC)
- Data management planning advice and guidance
- Data management guidance and training, especially on confidentiality, security and ethics
- Research data available for reuse to the maximum extent possible, via the ReShare repository

5 Research with people – can data be open data?
Research data can be (alpha)numerical (e.g. survey), textual (e.g. interview), video (e.g. focus group), audio (e.g. interview) or image (e.g. ethnography).
Research data can contain:
- Personal information: information that discloses the identity of the person, e.g. tax reference number, address, photo
- Confidential information: information 'promised' to be kept confidential, through a confidentiality agreement, a statement in the consent form or a mutual (tacit) understanding, e.g. employment history, business information
- Sensitive information: information to be protected against unwanted disclosure, e.g. sexual preferences, health information
This information can be about the person studied, or about third parties.

6 Share data obtained from people in a legal and ethical way
- Informed consent or permission to share, especially if there is identifying, confidential or sensitive information
- Protect identities through anonymisation or by not collecting personal information
- Regulate access where needed

7 A bit of ethical and legal background
- Research with human participants requires ethical review (Research Ethics Committee)
- Ethics = do no harm, uphold scientific standards
- Data sharing is NOT a violation of data privacy or research ethics

8 Key principles of research ethics
- Research should aim to maximise benefit for individuals and society and minimise risk and harm
- The rights and dignity of individuals and groups should be respected
- Wherever possible, participation should be voluntary and appropriately informed
- Research should be conducted with integrity and transparency
- Lines of responsibility and accountability should be clearly defined
- Independence of research should be maintained, and where conflicts of interest cannot be avoided they should be made explicit
Source: ESRC Framework for Research Ethics

9 Duty of confidentiality and data sharing
- A duty of confidentiality exists in UK common law and may apply to research data
- Information given in circumstances where a duty of confidence is expected to apply cannot normally be disclosed without the information provider's consent
- Disclosure of confidential information is lawful when:
  - the individual to whom the information relates has consented (consent for data sharing)
  - disclosure is necessary to safeguard the individual, or others, or is in the public interest
  - there is a legal duty to do so, for example a court order
- The duty need not be explicit, and need not be in writing

10 The General Data Protection Regulation (GDPR)
- In UK law: Data Protection Act 2018
- Applies to 'personal data': any information relating to a living person who can be directly or indirectly identified, in particular by reference to an identifier
- Anonymised data are NOT personal data, so the GDPR does NOT apply
- Applies to:
  - any EU researcher (data controller) who collects personal data about a citizen of any country, anywhere in the world
  - a data controller or data processor based outside the EU but collecting personal data on EU citizens

11 The GDPR principles for processing personal data
1. Process lawfully, fairly and transparently: inform the participant of what will be done with the data, and process accordingly
2. Keep to the original purpose: collect data for specified, explicit and legitimate purposes; do not process further in a manner incompatible with those purposes
3. Minimise data size: personal data collected should be adequate, relevant and limited to what is necessary
4. Uphold accuracy: personal data should be accurate and kept up to date
5. Remove data that are not used
6. Ensure data integrity and confidentiality: protect against unauthorised or unlawful processing, accidental loss, destruction or damage, using appropriate technical or organisational measures

12 Data subject rights
- The right to be informed
- The right of access
- The right to rectification (correction)
- The right to erasure (right to be forgotten)
- The right to restrict processing
- The right to data portability
- The right to object
- Rights in relation to automated individual decision-making and profiling

13 Grounds for processing personal data
One of these must be present to process a data subject's personal data:
- Consent of the data subject
- Necessary for the performance of a contract
- Legal obligation placed upon the controller
- Necessary to protect the vital interests of the data subject
- Carried out in the public interest or in the exercise of official authority
- Legitimate interest pursued by the controller

14 The GDPR research exemption
- Further processing for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes is not considered incompatible with the initial purposes
- Appropriate safeguards are required, e.g. data minimisation, pseudonymisation
- Principles 2 and 5 are less strict:
  - Purpose: further processing of personal data is allowed (2)
  - Storage: personal data may be stored for longer periods (5)

15 Share data obtained from people in a legal and ethical way
- Informed consent or permission to share, especially if there is identifying, confidential or sensitive information
- Protect identities through anonymisation or by not collecting personal information
- Regulate access where needed

16 Consent is needed across the data life cycle
- Engagement in the research process
- Dissemination in presentations, publications, the web
- Data sharing, archiving and future reuse of data
Always dependent on the research context: special cases for covert research, verbal consent, etc.
UKDS template consent form

17 Consent and data sharing
The best way to achieve informed consent for data sharing is to identify and explain the possible future uses of the participant's data, and to offer the participant the option to consent at a granular level.
Consent applies across the research lifecycle, e.g. for each new data collection in a longitudinal study.
Examples:
- in a multi-modal study, allow the participant to consent (or not) separately to data sharing for the various data collection events, e.g. survey, clinical assessment, ...
- in a qualitative study, allow the participant to consent (or not) separately to data sharing of anonymised transcripts, non-anonymised audio recordings, photographs, ...
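
Granular consent can be recorded as one yes/no decision per data item rather than a single flag per participant. The following is a minimal sketch of that idea, not a UKDS tool: the class name, field names and item labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    """Per-participant sharing decisions, one entry per data item (hypothetical model)."""
    participant_id: str
    sharing: dict = field(default_factory=dict)  # item label -> True/False

    def record(self, item, agreed):
        # Store the participant's decision for a single data item.
        self.sharing[item] = agreed

    def shareable_items(self):
        # Only items with an explicit "yes" are shareable; anything
        # never asked about is treated as not shareable.
        return sorted(item for item, agreed in self.sharing.items() if agreed)


consent = ConsentRecord("P001")
consent.record("anonymised_transcript", True)
consent.record("audio_recording", False)
consent.record("photographs", True)
print(consent.shareable_items())
```

Treating a missing decision as "not shareable" is a deliberately cautious default chosen for this sketch.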

18 Timing and form of consent
Timing of consent

One-off consent: the participant is asked to consent to taking part in the research project only once.
  Advantages: simple; least hassle to participants
  Disadvantages: research outputs are not known in advance; participants will not know all the information they will contribute

Process consent: the participant's consent is requested continuously throughout the research project.
  Advantages: ensures 'active' consent
  Disadvantages: may not get all the consent needed before losing contact; repetitive, can annoy participants

Form of consent

Written consent (no legal requirement)
  Advantages: more solid legal ground, e.g. the participant has agreed to disclose confidential information; often required by Ethics Committees; offers more protection for the researcher (who has written documentation of consent)
  Disadvantages: not possible in some cases (infirm participants, illegal activities); may scare people from participating (or lead them to think that they cannot withdraw their consent)

Verbal consent
  Advantages: best if recorded
  Disadvantages: can be difficult to make all issues clear verbally; possibly greater risks for the researcher (with regard to adequately proving participant consent)

19 Anonymising quantitative data
Direct and indirect identifiers
- Remove direct identifiers, e.g. names, address, institution, photo
- Reduce the precision or detail of a variable through aggregation, e.g. birth year instead of date of birth, occupational categories rather than jobs, area rather than village
- Generalise the meaning of a detailed text variable, e.g. occupational expertise
- Restrict the upper or lower ranges of a variable to hide outliers, e.g. income, age
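
As an illustration only, three of these steps (removing direct identifiers, aggregating, and restricting the upper range) might look as follows on a small set of survey records. All records, field names, the village-to-area lookup and the income cap are invented for this sketch; real thresholds depend on the dataset and its disclosure risk.

```python
from datetime import date

# Hypothetical survey records; every value here is invented for illustration.
records = [
    {"name": "A. Smith", "address": "1 High St", "date_of_birth": date(1950, 3, 14),
     "village": "Wivenhoe", "income": 250000},
    {"name": "B. Jones", "address": "2 Low Rd", "date_of_birth": date(1987, 11, 2),
     "village": "Selly Oak", "income": 31000},
]

DIRECT_IDENTIFIERS = {"name", "address"}
VILLAGE_TO_AREA = {"Wivenhoe": "Essex", "Selly Oak": "West Midlands"}  # assumed lookup
INCOME_CAP = 100000  # assumed top-coding threshold


def anonymise(record):
    """Return an anonymised copy of one record; the input is left unchanged."""
    # Remove direct identifiers.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Reduce precision: keep the birth year instead of the full date of birth.
    out["birth_year"] = out.pop("date_of_birth").year
    # Aggregate geography: area rather than village.
    out["area"] = VILLAGE_TO_AREA.get(out.pop("village"), "Other")
    # Restrict the upper range (top-coding) to hide high-income outliers.
    out["income"] = min(out["income"], INCOME_CAP)
    return out


anonymised = [anonymise(r) for r in records]
for row in anonymised:
    print(row)
```

Indirect identifiers can still combine to single someone out, so a pass like this is a starting point rather than a guarantee of anonymity.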

20 Anonymising qualitative data
- Remove direct identifiers, or replace them with pseudonyms; they are often not essential research information
- Avoid blanking out; use pseudonyms or replacements
- Identify replacements with [brackets]
- Plan or apply editing at the time of transcription
- Avoid over-anonymising: removing information from text can distort the data and make them unusable, unreliable or misleading, so balance anonymisation with the need to preserve context
- Ensure consistency within the research team and throughout the project
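
A shared replacement table is one way to keep pseudonyms consistent across a team and to mark every replacement with brackets. The sketch below assumes such a table exists; the names, the replacements and the sample sentence are invented.

```python
import re

# Hypothetical replacement table agreed by the research team.
REPLACEMENTS = {
    "Margaret Hill": "[Participant A]",
    "Colchester General": "[local hospital]",
}


def pseudonymise(text, replacements=REPLACEMENTS):
    """Replace each listed identifier with its bracketed replacement."""
    # Longest identifiers first, so a full name is handled before any shorter overlap.
    for original in sorted(replacements, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(original) + r"\b")
        # A function replacement inserts the text literally (no backslash handling).
        text = pattern.sub(lambda m, r=replacements[original]: r, text)
    return text


transcript = "Margaret Hill said she was treated at Colchester General last year."
print(pseudonymise(transcript))
```

Exact-match replacement only catches identifiers already in the table, so transcripts still need a manual read for nicknames, misspellings and contextual clues.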

21 Managing access to data at UKDS
UK Data Service Access Policy has three tiers:
- Open data: available for download or online access under an open licence (e.g. CC), without any registration
- Safeguarded data: not personal, but with a disclosure risk if linked
  - available for download or online access to logged-in users who have registered and agreed to an End User Licence (e.g. not to identify any potentially identifiable individuals)
  - special agreements (depositor permission; approved researcher)
  - embargo for a fixed time period
- Controlled data: may be identifiable
  - only available to accredited users: authorised and authenticated users whose research proposal has been approved and who have received training
  - accessed via remote or safe room access, i.e. an onsite or virtual secure environment (secure lab)
Managed access is essential when anonymisation is ineffective or damaging, e.g. for visual or audio data or disclosive microdata.
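
The three tiers can be read as increasingly strict access checks. The sketch below is a simplified model for illustration, not the UK Data Service's actual system: the type and field names are hypothetical, and special agreements and embargoes are left out.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    OPEN = "open"
    SAFEGUARDED = "safeguarded"
    CONTROLLED = "controlled"


@dataclass
class User:
    registered: bool = False             # has a user account
    agreed_eul: bool = False             # has agreed to the End User Licence
    accredited: bool = False             # proposal approved and training received
    in_secure_environment: bool = False  # working in a safe room or secure lab


def may_access(user, tier):
    """Return True if this user meets the conditions of the given tier."""
    if tier is Tier.OPEN:
        return True  # no registration needed
    if tier is Tier.SAFEGUARDED:
        return user.registered and user.agreed_eul
    # Controlled data: accredited users only, and only inside a secure environment.
    return user.accredited and user.in_secure_environment


visitor = User()
registered_user = User(registered=True, agreed_eul=True)
print(may_access(visitor, Tier.OPEN))
print(may_access(registered_user, Tier.SAFEGUARDED))
print(may_access(registered_user, Tier.CONTROLLED))
```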

22 So can social science data be open data?
Yes, when:
- the data are anonymous (beware: full anonymisation can be difficult)
- people have given permission (verbal or written) to disclose identifying information, e.g. oral history stories

23 Open about data with restricted access
Publish:
- Which data exist
- Where data are kept, e.g. which repository
- Who can access them
- For which purpose
- Under which conditions

24 Data at UKDS
- Open data: 553 datasets
- Safeguarded data: 6,568 datasets
- Controlled data: 151 datasets

25 Examples

27 Veerle Van den Eynden

