Why do I need to know about data management?


1 Why do I need to know about data management?
Dr Richard R. Plant & Dr Andrew Thompson
Data Management Planning & Storage for Psychology project

2 Mission statement
The DMSPpsych project will establish a culture of data management planning, archiving, and ongoing reuse of data acquired as a result of psychological research within the Department of Psychology at The University of Sheffield. We recognise it is often difficult and time-consuming for the individual, research group or even department to follow a coordinated approach, especially where there are no local or discipline-specific exemplars to follow. By tackling these issues at a grass-roots level, on a one-to-one basis, we hope to provide support and foster an atmosphere of collaboration with regard to data management. 24/05/2019 © The University of Sheffield

3 Why?
- You will no longer be able to apply for funding unless you have a data management plan, take better care of your data and ultimately share it
- Increase your citations by at least 69%
- Increase your chances of further funding and collaboration
- “Backstop” your research papers: journals are likely to request datasets (Stapel fraud)
- You might want to reuse your own data!
- Universities need a better organizational memory
- Good for science, good for UK PLC

4 Cognitive dissonance?

5 Ahhh, that’s better

6 That looks bad!

7 Fail to plan, plan to fail
These pictures were taken by Harvey Rutt

8 Any gamblers in the house?
Anyone in the audience willing to let me smash up their main work laptop, PC or Mac with a hammer for:
- £5,000 cash in used notes (no waiting around)
- A brand new identical laptop, PC or Mac, instantly, for free. You know what, I’ll even upgrade you!
- No comeback: you can even have a go on the hammer (think how good that’d feel)
Limited time offer. Right here, right now…

9 Anti-patterns
In software engineering, an anti-pattern is a pattern that may be commonly used but is ineffective and/or counterproductive in practice. The term was coined in 1995 by Andrew Koenig and popularized three years later by the book AntiPatterns, which extended its use into general social interaction.
At least two key elements must be present to formally distinguish an actual anti-pattern from a simple bad habit, bad practice, or bad idea:
- Some repeated pattern of action, process or structure that initially appears to be beneficial, but ultimately produces more bad consequences than beneficial results, and
- A refactored solution exists that is clearly documented, proven in actual practice and repeatable
By formally describing repeated mistakes, one can recognize the forces that lead to their repetition and learn how others have refactored themselves out of these broken patterns.

10 Data loss will happen to you
As surely as death and taxes: when & how
Not just catastrophic events you should worry about:
- Dropping your laptop
- Hard drive failures
- Software updates
- Obsolescence/upgrades
- Poorly described data (metadata)
- Theft of equipment
- People move on
- Research trends (follow the money consequences)
- Overwriting data/versioning
- File formats
- Media degradation (CD-Rs, memory sticks, SSDs)
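Two of the risks above, overwriting data and media degradation, can be reduced with very little tooling. The following is a minimal sketch using only the Python standard library, not a method taken from the presentation: the function names `save_version` and `is_intact` and the `.sha256` sidecar file are illustrative assumptions. It keeps every saved copy under a new timestamped name and records a checksum so that silent corruption can be detected later.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_version(source: Path, archive_dir: Path) -> Path:
    """Copy `source` into `archive_dir` under a new timestamped name.

    Existing copies are never overwritten: if the name is taken,
    a counter is appended. A sidecar `.sha256` file (a hypothetical
    convention) stores the checksum taken at copy time.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = archive_dir / f"{source.stem}_{stamp}{source.suffix}"
    counter = 1
    while target.exists():
        target = archive_dir / f"{source.stem}_{stamp}_{counter}{source.suffix}"
        counter += 1
    shutil.copy2(source, target)
    Path(str(target) + ".sha256").write_text(sha256_of(target) + "\n")
    return target


def is_intact(archived: Path) -> bool:
    """Re-hash an archived copy and compare it with the recorded checksum."""
    recorded = Path(str(archived) + ".sha256").read_text().strip()
    return sha256_of(archived) == recorded
```

Running `is_intact` periodically over an archive, and after every move to new storage media, gives an early warning that a copy has degraded while another good copy may still exist.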

11 Show me the money
All UK research councils now require that a data management plan be submitted with all new funding bids. Odds are you already need to do this if you have a grant!

13 NSF annual budget of about $6.9 billion (2010)
Funding source for 20% of federally supported research by America's colleges and universities
Beginning January 18, 2011, plans for data management and sharing of the products of research are a requirement.
Proposals must include a supplementary document of no more than two pages labelled “Data Management Plan”. This supplement should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results (see AAG Chapter VI.D.4), and may include:
- the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project;
- the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies);
- policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements;
- policies and provisions for re-use, re-distribution, and the production of derivatives; and
- plans for archiving data, samples, and other research products, and for preservation of access to them.
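The five elements listed above can double as a drafting checklist. The sketch below is only an illustration: the short section labels and the `missing_sections` helper are assumptions made for this example, not NSF terminology or an NSF tool, and the guidance says a plan "may include" these elements rather than requiring all of them.

```python
# Short labels (illustrative, not official NSF wording) for the five
# elements a Data Management Plan may include.
NSF_DMP_SECTIONS = (
    "types of data and materials produced",
    "data and metadata standards",
    "access and sharing policies",
    "re-use and re-distribution provisions",
    "archiving and preservation plans",
)


def missing_sections(plan: dict[str, str]) -> list[str]:
    """Return the checklist sections that are absent or blank in `plan`."""
    return [
        section
        for section in NSF_DMP_SECTIONS
        if not plan.get(section, "").strip()
    ]


draft = {
    "types of data and materials produced": "Reaction-time data as CSV files",
    "data and metadata standards": "",
}
for section in missing_sections(draft):
    print("Still to write:", section)
```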

14 Increased citations
Increase your citations by 69% through sharing data (Piwowar, Day & Fridsma 2007)

15 Data papers
- Growing interest in publishing data papers, which can be cited in a similar way to normal papers via DOIs
- Get academic credit for sharing data
- Such papers describe what the data is, how it was collected, methodology, variables, suggested reuse and a link to the actual data
- DataCite is a classic example.
- New Psychology journal from Ubiquity Press

16 Better chance of further funding and collaboration
- Suggestion that funders might bar or otherwise penalize you if you don’t share: a “three strikes and you’re out” proposal
- ESRC funding of £10.8m for reusing existing data sets (call open now)
- More academic collaboration if you could see exactly what people are doing

17 Psychology repositories that work already out there

18 Backstop your research papers

19 Our turn in the spotlight (again)
Cyril Burt: “The Burt Affair”
- Heritability of intelligence (as measured in IQ tests)
- Twin studies
- Published numerous articles and books on a host of topics
- Two of Burt's supposed collaborators, Margaret Howard and J. Conway, were invented by Burt himself
- First British psychologist to be knighted
- Earlier work is often accepted as valid
- All of his notes and records had been burnt

20 Reuse your own data
- Which file format? In short, more than one, and the more basic and common the better, e.g. CSV raw text is better than E-Prime E-DataAid files
- Keep the original data along with any versions translated into new emerging formats, e.g. MS Word 2 -> Word 4 -> Word 6 going forward...
- Update to new storage media as it becomes available, e.g. floppy disk -> CD-R -> DVD-R… plus keep the originals
- Bear in mind that companies go bust and take their software and file formats with them, e.g. WordStar, WordPerfect, Lotus 123… plus companies are taken over and change direction, e.g. IBM SPSS!
- Be sure to describe your data properly using metadata (data about data) so you or someone else can understand it!
- You can fall under the proverbial bus and so can all your data, so describe it fully
- Printed copies aren’t all bad, but remember these can go in the bin if space is short
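The advice on basic formats and metadata can be combined in a few lines. This is a minimal sketch, assuming the data is already held in memory as a list of dictionaries: the `write_dataset` function, the `.metadata.json` naming and the example column names are hypothetical choices made for illustration, not a standard or the project's own tooling.

```python
import csv
import json
from pathlib import Path


def write_dataset(rows, columns, csv_path, description):
    """Write `rows` to a CSV file plus a JSON sidecar describing it.

    `columns` maps each column name to a plain-language description
    (including units), so the file can be understood without its author.
    Returns the path of the metadata sidecar.
    """
    csv_path = Path(csv_path)
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(columns))
        writer.writeheader()
        writer.writerows(rows)

    metadata = {
        "description": description,
        "data_file": csv_path.name,
        "row_count": len(rows),
        "columns": columns,
    }
    # Hypothetical naming convention: trial_data.csv -> trial_data.metadata.json
    sidecar = csv_path.with_name(csv_path.stem + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar


if __name__ == "__main__":
    write_dataset(
        rows=[{"participant": "p01", "rt_ms": 512}],
        columns={
            "participant": "Anonymised participant ID",
            "rt_ms": "Reaction time in milliseconds",
        },
        csv_path="trial_data.csv",
        description="Example reaction-time data (illustrative only)",
    )
```

Both files are plain text, so they remain readable in any editor even if the software that produced the original data is no longer available.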

21 Metadata: describe your data and methods used to create it
Pages 40-1 of Alexander Graham Bell's unpublished laboratory notebook, describing the first successful experiment with the telephone

22 Copernicus stored his data with thesis & explained how it was coded
c. 500 years old! For an English translation:

23 Organizational memory
- Institutions now realizing this needs to be done (research & data)
- Exploitable
- Funding implications, e.g. EPSRC
- Prestige
- Impact / REF (back to 1993)
- Institutional policies and repositories, e.g. eprints

24 Too much? I’m swimming against the tide!

25 Help!
I’m here as a pair of boots on the ground to give discipline-specific help:
- I can help you write Data Management Plans for grants to increase your chances of getting funded
- Put plans in place to help existing projects
- Help you manage/describe/share your data more effectively
- Aid DClinPsys with site files and data
- Currently working on a localised one-stop-shop website
Come talk to me:

26 Practical help!
Come talk to me: r.r.plant@sheffield.ac.uk

