Tools for reproducible and accessible science: KnitR, VMs and OMERO. Rob Davidson, Cardiac Physiome Workshop, Auckland, April 8th 2015. DOI for this talk: 10.6084/m9.figshare.1368774


1 Tools for reproducible and accessible science: KnitR, VMs and OMERO. Rob Davidson, Cardiac Physiome Workshop, Auckland, April 8th 2015. DOI for this talk: 10.6084/m9.figshare.1368774

2 Today’s message
Tools that fit with GigaDB
– General purpose Research Object store
Enhancing
– Accessibility
– Reproducibility
Of some of your research objects
– Software
– Images

3 Problems with scientific software - reproducibility

4 Measuring software reproducibility
Systematic study: 515 papers (429 conference, 86 journal)
<30% reproducible
http://reproducibility.cs.arizona.edu

5 Measuring software reproducibility http://reproducibility.cs.arizona.edu

6 Reasons for failure
“The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.”
http://reproducibility.cs.arizona.edu

7 Cost of failure
Waste time
Waste money
– Ioannidis 2014
– 85% resources wasted
Frustrating
Distrust
DOI: 10.1371/journal.pmed.1001747

8 Literate programming - KnitR

9 Literate programming
“Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do.”
– Donald E. Knuth, Literate Programming, 1984

10 Literate programming options
See listing: http://www.gigasciencejournal.com/content/3/1/19
– R: KnitR, Sweave, R Markdown
– JavaScript: Tangle, Active Markdown (CoffeeScript)
– Python: IPython Notebooks
– iReport links this functionality for Galaxy

11 KnitR is versatile
R, Python, Ruby, Haskell, Perl, SAS, CoffeeScript, .txt, LaTeX, HTML, D3.js, R Markdown, HTML5 slides, Command line, Any text?, WordPress

12 KnitR – how does it work?
Code chunks
– Basic text (or LaTeX or Markdown), interrupted by ‘chunks’ of code
For LaTeX, similar to Sweave
…some text \Sexpr{rfunc(var)} more text…
…some text
<<>>=
Some code
@
Process this combined text/code with knit() in R
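The chunk idea above can be sketched in a few lines of Python. This is a toy illustration only, not how knitr is implemented: the function name toy_knit, the chunk label and the sample document are all made up for the example, and it assumes chunks hold Python code delimited by a `<<...>>=` line and a closing `@` line.

```python
import contextlib
import io
import re

# A chunk is "<<label>>=" on its own line, some code, then "@" on its own line.
CHUNK = re.compile(r"^<<(.*?)>>=\n(.*?)^@[ \t]*\n?", re.MULTILINE | re.DOTALL)


def toy_knit(source: str) -> str:
    """Run each code chunk and splice its printed output into the text."""
    env = {}  # shared namespace, so later chunks can reuse earlier variables

    def run_chunk(match):
        code = match.group(2)
        buffer = io.StringIO()
        # Capture whatever the chunk prints instead of sending it to the terminal.
        with contextlib.redirect_stdout(buffer):
            exec(code, env)
        return buffer.getvalue()

    return CHUNK.sub(run_chunk, source)


# Hypothetical document mixing prose and one code chunk.
document = """Some text before the chunk.
<<summary>>=
values = [2, 4, 6]
print("mean =", sum(values) / len(values))
@
More text after the chunk.
"""

knitted = toy_knit(document)
print(knitted)
```

The prose passes through untouched and each chunk is replaced by what it printed, which is the essence of processing combined text and code in one pass.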

13 KnitR uses: easy to explain http://reproducibility.cs.arizona.edu

14 KnitR uses: reproducible analysis
Can string different tools/languages together
Stores parameters
Just like a pipeline/workflow system
– E.g. Galaxy, Taverna, KNIME
But also: codifies your figures…

15 KnitR uses: codified figures
Classic problems:
No description of error bars
No description of distributions
Admittedly this could be fixed by ‘proper’ peer review
Source code: http://bit.ly/1NQZlHh

16 KnitR uses: codified figures
Code can be found quickly
Using text as markers
Plot can be altered
– 1 line of code
New visualisation produced instantaneously
Better evaluation of results
Source code: http://bit.ly/1NQZlHh
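As a rough illustration of a codified figure, the sketch below computes the numbers behind an error-bar plot and carries the error-bar definition along with them. It is not taken from the linked source code; the function name summarise and the measurements are hypothetical, and only the Python standard library is assumed.

```python
import math
import statistics


def summarise(values, error="sem"):
    """Return (mean, error-bar half-width, label) so the plot states what it shows."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if error == "sd":
        return mean, sd, "mean ± SD"
    if error == "sem":
        return mean, sd / math.sqrt(len(values)), "mean ± SEM"
    raise ValueError(f"unknown error bar type: {error!r}")


# Hypothetical measurements, for illustration only.
measurements = [4.0, 6.0, 8.0, 10.0]

mean, half_width, label = summarise(measurements, error="sem")
print(f"{label}: {mean:.2f} ± {half_width:.2f} (n = {len(measurements)})")
```

Switching the error bars from standard error to standard deviation is a one-argument change (error="sd"), and the label changes with it, so the figure can never silently omit what its error bars mean.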

17 GigaScience KnitR example
“This article is an example of a literate programming document. It has been created in R using the knitr package. Figures and tables in this paper are generated dynamically as the document is compiled. Several R packages are required to run the analysis. Materials are archived in the Gigascience database”
DOI:10.1186/2047-217X-3-3

18 Environment wrappers - VMs

19 Measuring software reproducibility http://reproducibility.cs.arizona.edu

20 Your environment
How hard would it be to start from scratch?
What if you move from Ubuntu to CentOS?
Or just upgrade?
Dependencies / Versions
System settings
Hard for you, horrendous for others!

21 Share your environment
Virtual machine
– Copy your exact environment
– If it works for you, it works for anyone
– Reproducibility, frozen in time
DOI:10.1186/2047-217X-3-23

22 Share your environment
Docker
– ‘light’ VM
– Discrete unit of code+environment
– Can be called from command line
– Can be linked together
New possibilities e.g. nucleotid.es
– Benchmarking -> “data-driven peer-review”?
http://nucleotid.es/

23 Share your environment
Some concerns:
– http://ivory.idyll.org/blog/vms-considered-harmful.html
– VM = black box?
– Docker == black box!
Solution -> codify the environment

24 Codify your environment
Provisioning scripts are ‘research objects’
Improves adaptability (easier to recode for alternative OS etc)
Builds in extra documentation
Easier to share
– although GigaDB still wants a compiled snapshot (i.e. full machine)
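A very small step towards codifying an environment is to record it as data. The sketch below is an assumption-laden illustration, not a provisioning script and not a substitute for a real provisioning system: it only captures the operating system, the Python version and the installed Python packages, using the standard library. The function name snapshot_environment is made up for the example.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Collect OS, interpreter and installed-package versions as plain data."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "executable": sys.executable,
        "packages": dict(sorted(packages.items())),
    }


snapshot = snapshot_environment()
print(json.dumps(snapshot, indent=2))
```

Saving this output next to the analysis gives others a readable list of versions to rebuild from, which is more transparent than a machine image alone, though it covers far less than a full snapshot.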

25 Short list of provisioning systems
Vagrant
Chef
Salt
Puppet
Ansible
Many more – see link for info
Source: http://bit.ly/1wrYiuI

26 Images: release ALL the images with OMERO “And now for something completely different”

27 NO Phenotyping with microCT doi:10.1186/2047-217X-2-14

28 NO Phenotyping with microCT doi:10.1186/2047-217X-3-6

29 Hosting Images
Image LIMS
MetaData!!!
Can handle most formats
Web embedding
View online, no need for software
Open Source
www.openmicroscopy.org/site/products/omero

30 www.openmicroscopy.org/site/products/omero

31 OMERO: providing access to imaging data View, filter, measure raw images with direct links from journal article. See all image data, not just cherry picked examples. Download and reprocess.

32 OMERO: Adding value http://jcb-dataviewer.rupress.org/

33 The alternative... look but don't touch

34 Thanks for listening!
Acknowledgements
GigaTeam
– Scott Edmunds
– Peter Li
– Chris Hunter
– Jesse Xiao
– Nicole Edmunds
– Laurie Goodman
Where to get these slides
FigShare DOI:
– 10.6084/m9.figshare.1368774
http://bit.ly/1JmnRiU

