Supporting reproducible science in CSIRO RESEARCH DATA SUPPORT/ IM&T Sue Cook & Dom Hogan | CSIRO Information Management & Technology 29 April 2015.


1 Supporting reproducible science in CSIRO RESEARCH DATA SUPPORT/ IM&T Sue Cook & Dom Hogan | CSIRO Information Management & Technology 29 April 2015

2 Started 1665: 350 years. Primary output is the journal article. Citations link outputs. Rewards based on numbers of outputs and citations to outputs since 1955. Started 1993*: 25 years (*give or take). Supporting reproducible science | Sue Cook

3 Change: The goal of reproducibility means that all science outputs and contributions – articles, data and software – need publication and citations to link them. Data. Software. Provenance.

4 “Citable”? One person’s opinion (Wilke, 2015): 1. Uniquely and unambiguously citable 2. Available in perpetuity, in unchanged form 3. Accessible to the public 4. Self-contained and complete 5. Attributable authorship. “websites hosting scientific software will usually fail at least conditions 2 and 3, and thus would not be citable by my criteria.” Journal editors and peer reviewers are the gatekeepers. Supporting reproducible science | Dominic Hogan

5 Early example: MDBSY (Murray Darling Basin Sustainable Yields). Source data licensing. Quality control. Provenance. Informs policy decisions that have large impact – decisions that wind up being defended in court. Data transparency is essential, but data quality is also essential.

6 Self-Serve Repository – Metadata and data

7 Self-Serve Repository – IP guides

8 Legal issues. Data licences: – Creative Commons promotes reuse, but is your data derived from something with restricted permissions? – CSIRO Data Licence: non-commercial, does not allow redistribution. Restricts reuse, but lower risk.

9 Software. More licences available. Binaries vs code. IP issues: – Derived code? – Open source development? – Patents?

10 (Image-only slide; no text captured in the transcript.)

11 Link to code repository for updates and development. Link to the related publication. Link to the data. Licence and supplement. Attribution. http://dx.doi.org/10.4225/08/536302C43FC28

12 Software citation. Data citation.
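The GrainScan entry in the References slide shows one pattern for a software or data citation: creators, year, title, version, publisher, resource type, then the DOI link. A minimal Python sketch of assembling a string in that pattern follows. The function `format_citation` and its parameter names are hypothetical, chosen only for illustration; the layout is copied from that one reference entry and is not an official CSIRO or DataCite specification.

```python
def format_citation(creators, year, title, version, publisher, resource_type, doi):
    """Build a citation string in the pattern of the GrainScan reference entry.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    authors = ", ".join(creators)
    return (
        f"{authors} ({year}): {title}. {version}. {publisher}. "
        f"{resource_type}. http://dx.doi.org/{doi}"
    )


# Values taken from the GrainScan entry in the References slide.
citation = format_citation(
    creators=["Whan, Alex", "Matt Bolger", "Leanne Bischof"],
    year=2014,
    title="GrainScan - Software for analysis of grain images",
    version="v2",
    publisher="CSIRO",
    resource_type="Data Collection",
    doi="10.4225/08/536302C43FC28",
)
print(citation)
```

Keeping the version and the DOI in the string is what lets a reader reach the exact release that was used, which is the point of citing software and data alongside the article.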

13 Storage and permissions. A controlled space allows for persistence, version control and security. This is good for getting DOIs, but… What about linking to data hosted elsewhere? Hosted services? Data Access Portal has grown over 100 TB in the last year – the growth rate will increase.

14 If 1 GB = 1 box trailer… 33.3 minutes at ADSL 2. 1 TB = 33 B-Doubles: 23.1 days at ADSL 2. 1 PB = 3 supertankers: 63 years at ADSL 2.
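The ADSL figures on this slide are consistent with an effective rate of about 4 Mbit/s in decimal units (1 GB in 33.3 minutes). That rate is inferred from the slide's numbers and is not stated on it. A small Python sketch under that assumption, using a hypothetical helper `transfer_seconds`:

```python
# Assumed effective link rate, inferred from "1 GB = 33.3 minutes" on the slide.
# This is an assumption for illustration, not a measured ADSL 2 figure.
ASSUMED_RATE_BITS_PER_S = 4_000_000

# Decimal storage units, in bytes.
GB = 10**9
TB = 10**12
PB = 10**15

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def transfer_seconds(size_bytes, rate_bits_per_s=ASSUMED_RATE_BITS_PER_S):
    """Return the seconds needed to move size_bytes at a constant bit rate."""
    return size_bytes * 8 / rate_bits_per_s


minutes_per_gb = transfer_seconds(GB) / 60
days_per_tb = transfer_seconds(TB) / SECONDS_PER_DAY
years_per_pb = transfer_seconds(PB) / SECONDS_PER_YEAR

print(f"1 GB: {minutes_per_gb:.1f} minutes")
print(f"1 TB: {days_per_tb:.1f} days")
print(f"1 PB: {years_per_pb:.1f} years")
```

At the same assumed rate, a collection of about 10 TB would take roughly 230 days to transfer.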

15 Data volumes. CAWCR Wave Hindcast – ~10 TB moves slowly over ADSL.

16 Australian Square Kilometre Array Pathfinder (ASKAP) – processing a data stream of 70 Tb/s (that’s 8.75 TB/s). The data rates arriving at the Pawsey Centre are 2.5 GB/s (or about 75 PB per year) – we can’t store this much. Full operation will deal with 16 TB per day (5.7 PB per year).
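The unit conversions behind these ASKAP figures can be checked with a few lines of Python. This is only an arithmetic sketch: the variable names are illustrative, and the choice between decimal (1 PB = 1000 TB) and binary (1 PB = 1024 TB) units is an assumption, since the slide does not say which it uses.

```python
BITS_PER_BYTE = 8
SECONDS_PER_YEAR = 365 * 24 * 3600

# 70 terabits per second expressed in terabytes per second.
stream_tbytes_per_s = 70 / BITS_PER_BYTE

# 2.5 GB/s arriving at the Pawsey Centre, accumulated over a year, in decimal PB.
pawsey_pb_per_year = 2.5e9 * SECONDS_PER_YEAR / 1e15

# 16 TB per day over a year, in decimal and in binary petabytes.
full_op_tb_per_year = 16 * 365
full_op_pb_decimal = full_op_tb_per_year / 1000
full_op_pb_binary = full_op_tb_per_year / 1024

print(f"Stream: {stream_tbytes_per_s} TB/s")
print(f"Pawsey: {pawsey_pb_per_year:.1f} PB/year")
print(f"Full operation: {full_op_pb_decimal:.2f} PB/year (decimal), "
      f"{full_op_pb_binary:.2f} PB/year (binary)")
```

The yearly total for 2.5 GB/s comes out a little under 79 PB, so the slide's 75 PB reads as a rounded figure. The 5.7 PB per year matches the binary convention, while decimal units give about 5.8 PB.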

17 ASKAP Data Management

18 “Progressive” DOIs

19 Provenance

20 Provenance Management System (PROMS). Don’t try this at home! Instead, go to http://ands.org.au/partner/provenance_interest_group.html

21 Some elements to connect: Systems; Infrastructure; Processes (e.g. Quality Control, Approval); Legal; Licensing; Intellectual Property; Culture; Training; Fulfilling needs; … … …; Policy

22 Thanks. Research Data Support team: Dom Hogan, David Benn, Anne Stevenson, John Morrissey, Cynthia Love. CSIRO Information Management & Technology. CSIRO Applications team. CSIRO Scientific Computing team. Australia Telescope National Facility. Ian Corner for the supertanker analogy. Nick Car for the provenance slides. Australian National Data Service (ANDS).

23 Questions?

24 References
Paul L Dineen. Blue. Photo, April 16, 2010. https://www.flickr.com/photos/pauldineen/4529213297/
"Philosophical Transactions Volume 1 frontispiece" by Henry Oldenburg - Philosophical Transactions. Licensed under CC BY 4.0 via Wikimedia Commons. http://commons.wikimedia.org/wiki/File:Philosophical_Transactions_Volume_1_frontispiece.jpg#mediaviewer/File:Philosophical_Transactions_Volume_1_frontispiece.jpg
Wilke, Claus. “What Constitutes a Citable Scientific Work?” The Serial Mentor, January 2, 2015. http://serialmentor.com/blog/2015/1/2/what-constitutes-a-citable-scientific-work
CSIRO. Water availability in the Murray-Darling basin: summary of a report to the Australian Government. 2008-10. https://publications.csiro.au/rpr/pub?pid=legacy:683
Whan, Alex, Matt Bolger, Leanne Bischof (2014): GrainScan - Software for analysis of grain images. v2. CSIRO. Data Collection. http://dx.doi.org/10.4225/08/536302C43FC28
Durrant, Tom, Diana Greenslade, Mark Hemer, Claire Trenham (2014). A Global Wave Hindcast focussed on the Central and South Pacific. CAWCR Technical Report No. 070. http://www.cawcr.gov.au/publications/technicalreports/CTR_070.pdf
Car, Nicholas (2014). Inter-agency standardised provenance reporting in Australia. eResearch Australasia, 27-31 October 2014. Melbourne, Australia. 10p. https://publications.csiro.au/rpr/pub?pid=csiro:EP145084

25 IM&T/Research Data Support. Sue Cook, Data Librarian. t +61 8 64368532. eresearchdatasupport@csiro.au. CSIRO IM&T. Thank you.

