Supporting reproducible science in CSIRO RESEARCH DATA SUPPORT/ IM&T Sue Cook & Dom Hogan | CSIRO Information Management & Technology 29 April 2015.

Presentation transcript:


Started years
Primary output is the journal article
Citations link outputs
Rewards based on numbers of outputs and citations to outputs since 1955
Started 1993 * 25 years
* give or take
Supporting reproducible science | Sue Cook

Change
The goal of reproducibility means that all science outputs and contributions – articles, data and software – need publication and citations to link them.
Data
Software
Provenance

“Citable”?
One person’s opinion (Wilke, 2015):
1. Uniquely and unambiguously citable
2. Available in perpetuity, in unchanged form
3. Accessible to the public
4. Self-contained and complete
5. Attributable authorship
“websites hosting scientific software will usually fail at least conditions 2 and 3, and thus would not be citable by my criteria.”
Journal editors and peer reviewers are the gatekeepers.

Early example: MDBSY
Murray Darling Basin Sustainable Yields
Source data licensing
Quality control
Provenance
Informs policy decisions that have large impact – decisions that wind up being defended in court.
Data transparency is essential, but data quality is also essential.

Self-Serve Repository – Metadata and data

Self-Serve Repository – IP guides

Legal issues
Data licences
– Creative Commons promotes reuse, but is your data derived from something with restricted permissions?
– CSIRO Data Licence: non-commercial, does not allow redistribution. Restricts reuse, but lower risk.

Software
More licences available
Binaries vs Code
IP issues:
– Derived code?
– Open source development?
– Patents?


Link to code repository for updates and development
Link to the related publication
Link to the data
Licence and supplement
Attribution
C43FC28

Software citation
Data citation
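The software entry in the reference list (GrainScan) follows a "creators (year): title. version. publisher. resource type." pattern. A minimal Python sketch of a formatter for that pattern is below. The `CitableRecord` class and `format_citation` function are hypothetical names used for illustration, not part of any CSIRO system, and the identifier is left unset because the source does not give one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CitableRecord:
    """Hypothetical container for the fields seen in the GrainScan reference."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    resource_type: str
    version: Optional[str] = None
    identifier: Optional[str] = None  # e.g. a DOI; not given in the source


def format_citation(record: CitableRecord) -> str:
    """Build 'Creators (Year): Title. Version. Publisher. ResourceType. Identifier'."""
    parts = [f"{', '.join(record.creators)} ({record.year}): {record.title}."]
    if record.version:
        parts.append(f"{record.version}.")
    parts.append(f"{record.publisher}.")
    parts.append(f"{record.resource_type}.")
    if record.identifier:
        parts.append(record.identifier)
    return " ".join(parts)


# Field values copied from the reference list; nothing is added.
grainscan = CitableRecord(
    creators=["Whan, Alex", "Matt Bolger", "Leanne Bischof"],
    year=2014,
    title="GrainScan - Software for analysis of grain images",
    publisher="CSIRO",
    resource_type="Data Collection",
    version="v2",
)

if __name__ == "__main__":
    print(format_citation(grainscan))
```

Keeping the version and identifier optional lets the same formatter serve both software and data records, which is one way to treat the two kinds of output alike.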

Storage and permissions
A controlled space allows for persistence, version control and security. This is good for getting DOIs, but…
What about linking to data hosted elsewhere? Hosted services?
Data Access Portal has grown over 100 TB in the last year – the growth rate will increase.

If 1 GB = 1 box trailer… 33.3 minutes at ADSL 2
1 TB = 33 B-Doubles: 23.1 days at ADSL 2
1 PB = 3 supertankers: 63 years at ADSL 2
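The three transfer times on this slide are consistent with a sustained rate of about 0.5 MB/s (1 GB in 33.3 minutes). The Python sketch below reproduces them under that assumption. The rate and the use of decimal units are inferred from the slide's numbers rather than stated on it, and the function name is illustrative.

```python
# Assumption: 0.5 MB/s sustained, inferred from "1 GB in 33.3 minutes".
ASSUMED_RATE_MB_PER_S = 0.5

# Assumption: decimal units (1 GB = 1000 MB, and so on).
MB_PER_UNIT = {"GB": 1_000, "TB": 1_000_000, "PB": 1_000_000_000}

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = SECONDS_PER_DAY * 365


def transfer_seconds(size: float, unit: str,
                     rate_mb_per_s: float = ASSUMED_RATE_MB_PER_S) -> float:
    """Return the seconds needed to move `size` of `unit` at the given rate."""
    return size * MB_PER_UNIT[unit] / rate_mb_per_s


if __name__ == "__main__":
    print(f"1 GB: {transfer_seconds(1, 'GB') / 60:.1f} minutes")
    print(f"1 TB: {transfer_seconds(1, 'TB') / SECONDS_PER_DAY:.1f} days")
    print(f"1 PB: {transfer_seconds(1, 'PB') / SECONDS_PER_YEAR:.0f} years")
```

Because each unit is 1000 times the previous one, the transfer time scales by the same factor, which is why minutes become days and days become decades.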

Data volumes
CAWCR Wave Hindcast – ~10 TB moves slowly over ADSL

Australian Square Kilometre Array Pathfinder
ASKAP – processing a data stream of 70 Tb/s (that’s 8.75 TB/s)
The data rates arriving at the Pawsey Centre are 2.5 GB/s (or 75 PB per year) – we can’t store this much
Full operation will deal with 16 TB per day (5.7 PB per year)
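The unit conversions behind these figures can be checked with a short Python sketch. The helper names are illustrative, and a 365-day year is assumed. The slide's yearly totals look rounded, so exact arithmetic gives slightly different values: 2.5 GB/s works out to about 78.8 PB per year in decimal units, and 16 TB per day gives about 5.8 PB per year in decimal units, or about 5.7 when dividing by 1024.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365  # assumption: non-leap year


def terabits_to_terabytes(terabits: float) -> float:
    """Convert terabits to terabytes (8 bits per byte)."""
    return terabits / 8


def gb_per_s_to_pb_per_year(gb_per_s: float) -> float:
    """Convert a sustained GB/s rate to PB per year, decimal units."""
    return gb_per_s * SECONDS_PER_DAY * DAYS_PER_YEAR / 1_000_000


def tb_per_day_to_pb_per_year(tb_per_day: float, tb_per_pb: int = 1000) -> float:
    """Convert TB per day to PB per year; pass tb_per_pb=1024 for binary units."""
    return tb_per_day * DAYS_PER_YEAR / tb_per_pb


if __name__ == "__main__":
    print(f"70 Tb/s = {terabits_to_terabytes(70)} TB/s")
    print(f"2.5 GB/s = {gb_per_s_to_pb_per_year(2.5):.1f} PB/year")
    print(f"16 TB/day = {tb_per_day_to_pb_per_year(16):.2f} PB/year (decimal)")
    print(f"16 TB/day = {tb_per_day_to_pb_per_year(16, 1024):.2f} PB/year (binary)")
```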

ASKAP Data Management

“Progressive” DOIs

Provenance

Provenance Management System (PROMS)
Don’t try this at home! Instead, go to

Some elements to connect
Systems
Infrastructure
Processes (e.g. Quality Control, Approval)
Legal
Licensing
Intellectual Property
Culture
Training
Fulfilling needs
…
Policy

Thanks
Research Data Support team: Dom Hogan, David Benn, Anne Stevenson, John Morrissey, Cynthia Love
CSIRO Information Management & Technology
CSIRO Applications team
CSIRO Scientific Computing team
Australia Telescope National Facility
Ian Corner for the supertanker analogy
Nick Car for the provenance slides
Australian National Data Service (ANDS)

Questions?

References
Paul L Dineen. Blue. Photo, April 16,
"Philosophical Transactions Volume 1 frontispiece" by Henry Oldenburg - Philosophical Transactions. Licensed under CC BY 4.0 via Wikimedia Commons - pg#mediaviewer/File:Philosophical_Transactions_Volume_1_frontispiece.jpg
Wilke, Claus. “What Constitutes a Citable Scientific Work?” The Serial Mentor, January 2,
CSIRO. Water availability in the Murray-Darling basin: summary of a report to the Australian Government
Whan, Alex, Matt Bolger, Leanne Bischof (2014): GrainScan - Software for analysis of grain images. v2. CSIRO. Data Collection.
Durrant, Tom, Diana Greenslade, Mark Hemer, Claire Trenham (2014). A Global Wave Hindcast focussed on the Central and South Pacific. CAWCR Technical Report No
Car, Nicholas (2014). Inter-agency standardised provenance reporting in Australia. eResearch Australasia, October Melbourne, Australia. 10p.

IM&T/Research Data Support
Sue Cook
Data Librarian
CSIRO IM&T
Thank you