Data Repositories and Science Gateways for Open Science Presenter: Roberto Barbera – UNICT and INFN EGI Community Forum Bari – 11 November 2015.

1 Data Repositories and Science Gateways for Open Science Presenter: Roberto Barbera – UNICT and INFN EGI Community Forum Bari – 11 November 2015

2 Outline
• Introductory concepts, definitions and driving considerations
• A viable approach to Open Science
• Summary and conclusions

3 The Scientific Method (G. Galilei). Examples of IR: Classical Mechanics, Newton’s Gravitation Theory. Examples of DR: General Relativity, Standard Model of Particle Physics.

4 The Pillars of the Scientific Method
Repeatability: the closeness of agreement between independent results obtained with the same method on identical test material, under the same conditions (same operator, same apparatus, same laboratory and after short intervals of time). Affected by random errors.
Reproducibility: the closeness of agreement between independent results obtained with the same method on identical test material but under different conditions (different operators, different apparatus, different laboratories and/or after different intervals of time). Affected by systematic errors.
Is science really reproducible?
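
The two definitions above can be made concrete with a small numerical sketch. The slide does not say how the "closeness of agreement" is estimated, so the estimator below is an assumption: the usual balanced one-way decomposition, in which the repeatability variance is the average within-condition variance and the reproducibility variance adds the between-condition component. The function name and the sample numbers are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def precision_estimates(groups):
    """Estimate repeatability and reproducibility standard deviations.

    `groups` holds one list of measurements per condition (for example,
    per laboratory). All lists must have the same length n >= 2, and at
    least two conditions are needed.
    """
    n = len(groups[0])
    if len(groups) < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 groups of equal size >= 2")

    # Repeatability variance: average spread under identical conditions.
    within_var = mean([variance(g) for g in groups])

    # The variance of the group means contains within_var / n on top of
    # the between-condition variance, so subtract it (clamped at zero).
    means_var = variance([mean(g) for g in groups])
    between_var = max(0.0, means_var - within_var / n)

    s_repeat = sqrt(within_var)
    s_reprod = sqrt(within_var + between_var)
    return s_repeat, s_reprod


if __name__ == "__main__":
    # Two hypothetical laboratories, two measurements each.
    s_r, s_R = precision_estimates([[10.0, 10.2], [11.0, 11.2]])
    print(f"repeatability sd = {s_r:.3f}, reproducibility sd = {s_R:.3f}")
```

By construction the reproducibility estimate can never be smaller than the repeatability estimate, which matches the slide's point that changing operators, apparatus or laboratories adds systematic differences on top of random scatter.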

5 Challenges in irreproducible research (http://www.nature.com/nature/focus/reproducibility/index.html)

6 The “reproducibility crisis”: out of 18 microarray papers, results from 10 could not be reproduced.
1. Ioannidis et al., 2009. Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14
2. Science publishing: The trouble with retractions. http://www.nature.com/news/2011/111005/full/478026a.html
3. Bjorn Brembs: Open Access and the looming crisis in science. https://theconversation.com/open-access-and-the-looming-crisis-in-science-14950

7 Repeatability and Reproducibility are not all

8 How e-Infrastructures support the (e-)Scientific Method. [Diagram labels: Data Infrastructures; Open Access Doc. Repos.; Data Repos.; Semantic-web enrichment of linked data; Data preservation; HTC/HPC Clusters, Grids, Clouds.] Challenge: «walk» across the knowledge path both ways.

9 Open Science

10 An INFN approach to Open Science: the “grand” view. Digital Repository of Research Products (pilot: www.openaccessrepository.it). [Diagram labels: arXiv; CNR S&T DL; CINECA VQR; INFN Multimedia; INFN Gray Lit.; SCOAP3; ORCID; single, mandatory deposit; science products; reproducibility.]

11 The INFN Open Access Repository (www.openaccessrepository.it) papers data Automatic ingestion in place from: federated authentication

12 Alternative reputation systems: possibility to add researcher IDs

13 Examples of document and data resources. Data stored on:

14 Example of software resources: the ALICE Virtual Research Environment

15 Example of research “package”

16 The OAR Knowledge Workflow

17 The OAR Knowledge Workflow: ALEPH data search & discovery

18 The OAR Knowledge Workflow: ALEPH “packages” inspection
1. From the OAR it is possible to select an “analysis” as simply as any other resource in the archive.
2. Clicking on RUN PAGE, the researcher can either reproduce or extend that particular analysis using a Catania Science Gateway.

19 The OAR Knowledge Workflow: ALEPH data analysis (1/2). The Science Gateway collects from the OAR, and lets the user browse, the metadata associated with the dataset(s) needed to run that particular analysis.

20 The OAR Knowledge Workflow: ALEPH data analysis (2/2). Data are retrieved from [...]. Using the JSAGA adaptor for all OCCI-compliant cloud middleware, the Science Gateway starts a dedicated VM already configured with all the experiment software. Both the CHAIN-REDS Cloud Testbed and the EGI Federated Cloud can be used as e-Infrastructures. Jobs run both on [...] and [...].
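
The workflow on slides 18 to 20 (select an analysis "package", collect the metadata of its datasets, start a preconfigured VM on one or more clouds) can be pictured with a minimal data model. Everything in this sketch is hypothetical: the class names, field names, identifiers and URLs are placeholders, and it does not use the real OAR, Science Gateway, JSAGA or OCCI interfaces. It only shows what information a package would need to carry for a gateway to reproduce an analysis.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Dataset:
    """Hypothetical metadata record for one dataset in the repository."""
    identifier: str  # placeholder persistent identifier
    title: str
    location: str    # placeholder URL from which the data can be fetched


@dataclass(frozen=True)
class AnalysisPackage:
    """Hypothetical research 'package': an analysis plus what it needs."""
    identifier: str
    title: str
    vm_image: str            # placeholder name of the preconfigured VM image
    datasets: List[Dataset]


def plan_run(package: AnalysisPackage, infrastructures: List[str]) -> Dict[str, object]:
    """Describe what a gateway would have to launch; nothing is started here."""
    if not infrastructures:
        raise ValueError("at least one target infrastructure is required")
    return {
        "analysis": package.identifier,
        "vm_image": package.vm_image,
        "inputs": [d.location for d in package.datasets],
        "targets": list(infrastructures),
    }


if __name__ == "__main__":
    pkg = AnalysisPackage(
        identifier="example-analysis-001",
        title="Example analysis",
        vm_image="example-experiment-vm",
        datasets=[Dataset("example-dataset-001", "Example dataset",
                          "https://example.org/data/001")],
    )
    print(plan_run(pkg, ["cloud-testbed-a", "federated-cloud-b"]))
```

Keeping the VM image name and the dataset locations inside the package record is the design choice that makes "reproduce" and "extend" the same operation: the gateway needs nothing beyond the record itself.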

21 Remember: repeatability and reproducibility are not all. Reusability and «extensibility» matter!

22 Reusability of ALICE data with the CHAIN-REDS Science Gateway (1/3)
1. From within the CHAIN-REDS Science Gateway, entitled researchers can start VMs already configured to re-use/extend ALICE data analyses.
2. The VMs are deployed both on the CHAIN-REDS Cloud Testbed and on the EGI Federated Cloud using the features of the EGI AppDB.

23 Reusability of ALICE data with the CHAIN-REDS Science Gateway (2/3)
1. The VM is available for a customizable amount of time, during which the user has full access to the dataset(s), analysis algorithm(s) and source code(s) of the experiment.
2. The user can access the VM using different protocols (e.g., SSH, VNC); clicking on the SSH or VNC icons, the user can directly access the VM instantiated on the cloud from within the Science Gateway.

24 Reusability of ALICE data with the CHAIN-REDS Science Gateway (3/3). New stable analyses (and their results), generated by running the VM, may be registered in the OAR (with DOIs) to further extend the analysis catalogue shared within the Virtual Research Community.

25 “Whose science is this?” How to attribute authorship to research products?

26 ORCID (www.orcid.org, becoming a “de facto” standard). More than 1.74 million ORCID IDs so far.
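
For readers who want to handle ORCID iDs programmatically (for instance when attaching them to repository records), here is a minimal validation sketch. It relies on the publicly documented structure of an ORCID iD: 16 characters in four hyphen-separated groups, the last being a check character computed with the ISO 7064 MOD 11-2 algorithm. The function names are our own, and the sample iDs in the usage lines are the example identifiers commonly used in ORCID's documentation, not identifiers taken from the slides.

```python
import re

# Four groups of four; the final character may be a digit or 'X'.
_ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if `orcid` is well formed and its check character matches."""
    if not _ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    for candidate in ("0000-0002-1825-0097", "0000-0002-1694-233X",
                      "0000-0002-1825-0098"):
        print(candidate, is_valid_orcid(candidate))
```

A check like this only catches typing errors; it does not confirm that the iD is registered or that it belongs to a given researcher.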

27 ORCID: search & link your works in/from DataCite

28 ORCID: add your research products to your profile

29 Summary and conclusions
• The Open Science vision can be implemented only if the “openness” paradigm becomes pervasive in research.
• The reproducibility of science outputs, but also their re-usability and extensibility, are key to walking the “knowledge path” in both directions.
• The INFN Open Access Repository is a pilot knowledge preservation repository meant to serve both researchers and citizen scientists.
• What makes the INFN OAR different from other repositories is:
  • Its capability to connect to Science Gateways and exploit cloud resources worldwide to easily reproduce/extend scientific analyses.
  • Its capability to provide full authorship (and hence credit, reputation and visibility) for all products of a scientist. This is key for a correct evaluation of research (…and of researchers).

30 Authors
• R. Barbera (University of Catania and INFN, Italy)
• S. Bianco (INFN LNF, Italy)
• T. Boccali (INFN Pisa, Italy)
• C. Carrubba (University of Catania, Italy)
• G. Inserra (University of Catania, Italy)
• M. Maggi (INFN Bari, Italy)
• D. Menasce (INFN Milano Bicocca, Italy)
• R. Ricceri (University of Catania, Italy)

31 Thank you!

