E-Science for the SKA WF4Ever: Supporting Reuse and Reproducibility in Experimental Science Lourdes Verdes-Montenegro* AMIGA and Wf4Ever teams Instituto.

Presentation transcript:

e-Science for the SKA
WF4Ever: Supporting Reuse and Reproducibility in Experimental Science
Lourdes Verdes-Montenegro*, AMIGA and Wf4Ever teams
Instituto de Astrofísica de Andalucía – CSIC
* PI of VIA-SKA, Feasibility study of the Spanish technological participation in SKA
RadioNet Advanced Radio Astronomy, Commissioning Skills and Preparation for the SKA
Manchester, November 15th 2012

2 Summary
And commissioning looks soooo similar to a scientific experiment…
» The challenge: Reuse and Reproduce scientific experiments
› Reusability, fundamental for incremental scientific development
› Reproducibility, key for reliable science
» How Wf4Ever is addressing these challenges
» Impact in Astrophysics SKA (if any)

3 Summary
An experiment is reproducible when someone else, working independently, can get the same results following the same methods and using the same inputs.
mmm… commissioning

4 The challenge
Reproducibility and Reusability of the experiment
» Compile all needed elements
» Share it with other … Commissioners
» Describe it: methods, data, etc.
» Make the experiment discoverable
» Scientific Workflows: part of the solution
› Automation, encourage best practices, method for sharing
» But more is needed:
› Share/provide the data, annotations, references, etc.
› Strategies for avoiding decay
› Tools for discovering the experiment
Workflow preservation

5 How Wf4Ever is addressing these challenges
Wf4Ever project
Wf4Ever - Preservation of scientific workflows in data-intensive science

6 The aim of Wf4Ever
Encapsulate the scientific methodology (the workflows and all the associated information/components of the digital experiment) in an artefact called a Research Object (beyond the PDF).
What to do with the Research Object?
» Archival, classification and indexing in scalable semantic repositories
» Advanced access and recommendation capabilities based on monitoring and metrics to evaluate similarities, decay, quality, stability, completeness
» Creation of scientific communities to collaboratively share, reuse and evolve Research Objects
Use Cases:
» Astronomy (IAA)
» Genome-wide Analysis and Biobanking (LUMC)
Commissioning is multidisciplinary

7 Compile all the elements needed by the experiment
» Model for Workflow-Centric Research Objects [1]
» Semantic ontologies to implement the model
› Object Reuse and Exchange (ORE) for specifying aggregation of resources
› Annotation Ontology (AO) for annotating the resources
Metadata of the processes
[1] Workflow-Centric Research Objects: A First Class Citizen in the Scholarly Discourse, Khalid Belhajjame et al.
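The aggregation-plus-annotation idea on this slide can be illustrated with a minimal Python sketch. It is not the Wf4Ever model or any real ORE/AO library: the class names, method names and resource paths below are hypothetical, and the object lives only in memory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """A statement about one resource, or about the Research Object itself."""
    target: str  # identifier of the annotated resource
    body: str    # free-text description or metadata


@dataclass
class ResearchObject:
    """Hypothetical in-memory stand-in for a workflow-centric Research Object."""
    uri: str
    aggregates: List[str] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)

    def aggregate(self, resource: str) -> None:
        # Aggregation: the RO lists every resource the experiment needs.
        if resource not in self.aggregates:
            self.aggregates.append(resource)

    def annotate(self, target: str, body: str) -> None:
        # Only the RO itself or one of its aggregated resources can be annotated.
        if target != self.uri and target not in self.aggregates:
            raise ValueError(f"{target} is not part of this Research Object")
        self.annotations.append(Annotation(target, body))

    def annotations_for(self, target: str) -> List[str]:
        return [a.body for a in self.annotations if a.target == target]


# Illustrative identifiers only; none of these paths come from the talk.
ro = ResearchObject("ro/galaxy-distances")
ro.aggregate("workflows/calculate_galaxy_distances")
ro.aggregate("data/input_catalogue")
ro.annotate("data/input_catalogue", "Input galaxy sample")
ro.annotate(ro.uri, "Computes distances for a galaxy sample")
print(ro.annotations_for("data/input_catalogue"))
```

Keeping the aggregation list separate from the annotations mirrors the split on the slide: one vocabulary says what belongs to the experiment, the other says what is known about each part.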

8 Share it with other scientists
» Provide digital libraries with RO preservation functionalities
Search for trustworthy, reusable material for commissioning (e.g. from other disciplines)

9 Describe it: methods, data and all the elements involved
Mechanisms for
» Defining relationships between the elements
» Annotating each element and the whole RO
Useful for complex processes such as commissioning?

10 Compile all the elements needed by the experiment
Runnable, Publishable, Shareable, Repeatable
Service up, software working, etc.
The output can be reproduced
Enough annotation
Service up, software working and all well commented
» Completeness: It contains all the resources needed to be run, published, shared or repeated
» Stability: Changes made to the RO by different kinds of users can improve it or make it worse
Metrics and tools for QA
Evolution with time
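A completeness metric of the kind mentioned on this slide could be sketched as a checklist per goal. The resource kinds and the requirements table below are assumptions made for illustration; the slides do not say which resources each goal requires or how Wf4Ever scores completeness.

```python
# Hypothetical checklist: which kinds of resource each goal needs.
REQUIRED = {
    "runnable": {"workflow", "input_data"},
    "repeatable": {"workflow", "input_data", "expected_output"},
    "shareable": {"workflow", "annotations"},
    "publishable": {"workflow", "input_data", "expected_output", "annotations"},
}


def missing_resources(present, goal):
    """Return the resource kinds still needed before the RO meets the goal."""
    return REQUIRED[goal] - set(present)


def completeness(present, goal):
    """Fraction of the required resource kinds that the RO already contains."""
    required = REQUIRED[goal]
    return len(required & set(present)) / len(required)


present = {"workflow", "input_data"}
print(completeness(present, "runnable"))
print(sorted(missing_resources(present, "publishable")))
```

Re-running the same check after each change to the RO would give the "evolution with time" view: a score that drops after an edit signals that the change made the object worse for that goal.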

11 Be independent of the environment
» Interoperability
› RO level
› Component level
» Decay: State of the services (up/down), of the applications (updated/deprecated), permissions to access the data
Decay Information: Last check was performed 2 days ago and returned one error: The service SDSS-DR7, needed by the workflow Calculate_galaxy_distances, is down. [Check now] [Try to repair]
Metrics and tools for QA
Evolution with time
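The decay report shown on this slide can be approximated with a small Python sketch. Everything here is hypothetical: the function and class names are invented, the second service name is made up, and the availability probe is passed in as a plain function so that no real network call or Wf4Ever API is implied.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class DecayReport:
    workflow: str
    errors: List[str]

    @property
    def decayed(self) -> bool:
        return bool(self.errors)


def check_decay(workflow: str,
                services: Iterable[str],
                is_up: Callable[[str], bool]) -> DecayReport:
    """Probe every service the workflow depends on and collect the failures."""
    errors = [
        f"The service {name}, needed by the workflow {workflow}, is down"
        for name in services
        if not is_up(name)
    ]
    return DecayReport(workflow, errors)


# Stubbed status table standing in for a real availability probe.
status = {"SDSS-DR7": False, "another-service": True}
report = check_decay("Calculate_galaxy_distances", status, lambda name: status[name])
print(report.decayed, report.errors)
```

Injecting the probe keeps the check testable offline; a real monitor would swap in a function that queries the service and would also cover the other decay causes listed above, such as deprecated applications and lost data permissions.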

12 Make the experiment discoverable
» Tracking: Rating by other users, who used the RO, comments, etc.
» Recommender Service: Exploits semantic descriptions, relations and other metadata to support advanced search mechanisms
» Collaboration Spheres: Visual mechanism to find elements (users, ROs, workflows) similar to others previously selected
Rating; Downloads 36; Citations [2]; Re-used [1]; Comments [4]
Large international teams commissioning an experiment spread across two continents would be the most similar thing to a Social Network
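As a rough illustration of finding elements similar to ones already selected, the sketch below ranks Research Objects by the overlap of their descriptive tags. Jaccard similarity is a stand-in chosen for this example; the slides do not say how the Recommender Service or Collaboration Spheres measure similarity, and the catalogue entries are invented.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(selected_tags, catalogue, top_n=3):
    """Return up to top_n catalogue entries, most similar to the selection first."""
    scored = [(jaccard(selected_tags, tags), name) for name, tags in catalogue.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by name
    return [name for _, name in scored[:top_n]]


# Invented catalogue of Research Objects and their tags.
catalogue = {
    "ro-a": {"galaxy", "distance", "sdss"},
    "ro-b": {"genome"},
    "ro-c": {"galaxy", "votable"},
}
print(recommend({"galaxy", "distance"}, catalogue))
```

Entries with no tags in common are dropped rather than ranked last, so an unrelated Research Object never shows up as a recommendation.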

13 Changing the way astrophysicists work
Impact in Astrophysics
» AstroTaverna plugin
› Connection with Virtual Observatory
› Services for managing and visualizing VOTable
› Astronomical utilities: coordinate transforms
Building complex commissioning tools from existing ones

14 Questions
This talk was an example of R-use!

15 Conclusions
Matching talk topics with the session topics
Dave de Roure (Oxford e-Research Centre)
Best practices for everything… and gRRRowing