Bringing visibility to food security data results: harvests of PRAGMA and RDA
Quan (Gabriel) Zhou, Venice Juanillas, Ramil Mauleon, Jason Haga, Inna Kouper, Beth Plale

Presentation transcript:

Bringing visibility to food security data results: harvests of PRAGMA and RDA
Quan (Gabriel) Zhou, Venice Juanillas, Ramil Mauleon, Jason Haga, Inna Kouper, Beth Plale
9/16/2016

Bringing visibility to food security data results: harvests of PRAGMA and RDA
Co-PIs: Beth Plale, Indiana University, USA; Jason Haga, AIST, Japan
[Diagram labels: PRAGMA Data Service; Genomics compute VMs; Rice genome variant discovery; Persistent ID Types (PIT); Data Type Registry (DTR)]
Launch the use of two RDA products in Asia by utilizing the PRAGMA community and tools to work with rice researchers in the Philippines, and implement software services at AIST (Japan) using the RDA outputs of the PID Information Types and Data Type Registries Working Groups.
Goals:
• Seek an agreeable PID attribute type profile to harvest data objects from varied scientific domains for wider adoption
• Implement RDA PIT and DTR recommendations to support data citation of rice genome data objects
• The developed software will additionally be installed at the National Data Service to stimulate adoption in the US

PRAGMA: A Community of Practice Enabling the Long Tail of Team Science
PRAGMA Members and Affiliates
• Founded in 2002, NSF funded
• Framework for collaboration: people drive activities
• Marketplace of ideas: trusted environment to share
• Nurturing environment: support students and participants to learn and share resources

Our Motivation
• Use the RDA PIT/DTR model to improve citation and sharing of scientific data objects by embedding minimum metadata in the persistent identifier
• Provide a framework with both a repository and a PID service to give long-term access and findability to heterogeneous data objects across scientific boundaries
• Propose a methodology to automatically harvest data objects from scientific workflows and improve reproducibility of workflow execution
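The first motivation point, embedding minimum metadata in the persistent identifier, can be pictured as a PID record that carries a small fixed profile of typed attributes. The sketch below is illustrative only: the attribute names in REQUIRED_TYPES, the PidRecord class, and the mint_pid function are hypothetical and are not the registered RDA PIT types or the PRAGMA implementation.

```python
from dataclasses import dataclass, field
import uuid

# Hypothetical minimum-metadata profile. The names are placeholders chosen
# for illustration; a real deployment would use types from a Data Type Registry.
REQUIRED_TYPES = ("creator", "creation_date", "checksum", "landing_page")


@dataclass(frozen=True)
class PidRecord:
    """A persistent identifier together with its embedded attributes."""
    pid: str
    attributes: dict = field(default_factory=dict)


def mint_pid(prefix: str, attributes: dict) -> PidRecord:
    """Create a PID record, refusing input that lacks a profile attribute."""
    missing = [name for name in REQUIRED_TYPES if name not in attributes]
    if missing:
        raise ValueError(f"missing required attributes: {missing}")
    # A random suffix stands in for whatever the PID service would assign.
    return PidRecord(pid=f"{prefix}/{uuid.uuid4()}", attributes=dict(attributes))
```

Checking the profile at minting time means every identifier issued this way can answer the same basic questions (who, when, integrity, where) without a trip to the repository.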

Motivation for International Rice Research Institute (IRRI)
• Enable collaborators to run genome-wide association studies (GWAS) on their own phenotyping data for the 3000 Rice Genomes using a common analysis framework
• Run the selected workflow and version the analysis to support reproducibility of rice genomic workflows for researchers
• Provide a means to share and cite GWAS analysis results back to IRRI

Data Object (DO) Lifecycle
[Diagram with three components: 1. End-user; 2. Repository Service; 3. PID Service]
Component labels: User; Galaxy Workflow; Galaxy Portal; Data Identity client; PRAGMA Data Repository; PRAGMA Data Service; MongoDB; Data-Identity Server; Data Identity Portal; DTR; PIT
Flow annotations:
• User experimental DO
• DO goes in repository database
• DO gets assigned persistent identifier and landing page
• Reuse DO and reproduce workflow
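The lifecycle above (a DO enters the repository database, receives a persistent identifier and landing page, and is later reused) can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: the Repository and PidService classes, the harvest function, the landing-page URL pattern, and the ordering of deposit before registration are hypothetical, and plain dictionaries replace the MongoDB database and the identifier server named on the slide.

```python
import hashlib
import itertools


class Repository:
    """In-memory stand-in for the repository database."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def deposit(self, content: bytes, workflow: str) -> str:
        """Store a data object and return its repository-local id."""
        object_id = f"do-{next(self._ids)}"
        self._objects[object_id] = {
            "content": content,
            "workflow": workflow,
            "sha256": hashlib.sha256(content).hexdigest(),
        }
        return object_id

    def fetch(self, object_id: str) -> dict:
        return self._objects[object_id]


class PidService:
    """In-memory stand-in for the PID service."""

    def __init__(self, prefix: str):
        self._prefix = prefix
        self._records = {}

    def register(self, object_id: str, landing_page: str) -> str:
        """Assign a PID that points at the object and its landing page."""
        pid = f"{self._prefix}/{object_id}"
        self._records[pid] = {"object_id": object_id, "landing_page": landing_page}
        return pid

    def resolve(self, pid: str) -> dict:
        return self._records[pid]


def harvest(repo: Repository, pids: PidService, content: bytes, workflow: str) -> str:
    """Deposit a workflow output, then register a PID for it (assumed order)."""
    object_id = repo.deposit(content, workflow)
    # Hypothetical landing-page pattern on a placeholder domain.
    landing_page = f"https://repository.example/objects/{object_id}"
    return pids.register(object_id, landing_page)
```

Reuse then amounts to resolving the PID and fetching the object it points to, for example `repo.fetch(pids.resolve(pid)["object_id"])`, which returns the stored content along with the name of the workflow that produced it.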

Demo – Reproducing Rice Genomics Workflow

Success to Date
• The PRAGMA Data Service is a user-transparent means of harvesting DOs from applications and assigning PIDs to scientific outcomes
• Modular architecture, informed by core members of the rice genomics team
• Software is stable; built with default PID information types and metadata (RDA inside!)
• High-impact, multi-disciplinary effort in the Pacific Rim
• Cross-WG interactions in RDA (Rice Data Interoperability)

Future Work
• User interface and hardening over Fall 2016; release in early 2017
• Refine metadata types based on user group study feedback
• Extend the data server (MongoDB) with basic preservation capabilities
• Demonstrate the service on the National Data Service (NDS)
• Explore the use of PRAGMA data services and repository in other domain applications running on and off the PRAGMA Cloud testbed
• Use this experience to inform the PID minimum metadata effort in the Data Fabric IG

QUESTIONS?
• Come find us! Jason Haga, Beth Plale, Ramil Mauleon, Gabriel Zhou
• Poster #6
• Funded in part by: RDA US - MacArthur Foundation; PRAGMA NSF OCI; AIST ICT International Team
• Special thanks to CNRI for hosting the Handle v8 server