Agenda welcome and goals (Peter)


1 Joint DFIG – PWWG Meeting Amy Nurnberger, Larry Lannom, Peter Wittenburg

2 Agenda
Breakout 1: Discussion about Guidelines/Recommendations
Breakout 2: configuration building and Minimal PID Types
Breakout 4: DFIG Core Session
Breakout 5: Joint session with Brokering Group
Breakout 7: Joint meeting with Publishing Data Workflows
14.00 welcome and goals (Peter)
14.05 DFIG view on scientific data workflows (Peter)
14.20 PWWG view on scientific data and publishing workflows (Amy)
14.35 comparison, overlap and differences in views (Larry)
14.45 discussion (Larry and Amy)
15.30 end

3 Intentions and Goals
Comparing the core documents from DFIG and the Publishing Workflow IG shows that there is much overlap despite different starting points.
There are barriers in culture and terminology.
There is some tradition of not talking to each other.
RDA is about bridge building.
This session is about building a bridge and getting together.
We need to understand how we can integrate the approaches, since we address overlapping issues.
How to do this -> discussion

4 DFIG view on scientific data workflows
Peter

5 Lab Reality – slowly changing
EU survey: 75% of researchers’ time spent on DM/A
M. Brodie (MIT): 80%
Something is fundamentally wrong!
Far away from data publication considerations
Are curiosity-driven research and chaos twins?
Is DIS different?
Clear trends for all: data orientation, more and more complex data
Automatic workflows would change this, but there are many exceptions, parameter choices and human interventions
Lack of experts to create flexible software solutions
How can we help and change? Short-term and long-term solutions

6 An illustration
[Diagram labels: Collection X, Collection Y, Collection Z, Pattern Extractor, Feature Sets, Smart Machine, Results, Iterations]

7 An illustration
[Diagram labels: Collection X, Collection Y, Collection Z, Pattern Extractor, Feature Sets, Smart Machine]
Still a lot of handwork and ad hoc scripting involved.
Also many iterations to find the optimal features.
It is not obvious whether evidence will be found.
Such research takes years.
When to register what?
When to refer to what?
When to create metadata for what?
When to publish and cite what?
Which components would improve?

8 Data Fabric Cycle
Observations, experiments, simulations, etc.
This slide indicates the continuous cycle of creating raw data or derived data based on collections of existing data.
Identify components that could improve (stepwise).

9 From abstract fabrics to concrete compositions
Common Components & Services
Specific Components & Services
Closing urgent gaps: t-repositories, PID system, MD schemas, MD editors, vocabularies, etc.
Global Digital Object Cloud

10 From abstract fabrics to concrete compositions
Common Components & Services
Specific Components & Services
Closing urgent gaps: t-repositories, PID system, MD schemas, MD editors, vocabularies, etc.
Global Digital Object Cloud
Of course it would be useful to consider publication requirements while building these compositions.

11 Conclusions
Collecting use cases and facts from many labs.
Understand from heterogeneous practices how to arrive at agreed components.
Addressing the data cycle in the labs, where publication is often not an issue for quite some time.
However, the requirements for data management, accessibility and publication are getting tighter.
So we need to consider these requirements and map them to publication requirements.
Need to provide easy transitions, and thus bridge conceptualisation and terminology.
Need to overcome social barriers.

12 RDA/WDS Data Publishing Workflows WG + Amy Nurnberger

13 DPWWG – Where we’ve been
What is the current data publishing workflow landscape across disciplines and institutions?

14 Data publishing entities
25 data publishing entities assessed in terms of discipline, function, data formats, and roles:
The assignment of persistent identifiers (PIDs) to datasets, and the PID type used (e.g. DOI, ARK, etc.)
Peer review of data (e.g. by researcher and by editorial review)
Curatorial review of metadata (e.g. by institutional or subject repository)
Technical review and checks (e.g. for data integrity at repository/data centre on ingest)
Discoverability: was there indexing of the data, and if so, where?
Links to additional data products (data paper; review; other journal articles) or “stand-alone” product
Links to grant information, where relevant, and usage of author PIDs
Facilitation of data citation
Reference to a data life cycle model
Standards compliance

15 Key components of data publishing
Austin, C. C., Bloom, T., Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., … Whyte, A. (2016). Key components of data publishing: using current best practices to develop a reference model for data publishing.

16 Workflows Ibid

17 Workflows, cont. Ibid

18 + What’s missing?

19 What’s missing?
This stuff

20 What’s missing?
This stuff
“…early interactions between researchers and a suitable data repository (or repositories), while data is processed and prepared for sharing.”
Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., Raymond, L., Whyte, A. (DRAFT). Connecting data publication to the research workflow: a preliminary analysis.

21 What’s missing?
Deliberate integration of sundry products from the research process, e.g., software, code, models, etc.
Integration/interoperability between data processing tools and platforms
Disciplinary differences in data conception, collection, & processing
Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., Raymond, L., Whyte, A. (DRAFT). Connecting data publication to the research workflow: a preliminary analysis.

22 What’s needed
Small, modular, shareable components that help ensure platforms offer sufficient flexibility to support variety
Research workflow solutions that enable straightforward data and metadata generation in accordance with community defined and accepted standards
Commit to the use of PIDs and include versioning capabilities
Clear documentation that can offer direct benefits to repository depositors and users
Curators
Dallmeier-Tiessen, S., Khodiyar, V., Murphy, F., Nurnberger, A., Raymond, L., Whyte, A. (DRAFT). Connecting data publication to the research workflow: a preliminary analysis.
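
As a rough illustration of the "PIDs plus versioning" point on slide 22, the following is a minimal sketch only, not something taken from the slides or from any real PID service. The registry class, the identifier format and the prefix "example-prefix" are all hypothetical; a real PID system would be an external, persistent service. It shows one possible way each new version of a dataset could get its own identifier while staying linked to the version it supersedes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class VersionEntry:
    pid: str                 # hypothetical identifier string, not a real PID
    version: int             # 1-based version counter for this dataset
    checksum: str            # fixity information for this version
    previous: Optional[str]  # PID of the superseded version, if any


@dataclass
class VersionedRegistry:
    """Toy in-memory registry (assumption: stands in for a real PID service)."""
    prefix: str
    _entries: Dict[str, VersionEntry] = field(default_factory=dict)
    _chains: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, name: str, checksum: str) -> VersionEntry:
        # Every call mints a new identifier; earlier versions stay resolvable.
        chain = self._chains.setdefault(name, [])
        version = len(chain) + 1
        pid = f"{self.prefix}/{name}.v{version}"
        entry = VersionEntry(pid, version, checksum, chain[-1] if chain else None)
        self._entries[pid] = entry
        chain.append(pid)
        return entry

    def resolve(self, pid: str) -> VersionEntry:
        return self._entries[pid]

    def latest(self, name: str) -> VersionEntry:
        return self._entries[self._chains[name][-1]]


# Usage: two iterations over the same (hypothetical) feature set.
registry = VersionedRegistry(prefix="example-prefix")
v1 = registry.register("feature-set-x", checksum="aaa")
v2 = registry.register("feature-set-x", checksum="bbb")
```

In this sketch a citation made against `v1.pid` keeps resolving to the first version even after `v2` is registered, which is one way to read the versioning requirement in a lab setting with many iterations.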

