PaNdata Europe Midpoint workshop 8-10 February 2011 Soleil, Paris PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories.

1 PaNdata Europe Midpoint workshop 8-10 February 2011 Soleil, Paris PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories

2 Agenda 9:00-9:30 Recap of status of project [Juan] Session 1 - Workpackage 5, Data – 9/2/2011, 9:30 - 12:00 Session 2 - Workpackage 6, Software – 9/2/2011 13:00 - 17:00 Session 3 - Workpackage 4, Users – 10/2/2011, 9:00 - 12:30 Session 4 - General Meeting – 10/2/2011, 13:30 - 16:00

3 Recap of status of project Recap of Support Action PaNdata Europe Recap of I3 proposal PaNdata ODI Review agenda for this meeting

4 PaNdata Europe - Objectives Objective 1 – Collaboration …. Objective 2 – Policy. To agree between partners on the elements of a general, standard data policy framework and to establish and maintain individual data policies in accordance with this standard. Objective 3 – Knowledge exchange and dissemination… Objective 4 – Users. To foster interoperability of user information across the participating facilities and the wider research community. Objective 5 – Data (including Formats and Metadata). To foster interoperability of data formats and metadata schemas across the participating facilities and the wider research community. Objective 6 – Software. To determine how to develop, deploy, operate and evaluate a common registry of data analysis software and, where appropriate, the necessary format converters, so that data from different sources can, in the future, be treated with a variety of data analysis software. Objective 7 – Integration and cross-linking of outputs. To foster the integration of the whole science lifecycle, focusing on linking of publications and data, interaction between institutional repositories of publications, packaging for long-term preservation, and services for search and reuse.

5 Overview

6 PaN-data Europe Timeline PaN-data Europe runs from June 2010 until December 2011 with workshops in Spring and Autumn 2011. PaN-data Europe – building a sustainable data infrastructure for Neutron and Photon laboratories

7 5 Standardisation Activities The common data policy framework work package aims to agree between partners on the elements of a standard data policy framework and to establish and maintain individual data policies in accordance with this standard. It is a basis for the work packages dedicated to individual strands, listed below, and its phased timing corresponds to those work packages. The common user information exchange work package will underpin Virtual Organisation Management across the participants. This work package will build upon existing technology developed elsewhere and thus begins from a mature basis. It will consist primarily of proposing adaptations of these technologies to the current environment. The scientific data work package is slightly different in nature as it is largely centred on the common data formats that will enable the integration of the Data Catalogue and Software Services. These formats are already well understood and accepted, so there is no need for a review phase. The data standards will enable the sharing of data across the participating facilities by providing integrated searching across the associated metadata. The data analysis software infrastructure work package will enable best use of the available software by allowing the most appropriate software to be used independently of where the data is collected. This will require interaction with external parties such as software developers, and so the first workshop takes place only a short time after the commencement of this work package, to allow their input to be obtained. The integration and cross-linking of outputs work package is concerned with closing the lifecycle of research, by incorporating the publications that are the end result, and are held in repositories. It also focuses on long-term preservation of the data and other outputs. 
These aspects are linked because publications can provide what is called Representation Information (in the terminology of the OAIS standard) to assist the continued correct interpretation and use of data into the future.

8 Deliverables... D2.1 : Common policy framework on scientific data (M4) D2.2 : Common policy framework on analysis software (M8) D2.3 : Common policy framework on user data (M12) D2.4 : Common integrated policy framework (M16)... D4.1 : Proposal for authentication system enabling shared Virtual Organisation Management (M8) D4.2 : User information workshop report (M10) D4.3 : Revised specification of common authentication system (M12) D5.1. Proposal for data format standards (M8) D5.2. Data standards workshop report (M10) D5.3. Revised specification of data standards (M12) D6.1: Report on current software registries and data analysis software (M8). D6.2: Workshop report on standards and methods for sharing software (M10). D6.3: Draft proposal on strategy for data analysis software infrastructure (M16). D6.4 : Final proposal on standards and methods for sharing software (M18) D7.1: Report on survey of publication repositories, cross-linking and long-term preservation (M12). D7.2: Proposal for integration of practices (M16). D7.3 : Final report on standards for publication repositories, cross-linking and long-term preservation (M18)

9 WP2 Development of standards for a common data policy framework Objectives To agree between the partners on the elements of a general, standard, data policy framework and to establish, promote, and maintain individual data policies in accordance with this standard. This work package is the basis for the work packages devoted to the individual strands; it sets the requirements and principles within which they operate. Methodology: Survey existing relevant policies at the partner facilities and correlate them with guidelines emerging from national and international bodies. Abstract from these a common set of generic policy elements and refine and approve existing policies against this framework. Undertake a common foresight activity to inform evolution of policy in the light of technical and regulatory developments. Work towards convergence of policies in the longer term as experience of what constitutes best practice emerges. Liaise with other parties where such policy frameworks already exist to promote best practice in data management and exploitation. The policies will influence and be influenced by the corresponding Work Packages devoted to the development of standards and practices, and the timings are chosen to match the milestones of those Work Packages. Task 2.1 : Development of common policy framework for scientific data (M1-M4) Task 2.2 : Development of common policy framework for analysis software (M1-M8) Task 2.3 : Development of common policy framework for user data (M8-M12) Task 2.4 : Development of integrated common policy framework for data (M12-M16) Deliverables D2.1 : Common policy framework on scientific data (M4) D2.2 : Common policy framework on analysis software (M8) D2.3 : Common policy framework on user data (M12) D2.4 : Common integrated policy framework (M16)

10 WP4 Development of standards for common user information exchange Objectives To foster interoperability of user information across the participating facilities and the wider research community. To develop standards enabling a shared Virtual Organisation Management and common processes across the participating facilities. Methodology The ultimate objective is the implementation of a system to allow scientific users to access data files across the physically distributed repositories. A typical use case would be a user having performed experiments at several facilities who needs to perform the same data analysis on all data sets. This process involves the use of remote computing resources and software packages, which implies a system whereby a user logged in at a local site can be automatically authenticated and authorised (AAA) to use remote facilities. This additional level of AAA should be as transparent as possible to the user. Data protection laws in each country enormously complicate the sharing of user information between organisations. Consequently the AAA must function with the transfer of the very minimum of information, possibly only the user’s name and/or email and the trust information. A corollary is that AAA is not involved in implementing user databases at each site but rather in providing a mechanism of interfacing with existing applications to make available the trust information in a consistent and coordinated manner across the facilities. Task 4.1: Review existing authentication solutions with special emphasis on the IRUVX / ESRFUP prototype solution. Propose prototype authentication system in view of the needs of the full neutron and photon community (M1-M8). Task 4.2: Workshop with facility authentication experts; plan the adoption strategy for the full-community authentication system (M9). Task 4.3: Revise the proposal in the light of the workshop findings, and determine the next steps (non web-based applications, GRID-related issues) (M8-M12). (Note: the final workshop to disseminate the results of the work package takes place in WP3) Deliverables D4.1 : Proposal for authentication system enabling shared Virtual Organisation Management (M8) D4.2 : User information workshop report (M10) D4.3 : Revised specification of common authentication system (M12)
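
The WP4 requirement that AAA should work with "the very minimum of information" can be illustrated with a small sketch. It is not the IRUVX / ESRFUP prototype and not a proposal from the slides: it assumes, purely for illustration, that two sites share a secret key and that the home site signs an assertion carrying only the user's email and the issuing site. The function names, the field names and the use of HMAC-SHA256 are all assumptions.

```python
import hashlib
import hmac
import json


def issue_assertion(email, home_site, shared_key):
    """Home site: sign an assertion holding only the email and the issuer."""
    # Only the minimum of user information leaves the home site (assumed fields).
    payload = json.dumps({"email": email, "home_site": home_site}, sort_keys=True)
    signature = hmac.new(shared_key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_assertion(assertion, shared_key):
    """Remote site: return the asserted identity, or None if the signature fails."""
    expected = hmac.new(
        shared_key, assertion["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, assertion["signature"]):
        return None
    return json.loads(assertion["payload"])


# Hypothetical key and identity, for illustration only.
KEY = b"hypothetical-shared-secret"
token = issue_assertion("user@example.org", "facility-a", KEY)
identity = verify_assertion(token, KEY)
```

In this sketch the remote site never queries the home site's user database; it only checks the trust information attached to the assertion, which mirrors the corollary stated on the slide. A real deployment would more likely rely on an established federation protocol than on a shared secret.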

11 WP5 Development of standards for scientific data Objectives To foster interoperability of data formats and metadata schemas across the participating facilities and the wider research community. Methodology: Today all participating facilities use their own data file formats, which is a great obstacle for file access as input file readers have to be provided for each format. A shared infrastructure, involving databases and software, effectively imposes a common data format, which requires some agreement on the data to store and the format itself. This work package, through fact finding, monitoring and strategy development, will define a common data format, based on the NeXus international standard. In order to make raw and processed data accessible to scientists it is essential to be able to search databases by their metadata, which refers to the data describing the stored data, e.g. experiment name, date, facility where the data was taken, energy range of the data, type of technique, sample type and name, etc. The metadata with a link to the raw or processed data file will be made available via a metadata catalogue. This workpackage, through fact finding, monitoring and strategy development, will determine the metadata to be included in databases. Task 5.1: Evaluate existing data format standards and propose a coherent set covering the format requirements across the facilities and in the user community, prepare workshop (M4-M8). Task 5.2. Workshop to agree on this minimum set; include decision makers from users, facilities, software developers (M9). Task 5.3. Revise the data format standards in the light of the workshop findings (M8-M12). (Note: the final workshop to disseminate the results of the work package takes place in WP3) Deliverables D5.1. Proposal for data format standards (M8) D5.2. Data standards workshop report (M10) D5.3. Revised specification of data standards (M12)
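
As a rough illustration of the metadata catalogue described in WP5, the sketch below models one catalogue entry with the example fields named on the slide (experiment name, date, facility, energy range, technique, sample type and name) plus a link to the data file, and a simple exact-match search. The class name, field names, units and sample entries are hypothetical; the slides do not define a schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class MetadataRecord:
    """One catalogue entry: descriptive metadata plus a link to the data file."""
    experiment_name: str
    experiment_date: date
    facility: str
    energy_range_kev: Tuple[float, float]  # assumed unit, for illustration
    technique: str
    sample_type: str
    sample_name: str
    data_file_url: str


def search(catalogue, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        record
        for record in catalogue
        if all(getattr(record, field) == value for field, value in criteria.items())
    ]


# Hypothetical entries from two facilities, for illustration only.
CATALOGUE = [
    MetadataRecord("exp-001", date(2011, 1, 15), "Facility A", (8.0, 12.0),
                   "SAXS", "polymer", "sample-1", "https://example.org/a/exp-001.nxs"),
    MetadataRecord("exp-002", date(2011, 1, 20), "Facility B", (8.0, 12.0),
                   "SAXS", "protein", "sample-2", "https://example.org/b/exp-002.nxs"),
    MetadataRecord("exp-003", date(2011, 2, 1), "Facility A", (0.01, 0.1),
                   "neutron diffraction", "alloy", "sample-3", "https://example.org/a/exp-003.nxs"),
]

saxs_everywhere = search(CATALOGUE, technique="SAXS")
saxs_at_a = search(CATALOGUE, technique="SAXS", facility="Facility A")
```

Searching on `technique` alone returns records from both facilities, which is the cross-facility search the work package aims at; the data files themselves stay where they are and are reached through the stored link.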

12 WP6 Strategy for data analysis software infrastructure Objectives To determine how to develop, deploy, operate and evaluate a common registry of data analysis software and, where appropriate, the necessary format converters, so that data from different sources can, in the future, be treated with a variety of data analysis software. Methodology Data analysis (software) is a key link in the chain of events that transforms original ideas into conclusive scientific output. This WP, by fostering a common software resource, will ultimately enable the most appropriate software to be used independently of where the data is collected. A model for this type of activity is the “Collaborative Computational Projects” in the UK (see www.ccp.ac.uk). The approach of this WP is therefore to help define a common software resource that will: 1. simplify and streamline for facility users the conversion of raw data into high quality scientific data for publication, 2. accelerate the deployment and use of new data analysis methods which will open doors to new science across the facilities and the user community, 3. enhance and optimise the scientific output of the facilities, i.e. better value for money. Task 6.1: Review existing registries for data analysis software. Catalogue the data analysis software in use across the facilities and in the user community (M4-M8). Task 6.2: Workshop to agree position on data analysis software infrastructure, including providers of this software to define standards/rules for sharing, versioning, tracing software (M9). Task 6.3 : Analyse findings of workshop and propose strategy on software sharing (M8-M16) Task 6.4 : Revise proposal strategy for data analysis software sharing (M17-M18) (Note: the final workshop to disseminate the results of the work package takes place in WP3) Deliverables D6.1: Report on current software registries and data analysis software (M8). D6.2: Workshop report on standards and methods for sharing software (M10). 
D6.3: Draft proposal on strategy for data analysis software infrastructure (M16). D6.4 : Final proposal on standards and methods for sharing software (M18)
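
The idea behind the WP6 objective, a registry of analysis software combined with format converters, can be sketched as a lookup: given the format of a data set, list the registered software that can read it directly or after one conversion. The entry fields, the software names and the format names below are hypothetical and are not taken from the prototype database presented at the workshop.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SoftwareEntry:
    """One registry entry: a software package and the formats it can read."""
    name: str
    version: str
    input_formats: FrozenSet[str]


def usable_software(registry, converters, data_format):
    """List (software name, conversion target) pairs for a given data format.

    The conversion target is None when the software reads the format directly.
    `converters` is a list of (source format, target format) pairs.
    """
    results = []
    for entry in registry:
        if data_format in entry.input_formats:
            results.append((entry.name, None))
            continue
        for source, target in converters:
            if source == data_format and target in entry.input_formats:
                results.append((entry.name, target))
                break
    return results


# Hypothetical registry content, for illustration only.
REGISTRY = [
    SoftwareEntry("tool-a", "1.0", frozenset({"nexus"})),
    SoftwareEntry("tool-b", "2.3", frozenset({"legacy-ascii"})),
]
CONVERTERS = [("facility-raw", "nexus")]

options = usable_software(REGISTRY, CONVERTERS, "facility-raw")
```

Here data in the hypothetical `facility-raw` format becomes usable with `tool-a` through a converter to `nexus`, which is the sense in which the registry lets software be chosen independently of where the data was collected.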

13 WP7 Development of standards for integration and cross-linking of outputs Objectives To foster the integration of the whole science lifecycle, focussing on linking of publications and data, interaction between institutional repositories of publications, packaging for long-term preservation, and services for search and reuse. Methodology: Publications repositories complete the lifecycle of innovation. Linking to Users, Data and Software enables traceability of published results through the scientific process. Sharing of the final results provides a foundation for the next cycle of science, and packaging enables long-term preservation of the outputs of research. Association of data with the publications resulting from it is a basis for preservation through Representation Information, a term from the OAIS standard (Open Archival Information System) meaning information necessary to ensure continued understandability and usability of a digital resource. Furthermore, this is also a basis for reuse of data across diverse communities, since the supplementary information needed for continued understandability is also valuable for transfer across communities. The European Support Action PARSE.Insight (of which STFC is WPL) is producing a roadmap for digital preservation in Europe, informed by a large-scale survey of attitudes and practices in a wide range of scientific disciplines. The roadmap includes components such as tools for creation of Representation Information, and will be taken into account in the project work. Task 7.1: Review existing provision for publication repositories, citation recording and long-term preservation in use across the facilities and in the user community, including facility libraries. (M8-M12) Task 7.2: Propose strategy on integration of practices across the community (M12-M16). Task 7.3: Develop final proposal on integration of practices across the community (M17-M18). 
(Note: the final workshop to disseminate the results of the work package takes place in WP3) Deliverables D7.1: Report on survey of publication repositories, cross-linking and long-term preservation (M12). D7.2: Proposal for integration of practices (M16). D7.3 : Final report on standards for publication repositories, cross-linking and long-term preservation (M18)

14 Agenda for this workshop Data formats (Mark K) Software (Mark J) Users (Bill) Coordination (Simon) Dissemination (Frank)

15 Agenda Session 1 - Workpackage 5, Data 9/2/2011, 9:15 - 12:30 From WP description: Task 5.1: Evaluate existing data format standards and propose a coherent set covering the format requirements across the facilities and in the user community, prepare workshop (M4-M8) Task 5.2: Workshop to agree on this minimum set; include decision makers from users, facilities, software developers (M9). Agenda for Workshop Available data formats Recent developments – Status of the CommonDataModel project: a unified software layer to access data from the analysis point of view [Alain BUTEAU and Stéphane POIRIER] – Discussion NeXus Overview (Optional) List of experimental methods to standardize – Discussion NeXus Application Definitions – Discussion PaNdata Data Format Work Planning

16 Agenda Session 2 - Workpackage 6, Software 9/2/2011 13:30 - 17:00 From WP description: Task 2.2 : Development of common policy framework for analysis software (M1-M8) Task 6.1: Review existing registries for data analysis software. Catalogue the data analysis software in use across the facilities and in the user community (M4-M8) Task 6.2: Workshop to agree position on data analysis software infrastructure, including providers of this software to define standards/rules for sharing, versioning, tracing software (M9). Agenda for Workshop Current status of information gathered on data analysis software [Mark Johnson and Jamie Hall] – Discussion Proposed database model Demonstration of prototype database – Discussion

17 Agenda Session 3 - Workpackage 4, Users 10/2/2011, 9:00 - 12:30 From WP description: Task 4.1: Review existing authentication solutions with special emphasis on the IRUVX / ESRFUP prototype solution. Propose prototype authentication system in view of the needs of the full neutron and photon community (M1-M8). Task 4.2: Workshop with facility authentication experts; plan the adoption strategy for the full-community authentication system (M9). Agenda for Workshop To be defined

18 Agenda Session 4 - General Meeting 10/2/2011, 13:30 - 16:00 Deliverables due in first six months of project Reporting and finances WP3 Knowledge exchange and dissemination Summary of ICAT workshop Planning for next period – Recap of plans from earlier sessions (WPs 4, 5 and 6) – Plans for other work packages (WPs 2 and 7) – Deliverables coming due in next few months Other related activities – PaNdata presentation at ERF – Any others PaNdata ODI hearings Internal communications – Future virtual meetings – Next face-to-face meeting Upload your slides to the wiki! (or send to the list).

