Mirjam van Daalen, (Stephan Egli, Derek Feichtinger) :: Paul Scherrer Institut Status Report PSI PaNDaaS2 meeting Grenoble 6 – 7 July 2016.


1 Mirjam van Daalen, (Stephan Egli, Derek Feichtinger) :: Paul Scherrer Institut
Status Report PSI PaNDaaS2 meeting Grenoble 6 – 7 July 2016

2 Current projects at PSI
Data Analysis Service
Data Policy
Remote access
Metadata catalogue
Petabyte archive
Remote data transfer

3 Covering larger parts of the life cycle

4 Project Overview: Data Analysis Service
SUK Project
Project Managers: Dr. Stephan Egli and Dr. Derek Feichtinger, Paul Scherrer Institut
Partner: ETHZ
Project Duration:
Financing Support (50% matching funds): CHF 1'618'000
Team members: 16 (including 3 new positions financed by the project)
Workpackages (numbers in parentheses give the number of project members involved):
WP1: Common Tools and Services (4)
WP2: Data Analysis Environments for major use cases (3)
WP3: Identity Management, DUO, Authentication and Authorization (3)
WP4: Integration and development of scientific analysis codes (2)
WP5: Procurement, installation, operation of analysis cluster infrastructure (4)
WP6: Infrastructure sharing with other institutions (3)
WP7: Project Management (2)

5 DaaS Project Status
Main purpose: provide an integrated solution for all SLS users to do offline analysis of data taken at SLS (and later SwissFEL).
Cluster of moderate size (~900 cores, 2 PB storage).
Hired 3 people dedicated to this project.
About 50% of the planned hardware is currently installed and in operation.
Now in test phase with internal users: adjusting the system and software to the concrete use cases of these users. Very good feedback so far.
Next phase: add external users.
Planning a storage upgrade to a total of about 3 PB by mid-2017.
Option to extend the cluster with "dedicated" resources (for paying customers), within the same infrastructure and using centrally provided hardware choices.

6 Data Policy Status
Data policy based on the PaNdata framework.
A draft exists.
Embargo period of 3 years, with easy extension to 5 years.
Should be adopted by the PSI directorate in August 2016.
Implementation from the end of August 2016.

7 Remote Access
Use cases: online and offline analysis, remote measurements, shift operation, sharing of sessions for support tasks, sharing of sessions for collaboration.
Support for 3D hardware acceleration.
Access to the beamlines and to the DaaS cluster through a common gateway.
Architecture based on separation of "server" and "node" processes of the NoMachine software, version 5.
Added a graphical management tool to define (time-based) access to beamline resources and the offline compute cluster, with role-based management delegated to the person responsible for each resource.
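
The time-based access rule described on this slide can be illustrated with a minimal Python sketch. It is not the PSI management tool: the `AccessGrant` record, its field names, and the user and resource names are all hypothetical and chosen only to show the idea of a grant that is valid for one user, one resource, and one time window.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical record: one user may use one resource in a time window."""
    user: str
    resource: str      # e.g. a beamline or the offline compute cluster
    start: datetime    # window start (inclusive)
    end: datetime      # window end (exclusive)
    granted_by: str    # the resource responsible who delegated the access


def is_allowed(grants, user, resource, when):
    """Return True if any grant covers this user, resource and point in time."""
    return any(
        g.user == user and g.resource == resource and g.start <= when < g.end
        for g in grants
    )


# Illustrative data only; names and dates are made up for the example.
grants = [
    AccessGrant(
        user="alice",
        resource="beamline-x",
        start=datetime(2016, 7, 6, 8, 0),
        end=datetime(2016, 7, 7, 8, 0),
        granted_by="beamline-x-responsible",
    )
]
```

A gateway of this kind would call `is_allowed` before opening a remote session; the `granted_by` field is where role-based delegation would be recorded.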

8 Data Catalogue
Currently comparing two possible approaches: the ICAT/TOPCAT combination, and an approach based on NoSQL document databases (MongoDB) that takes advantage of recent developments in middleware (StrongLoop/LoopBack) and component-based graphical user interfaces (Angular2).
Comparison criteria:
Ease of data ingestion.
Potential to integrate into existing IT infrastructures and storage/archive systems.
Flexibility to cope with the many and growing requirements from different facilities, research groups, beamline instruments and experimental methods in the areas of data queries, data display and data analysis.
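
To make the document-database approach concrete, here is a small sketch in plain Python (no MongoDB driver involved). It only illustrates why schema-free documents can ease ingestion: records from different instruments may carry different nested fields. All field names, identifiers and values are invented for the example and do not reflect any actual PSI catalogue schema.

```python
# Two metadata documents with different, instrument-specific sub-fields.
# Every key and value here is an illustrative assumption.
datasets = [
    {
        "pid": "demo/0001",
        "beamline": "TOMCAT",
        "method": "tomography",
        "scan": {"projections": 1801},
    },
    {
        "pid": "demo/0002",
        "beamline": "beamline-b",
        "method": "diffraction",
        "detector": {"name": "demo-detector"},
    },
]


def find(docs, **criteria):
    """Return the documents whose top-level fields equal all given criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


tomcat_sets = find(datasets, beamline="TOMCAT")
```

In a document database the same equality filter would be expressed as a query document; the point of the sketch is only that both records can be stored and queried without first agreeing on a common table layout.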

9 Interactive and Batch Data Analysis
Support for interactive data analysis (e.g. Matlab) on the cluster; nodes can be reserved for interactive work.
Standard batch processing based on Slurm.
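
As a rough illustration of Slurm batch processing, the sketch below generates a minimal batch script in Python. The `#SBATCH` options used (`--job-name`, `--ntasks`, `--time`, `--output`) are standard Slurm options; the job name, the command, and the resource values are hypothetical, and the actual settings on the PSI cluster are not given in the slides.

```python
def make_sbatch_script(job_name, command, ntasks=1, time_limit="01:00:00"):
    """Build the text of a minimal Slurm batch script.

    The defaults are placeholders; real limits depend on the cluster setup.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=slurm-%j.out",  # %j is replaced by the job ID
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical analysis command, used only for illustration.
script = make_sbatch_script("reconstruct", "python reconstruct.py scan_0001", ntasks=4)
```

Writing `script` to a file and passing that file to `sbatch` would queue the job.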

10 Petabyte Archive
PSI must prepare to archive the large amounts of data expected from SLS and SwissFEL over the next decades.
Strategic collaboration of PSI with the Swiss National Supercomputing Center (CSCS) in Lugano to build a petabyte tape archive solution at CSCS.
Project initiated by a PoC within the DaaS project.
Volume increase driven by detector and instrumentation advances.
Planning to leverage IBM Spectrum Scale (GPFS) AFM technology for asynchronous data transfers between the sites.
[Figure: Data growth for high data volume beamlines in 2016]
[Figure: Estimated projections for yearly data production]

11 Remote Data Transfer
Support for rsync/scp and GridFTP (Globus Online).
Also evaluated the Aspera solution from IBM; it could be added, but only if a (paying) customer requests it.
Integration with the long-term archive will create additional requirements.
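
For the rsync option, a transfer could be scripted along the following lines. This is a sketch under assumptions: the host name, paths and user are placeholders, and the slides do not say which rsync options PSI recommends. The flags used here (`-a` for archive mode, `--partial` to keep partially transferred files, `-e ssh` to run over SSH) are standard rsync options.

```python
def rsync_command(source_dir, remote_host, dest_dir, user=None):
    """Build an rsync argument list that copies a directory's contents over SSH."""
    target = f"{remote_host}:{dest_dir}"
    if user:
        target = f"{user}@{target}"
    # The trailing slash makes rsync copy the contents of source_dir
    # rather than creating source_dir itself inside dest_dir.
    return ["rsync", "-a", "--partial", "-e", "ssh",
            source_dir.rstrip("/") + "/", target]


# Placeholder host and paths, for illustration only.
cmd = rsync_command("/data/scan_0001", "transfer.example.org",
                    "/home/alice/scan_0001", user="alice")
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; building it as a list rather than a shell string avoids quoting problems with unusual file names.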

12 Possible Collaborations
Remote access.
Lossy compression: the TOMCAT group at SLS is developing solutions here. Should be driven by the scientific community.
Umbrella federated identity management.

13 We create knowledge – today for tomorrow
My thanks go to Stephan Egli, Derek Feichtinger and Gerd Mann.

