WP18, High-speed data recording. Krzysztof Wrona, European XFEL

CRISP WP18, High-speed data recording. Krzysztof Wrona, European XFEL. CERN, 23 Sep 2013

High-speed Data Recording
Objectives: "High-speed recording of data to permanent storage and archive"; "Optimized and secure access to data using standard protocols".
Partners: DESY, ESRF, ESS, European XFEL, GANIL, ILL, Univ. Cambridge

General status
Milestone MS21 was originally planned for month 24: an architecture document serving as the basis for the implementation of the prototype system. Preparation of the document is ongoing; the current estimate is a delay of 1-2 months.

Proposed architecture
The proposed architecture consists of multiple layers. Actual implementations may vary between facilities due to specific requirements and restrictions: additional layers can be added, and some layers may be skipped.

Proposed architecture
1. Detectors or dedicated electronic devices
2. Real-time data processing, data aggregation from different sources, and formatting
3. Local buffer as temporary data storage
4. Central storage system
5. Data archive
6. Data and experiment monitoring
7. Data pre-processing
8. Data analysis, data export services
9. Online shared scratch disk space
10. Offline shared scratch disk space
[Architecture diagram: layers 1-10 (detectors/electronics, PC layer, online data cache, central storage, archive, monitoring cluster, online and offline computing clusters, online and offline shared disk space) connected by data flows labelled A-K]
Requires further discussion between WP18 partners

Proposed architecture
A. Data is sent from the detectors (1) and received on the PC layer (2). Data is sent over a high-speed network, i.e. using multiple 10GE links. Received data are then aggregated, processed, and formatted on the PC layer. At this stage, data processing may alter the data content before the data becomes persistent.

10GE data transfer
10GE network transfer for data acquired by detectors. The UDP and TCP protocols have been successfully implemented and tested for high-throughput parallel data streams. The HTTP protocol has been investigated for commercial devices that dump data to files and where interoperability between different operating systems is needed.
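
For illustration only, the sketch below shows a minimal UDP receiver of the kind used for high-throughput parallel streams; the port number and buffer sizes are hypothetical, and this is not the XFEL implementation.

```cpp
// Minimal sketch (illustrative, not the XFEL code): receive one UDP data
// stream from a detector over a 10GE link. Port and buffer sizes are assumed.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    // Enlarge the kernel receive buffer so detector bursts are not dropped
    // (the value is clamped by net.core.rmem_max, which may need raising).
    int rcvbuf = 64 * 1024 * 1024;                    // 64 MiB, illustrative
    setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(32000);                     // hypothetical detector port
    if (bind(sock, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        perror("bind"); return 1;
    }

    std::vector<char> packet(9000);                   // jumbo-frame sized datagrams
    uint64_t bytes = 0;
    for (;;) {
        ssize_t n = recv(sock, packet.data(), packet.size(), 0);
        if (n <= 0) break;
        bytes += static_cast<uint64_t>(n);            // hand packet to aggregation here
    }
    std::printf("received %llu bytes\n", static_cast<unsigned long long>(bytes));
    close(sock);
    return 0;
}
```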

Online data processing
Design of the data processing layer at European XFEL:
Data receivers: read train data and store it in a local memory buffer; UDP for big train data (2D detectors), TCP for slow and small data.
Processing pipeline: users' algorithms perform data monitoring, rejection, and analysis.
Aggregator, formatter, writer: filter data and merge results, format data into HDF5 files, send files to the data cache.
[Pipeline diagram: fast and slow data enter the data receivers, which fill a shared-memory buffer; a scheduler synchronizes the pipeline stages (Analysis 1:N, Check 1:10, Rejection 1:1, Monitoring); the data aggregator & formatter and network writer produce HDF5 files, possibly via multicast, for the online data cache & scientific computing; primary and secondary processes are distinguished]
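
Since the aggregator/formatter stage writes HDF5 files, a minimal sketch using the HDF5 C API is given below; the file name, dataset path, and image shape are hypothetical and do not represent the XFEL file format.

```cpp
// Minimal sketch, assuming the HDF5 C API (link with -lhdf5): write one
// aggregated block of detector data into an HDF5 file. Names and the
// dataset shape are illustrative, not the XFEL schema.
#include <hdf5.h>
#include <cstdint>
#include <vector>

int main() {
    const hsize_t dims[2] = {512, 1024};              // hypothetical image shape
    std::vector<uint16_t> image(dims[0] * dims[1], 0);

    hid_t file  = H5Fcreate("train_000001.h5", H5F_ACC_TRUNC,
                            H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, nullptr);
    hid_t dset  = H5Dcreate2(file, "/detector/image", H5T_NATIVE_UINT16,
                             space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    // In the real pipeline the buffer would come from the shared-memory
    // segment filled by the data receivers.
    H5Dwrite(dset, H5T_NATIVE_UINT16, H5S_ALL, H5S_ALL, H5P_DEFAULT,
             image.data());

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
```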

Online data processing
PC-layer node software is divided into a primary process and one or more secondary processes.
Primary process: performs critical tasks such as data receiving, storing, and scheduling; requires super-user mode.
Secondary processes: run users' algorithms (pipeline); run in normal user mode.
Data exchange is done through inter-process shared memory.
Scheduler: monitors tasks and data status; coordinates thread activities.
[Pipeline diagram as on the previous slide, with multicast marked as an open question]
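
As a rough illustration of data exchange through inter-process shared memory, the sketch below uses POSIX shared memory; the segment name and size are hypothetical, and this is not the actual PC-layer implementation.

```cpp
// Minimal sketch, assuming POSIX shared memory: the primary process creates
// a segment that secondary (user-algorithm) processes map for reading.
// Segment name and size are illustrative.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main() {
    const char*  name = "/daq_train_buffer";          // hypothetical segment name
    const size_t size = 256UL * 1024 * 1024;          // 256 MiB, illustrative

    // Primary process: create and size the segment (needs suitable privileges).
    int fd = shm_open(name, O_CREAT | O_RDWR, 0660);
    if (fd < 0 || ftruncate(fd, size) != 0) { perror("shm"); return 1; }

    void* buf = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }
    std::memset(buf, 0, size);                        // receivers would copy train data here

    // A secondary process would shm_open() the same name with O_RDONLY and
    // mmap(..., PROT_READ, ...) to run user algorithms on the data in place.
    munmap(buf, size);
    close(fd);
    shm_unlink(name);                                 // primary removes the segment on shutdown
    return 0;
}
```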

Proposed architecture
B. Store data in the local buffer. From the PC layer (2), data is sent to the online data cache (3). If the PC layer (2) is not implemented, data is sent directly from the detectors (1) to the data cache (3). The capacity of the online cache should be sufficient to allow data storage for up to several days.

Online data cache
Sending single-channel formatted data through a 10GE interface using TCP. Storage performance: cached vs. direct IO.
Tests with two types of storage systems:
- 14 x 900 GB 10 krpm SAS, 2 x 6 Gbps, RAID6
- 12 x 3 TB 7.2 krpm NL-SAS, 2 x 6 Gbps, RAID6
Results: achieved data rate per channel of 1.1 GB/s and 0.97 GB/s, respectively. Direct IO improves performance and stability.
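
The direct-IO approach can be illustrated with a minimal Linux O_DIRECT write sketch; the target path, chunk size, and alignment below are assumptions, not the tested configuration.

```cpp
// Minimal sketch, assuming Linux O_DIRECT (a GNU extension; build with g++
// on Linux): write data while bypassing the page cache, as in the direct-IO
// storage tests. Path, chunk size, and alignment are illustrative.
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

int main() {
    const size_t align = 4096;                        // typical block alignment
    const size_t chunk = 4UL * 1024 * 1024;           // 4 MiB writes, illustrative

    void* buf = nullptr;
    if (posix_memalign(&buf, align, chunk) != 0) return 1;
    std::memset(buf, 0xAB, chunk);                    // dummy payload

    // O_DIRECT requires aligned buffers, offsets, and write sizes.
    int fd = open("/data/cache/testfile", O_CREAT | O_WRONLY | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    for (int i = 0; i < 256; ++i) {                   // ~1 GiB in total
        if (write(fd, buf, chunk) != static_cast<ssize_t>(chunk)) {
            perror("write"); break;
        }
    }
    close(fd);
    free(buf);
    return 0;
}
```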

Proposed architecture
C. Send data required for online data monitoring. Results of real-time data processing are sent to the monitoring cluster. The monitoring system prepares data for visualization (e.g. histograms) and provides slow feedback for the experimentalists.

Proposed architecture
D. Send data for online processing. Multicast can be used to send data from the PC layer to both the online data cache and the processing cluster. A subset of the data is sent to the online computing cluster, where additional algorithms can be run without strict control of execution time. Data received on the online cluster can be stored on the shared disk space and inspected with the standard tools used by the experimentalists.
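
As an illustration of multicast distribution, the sketch below shows an online-cluster node joining an IP multicast group over UDP; the group address and port are hypothetical, and the real system may distribute data differently.

```cpp
// Minimal sketch, assuming IP multicast over UDP: an online-cluster node
// subscribes to the group used by the PC layer to distribute a data subset.
// Group address and port are illustrative.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    int reuse = 1;                                     // allow several local subscribers
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(33000);                      // hypothetical port
    if (bind(sock, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        perror("bind"); return 1;
    }

    // Join the multicast group; every subscribed node receives the same stream.
    ip_mreq mreq{};
    mreq.imr_multiaddr.s_addr = inet_addr("239.1.1.1"); // hypothetical group
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    std::vector<char> packet(9000);
    recv(sock, packet.data(), packet.size(), 0);       // process or store the subset here
    close(sock);
    return 0;
}
```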

Proposed architecture
E. Data are pre-processed before being stored in the central storage system (4). Additional filtering, data reduction, and merging may be performed before the dataset is registered and stored in the central storage system. The results of the pre-processing determine whether the data are useful for further analysis or should be discarded.

Proposed architecture
F. Output data from the pre-processing are stored in the online data cache. These data can be stored in addition to, or instead of, the original raw data.

Proposed architecture
G. Send data to central data storage. The entire dataset, or selected good-quality data, is stored in the central system. At this point all datasets are registered in the metadata catalogue.

Data storage systems
At DESY and XFEL: testing the dCache system as the candidate for the central data storage. dCache presents data distributed over multiple servers as a single namespace; data are accessible using the pNFS 4.1 protocol; dCache implements access control lists according to the NFSv4 specification; dCache manages tape data archiving/restoring.
At SKA: initial tests of the Lustre distributed filesystem.

Proposed architecture
H. Archive data. Data received in the central storage system need to be secured for long-term storage. Implementation of a long-term data archive is recommended.

Proposed architecture
I. Read data required for offline analysis. Access to data needs to be protected according to the adopted data access policy. Data should be readable using standard protocols, e.g. pNFS 4.1. Offline analysis is performed on a cluster of computing nodes.
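
As a sketch of offline access through a standard protocol, the example below reads a dataset back from an HDF5 file on a pNFS-mounted central storage path using the HDF5 C API; the mount path and dataset name are hypothetical.

```cpp
// Minimal sketch, assuming the HDF5 C API and a pNFS-mounted central storage
// path (path and dataset name are hypothetical): an offline analysis job
// reads back a dataset written by the online pipeline.
#include <hdf5.h>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    hid_t file = H5Fopen("/pnfs/facility/raw/train_000001.h5",
                         H5F_ACC_RDONLY, H5P_DEFAULT);
    if (file < 0) { std::fprintf(stderr, "cannot open file\n"); return 1; }

    hid_t dset  = H5Dopen2(file, "/detector/image", H5P_DEFAULT);
    hid_t space = H5Dget_space(dset);
    hsize_t dims[2] = {0, 0};
    H5Sget_simple_extent_dims(space, dims, nullptr);   // query the stored shape

    std::vector<uint16_t> image(dims[0] * dims[1]);
    H5Dread(dset, H5T_NATIVE_UINT16, H5S_ALL, H5S_ALL, H5P_DEFAULT,
            image.data());                             // analysis would start here

    H5Sclose(space);
    H5Dclose(dset);
    H5Fclose(file);
    return 0;
}
```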

Proposed architecture
J. Write processed data to the central data storage. Results of user data analysis are stored in the central storage system, archived, and kept according to the data policy adopted at the facility.

Proposed architecture
K. Scratch shared disk space for data processing. Fast, short-term data storage. May be required for accessing intermediate data within an application or between execution runs.

Data storage systems
Performed initial tests of the Fraunhofer filesystem (FhGFS). The plan is to use it as a scratch space for the fast access required by demanding analysis applications.
[Diagram: FhGFS setup at DESY]

Summary
A preliminary architecture design exists; further feedback from WP18 participants is required. The architecture document needs to be prepared soon. Implementation of the prototype system will follow at XFEL, DESY, and possibly at SKA.