CHEP 2000, 7-11 February, 2000
Data Handling in KLOE
I. Sfiligoi, INFN LNF, Frascati, Italy


The KLOE experiment at DAΦNE
- φ-factory
- main goal: CP violation study
- other interesting fields:
  - kaon form factors
  - kaon rare decays
  - radiative φ decays
- K_S → π+π-, K_L → π+π- (CP not conserved)
- K_S → π+π-, K_L → 3π0 → 6γ

KLOE Requirements
- Data acquisition (at full DAΦNE luminosity)
  - 10^11 events per year acquired
  - 50 MB/s sustained throughput
- Computing power
  - ALL the events need to be reconstructed
- Storage requirements
  - one petabyte of raw and reconstructed events
  - hundreds of megabytes of related data (configurations, slow control data, calibration parameters, etc.)

KLOE computing environment
- Based on a set of medium-sized servers
- Connected using commercial switched networks (Fast Ethernet and Gigabit Ethernet)
- Heterogeneous environment, several platforms:
  - IBM AIX on PowerPC
  - Sun Solaris on Sparc
  - Compaq Tru64 Unix on Alpha
  - HP-UX on PA-RISC

KLOE storage pool
- Different policies for different types of data:
  - raw and reconstructed events on tape libraries, with big disk pools for data caching
  - related data managed by a disk based database system
  - analysis output on disk pools

Disk pools
- Four categories of disk pools are present:
  - each data acquisition node in the farm has its own small disk pool
  - computing nodes write their output to centralized, NFS mounted disk pools
  - separate disk pools are used as a cache for the events on tape
  - analysis output is written to its own, central AFS mounted disk pool

Tape library
- Several automated tape libraries supported (at the moment the 5500 slot tape library is partitioned between two tape servers)
- Accessed using commercial software
  - IBM ADSM with the current tape library

KLOE software
- Three distinct categories
  - DAQ (or online): ANSI C
  - reconstruction and analysis (or offline): FORTRAN inside A_C
  - Monte Carlo: FORTRAN
- The interface to the Data Handling System must be compatible with all of them

KLOE Data Handling System
- Composed of four elements:
  - Database System
  - Archiving System
  - Spy System
  - KLOE Integrated Dataflow (KID)

KLOE Data Handling System
- A mix of commercial and custom software
  - the dependency on commercial software is minimized by the layers of custom software
  - commercial software carries on all the vital functions
  - custom software mostly extends and coordinates the functionality of the commercial software

KLOE Data Handling System
- Based on a set of multi-threaded non-privileged daemons and related libraries
- Distributed across several nodes
- Communication by means of TCP/IP sockets on high ports
  - bypasses TCP/IP filtering
  - flexible, programming language and operating system independent
  - no configuration needed on the client side
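
The communication pattern on this slide can be illustrated with a minimal sketch. It assumes a line-oriented text protocol, which the talk does not specify; the `CommandHandler` class and the `PING`/`PONG` exchange are hypothetical. The server uses Python's standard `socketserver` module with one thread per client and binds to an unprivileged port chosen by the operating system.

```python
import socket
import socketserver
import threading


class CommandHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: answers one text command per line."""

    def handle(self):
        for raw in self.rfile:
            command = raw.decode("utf-8").strip()
            reply = "PONG" if command == "PING" else "ERROR unknown command"
            self.wfile.write((reply + "\n").encode("utf-8"))


def start_daemon(host="127.0.0.1", port=0):
    """Start a multi-threaded server; port 0 lets the OS pick a free high port."""
    server = socketserver.ThreadingTCPServer((host, port), CommandHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(address, command):
    """Client side: only the daemon's host and port are needed."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        return conn.makefile("r", encoding="utf-8").readline().strip()
```

Because the client only opens a socket and exchanges text, the same daemon could in principle be reached from C or FORTRAN code through a thin wrapper library.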


Database System
- Two distinct database systems are used
  - offline database system
    - based on HepDB
    - data stored as ZEBRA banks
  - online database system
    - based on a Relational DBMS
    - data are structured in fields
- extended for distributed environments

Online Database System
- data stored in a Relational DBMS
  - IBM DB2 Universal Database at the moment
- communication between the clients (user applications) and the RDBMS through a database daemon
- (Diagram: app, DD, RDBMS)

Database Daemon
- The database daemon is the only link between the applications and the RDBMS
  - if the RDBMS is changed in the future, only the database daemon will need to be changed
- Different kinds of commands are managed by the daemon
  - general SQL commands
  - KLOE specific commands

Database Daemon
- Different kinds of commands are managed by the daemon
  - general SQL commands
    - passed directly to the RDBMS
    - example: select run_nr from run_logger where status = 'OK'
  - KLOE specific commands
    - managed by the daemon itself
    - the RDBMS is used to retrieve and store data needed by the daemon itself
    - example: log that I am starting processing file relative to run 3
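
The two kinds of commands can be sketched as a small dispatcher. This is only an illustration: `sqlite3` stands in for DB2, the `run_logger` table and its query come from the slide, while the `LOG_START` command name, the `processing_log` table and the column layout are hypothetical.

```python
import sqlite3


class DatabaseDaemon:
    """Sketch: the only component that talks to the RDBMS (sqlite3 stands in for DB2)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("create table run_logger (run_nr integer, status text)")
        # Hypothetical bookkeeping table used by the KLOE-specific command.
        self.db.execute("create table processing_log (run_nr integer, state text)")
        # Hypothetical registry of KLOE-specific commands.
        self.kloe_commands = {"LOG_START": self.log_start}

    def execute(self, command, *args):
        """Dispatch a request: KLOE-specific commands first, plain SQL otherwise."""
        if command in self.kloe_commands:
            return self.kloe_commands[command](*args)
        # General SQL command: passed directly to the RDBMS.
        return self.db.execute(command, args).fetchall()

    def log_start(self, run_nr):
        """Log that processing of the file for a run is starting."""
        known = self.db.execute(
            "select 1 from run_logger where run_nr = ?", (run_nr,)
        ).fetchone()
        if known is None:  # an additional check that plain SQL would not enforce
            raise ValueError(f"unknown run {run_nr}")
        self.db.execute(
            "insert into processing_log values (?, 'started')", (run_nr,)
        )
        return "OK"
```

With this layout, replacing the RDBMS would only touch the `DatabaseDaemon` class, since applications never see the database connection.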

Database Daemon
- The use of KLOE specific commands has several advantages
  - additional checks and restrictions are possible
  - data consistency management is centralized
  - fast central caches can be implemented
    - for example, the DAQ configuration cache reduces the typical access time from 4 to 0.1 s
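
A central cache of the kind mentioned above could look like the following sketch. The `ConfigCache` class, its method names and the invalidation rule are assumptions made for illustration; the talk gives only the measured effect of the DAQ configuration cache, not its design.

```python
class ConfigCache:
    """Sketch of a central cache kept inside the daemon."""

    def __init__(self, load_from_rdbms):
        self._load = load_from_rdbms  # slow path: a query to the RDBMS
        self._entries = {}

    def get(self, config_name):
        """Return a configuration, querying the RDBMS only on the first request."""
        if config_name not in self._entries:
            self._entries[config_name] = self._load(config_name)
        return self._entries[config_name]

    def invalidate(self, config_name):
        """Drop an entry, e.g. when the daemon itself updates that configuration."""
        self._entries.pop(config_name, None)
```

Such a cache can stay consistent only if every update goes through the same daemon, which is the situation the slides describe.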

A light version
- The RDBMS is used to ensure flexibility, reliability and performance
- Demanding in terms of computing resources and management effort
  - stand-alone environments often cannot afford it
- An RDBMS-independent version of the database daemon is under development

A light version
- An RDBMS-independent version of the database daemon is under development
  - limited to KLOE specific and the most frequently used SQL commands
  - based on the use of flat files containing a small portion of the data
  - not suitable for a production environment, but enough for home use


KLOE Archiving System
- Expected event data managed by KLOE: 1 PB
- Tape libraries needed
  - data storage and retrieval non-trivial
  - random access to data very inefficient
- Disk-based intermediate buffers used

KLOE Archiving System
- Two types of intermediate buffers
  - DAQ, offline and Monte Carlo output are structured as YBOS files and written on their disk output areas
  - event data needed by offline as input are read from the archiving system disk-cache

KLOE Archiving System
- Data needs to be migrated
  - from output areas to the tape library as soon as possible (taking into account also efficiency concerns)
  - from the tape library to the disk cache when an application needs it (or even better, a bit earlier)
- Migration is totally automated and transparent to the applications

KLOE Archiving System
- The Archiving System is made of four components
  - storage managers (archADSM)
  - disk space managers
    - output areas (spacekeeper)
    - cache areas (filekeeper)
  - archival director (archiver)
  - cache manager (retrieve)
- Communication by means of TCP/IP sockets
- Coordinated by the online database

Storage Managers
- One for each logical tape library
- Allows
  - queries about tape library content
  - file archival
  - file retrieval
- Transaction oriented (if the underlying tape library software supports it)

Storage Managers
- The only link between the tape library and the rest of the system
  - interface independent of the underlying archiving software
  - IBM ADSM is used with the current tape library
  - if another product is used in the future, only a specific storage manager will need to be developed

Disk Space Managers
- One for each disk pool
- Create and delete files
  - unused files get deleted to make space for new ones
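
The deletion of unused files can be sketched as follows. The eviction policy (least recently accessed first) and the `PoolFile` record are assumptions; the talk only states that unused files are deleted to make space for new ones.

```python
from dataclasses import dataclass


@dataclass
class PoolFile:
    name: str
    size: int           # bytes
    last_access: float  # timestamp of the last use
    in_use: bool = False


def make_space(files, capacity, needed):
    """Return the names of files to delete so that `needed` bytes fit in the pool.

    Assumed policy: delete unused files, least recently accessed first.
    """
    free = capacity - sum(f.size for f in files)
    victims = []
    for f in sorted(files, key=lambda f: f.last_access):
        if free >= needed:
            break
        if not f.in_use:
            victims.append(f.name)
            free += f.size
    if free < needed:
        raise RuntimeError("not enough unused files to free the requested space")
    return victims
```

The function only plans the deletions; a real disk space manager would also remove the files and update its bookkeeping.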

Archival Director
- Fully automated
- Works in polling mode
  - from time to time looks for files ready to be archived
  - starts archiving only when enough data is available
- Files are ordered and grouped to minimize the expected retrieve time
- Several groups of files can be archived in parallel
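
One polling pass of such a director might be sketched like this. The grouping criterion (by run number), the size threshold and the `(name, run_nr, size)` record layout are assumptions; the talk does not say how files are ordered and grouped.

```python
from collections import defaultdict


def plan_archival(ready_files, min_bytes):
    """One polling pass: return ordered groups to archive, or [] if too little data.

    `ready_files` is a list of (name, run_nr, size) tuples.
    """
    if sum(size for _, _, size in ready_files) < min_bytes:
        return []  # not enough data yet: wait for the next polling cycle
    groups = defaultdict(list)
    for name, run_nr, _ in ready_files:
        groups[run_nr].append(name)  # assumed criterion: keep a run together
    return [sorted(groups[run_nr]) for run_nr in sorted(groups)]
```

Each returned group is independent of the others, so groups could be handed to separate workers and archived in parallel.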

Cache Manager
- User driven
  - when a file is needed, the application asks the cache manager where it is located
  - a retrieve is performed by the manager if needed
- Several requests can be issued at the same time
  - the manager reorders them internally to minimize the tape mounts
- Communication by means of TCP/IP sockets
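
The reordering step can be illustrated with a short sketch. It assumes the manager knows which tape holds each file (the `tape_of` mapping is hypothetical) and simply groups pending requests by tape; the actual KLOE ordering rules are not described in the talk.

```python
def order_by_tape(requests, tape_of):
    """Reorder file requests so that files on the same tape are read together.

    `tape_of` maps a file name to the label of the tape holding it.
    Tapes are visited in order of their first request; order within a tape is kept.
    """
    by_tape = {}
    for name in requests:
        by_tape.setdefault(tape_of[name], []).append(name)
    return [name for names in by_tape.values() for name in names]


def count_mounts(ordered, tape_of):
    """Number of tape mounts if files are retrieved in the given order."""
    mounts, current = 0, None
    for name in ordered:
        if tape_of[name] != current:
            mounts, current = mounts + 1, tape_of[name]
    return mounts
```

For four requests that alternate between two tapes, serving them in arrival order needs four mounts, while the grouped order needs two.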

KLOE Archival System
(Diagram: the archiver, archADSM, spacekeeper, filekeeper and retrieve daemons, the DB, the tape library and the disk pools, connected by NFS mounts, local file systems and TCP/IP sockets.)


Spy System
- KLOE data acquisition software allows the event data to be read out before they get written to disk
- The mechanism that reads those data is called Spy
- Based on the use of shared memory buffers
  - DAQ processes are piped using this mechanism
  - the spy system reads data from the buffers without interfering with the DAQ
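
The idea of reading from the buffers without interfering with the DAQ can be sketched with a ring buffer whose writer never waits for readers. This is a single-process stand-in, not real shared memory, and the `EventRing` class and its methods are hypothetical; a slow spy simply misses the events that were overwritten.

```python
class EventRing:
    """Sketch: fixed-size ring buffer; the writer never blocks on readers."""

    def __init__(self, slots):
        self._slots = [None] * slots
        self._written = 0  # total number of events written so far

    def write(self, event):
        """DAQ side: overwrite the oldest slot, regardless of any spy."""
        self._slots[self._written % len(self._slots)] = event
        self._written += 1

    def spy(self, last_seen):
        """Spy side: return (events, new_last_seen) without consuming anything.

        Events already overwritten since `last_seen` are lost to this spy.
        """
        first = max(last_seen, self._written - len(self._slots))
        events = [
            self._slots[i % len(self._slots)] for i in range(first, self._written)
        ]
        return events, self._written
```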


KLOE Integrated Dataflow (KID)
- Integration library
  - database accesses and retrieve operations hidden
- Offers a single point of access to all the services
- URI-based selection
  - datarec:(run_nr=5000) and (stream='ksl')
    - read the list from DB, ask the cache manager for the files, pass the events from the files to the application
  - spy:/buffer
    - open a spy channel and pass the events to the application
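
URI-based selection amounts to choosing a data source from the part before the first colon. The sketch below shows that dispatch step only; the `open_kid` function and the `handlers` mapping are hypothetical and do not reproduce the real KID interface.

```python
def open_kid(uri, handlers):
    """Split a KID-style URI into scheme and selection and call the matching handler.

    `handlers` maps a scheme such as "datarec" or "spy" to a callable that
    receives the rest of the URI.
    """
    scheme, sep, selection = uri.partition(":")
    if not sep or scheme not in handlers:
        raise ValueError(f"unsupported URI: {uri!r}")
    return handlers[scheme](selection)
```

An application would then pass a string such as `datarec:(run_nr=5000) and (stream='ksl')` or `spy:/buffer` and receive events the same way in both cases, with the database lookup, cache manager request or spy channel hidden behind the handler.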

Management effort
- The entire system is managed by only a few people:
  - 3 people (2 full time) are engaged in KLOE computing system management (including storage)
  - 1 person is engaged in the development and management of the online database and the archiving system
  - 2 people spend a few percent of their time on the maintenance of the offline database
