Data Handling in KLOE
I. Sfiligoi, INFN LNF, Frascati, Italy
CHEP 2000, 7-11 February 2000

Presentation transcript:


The KLOE experiment at the DAΦNE φ-factory. Main goal: CP violation study. Other interesting fields: kaon form factors, kaon rare decays, radiative φ decays. Decays illustrated: K_S → π+ π- with K_L → π+ π- (CP violating), and K_S → π+ π- with K_L → 3π0 → 6γ.

KLOE requirements. Data acquisition (at full DAΦNE luminosity): events per year acquired; 50 MB/s sustained throughput. Computing power: all the events need to be reconstructed. Storage requirements: one petabyte of raw and reconstructed events; hundreds of megabytes of related data (configurations, slow control data, calibration parameters, etc.).

KLOE computing environment. Based on a set of medium-sized servers, connected using commercial switched networks (Fast Ethernet and Gigabit Ethernet). Heterogeneous environment with several platforms: IBM AIX on PowerPC, Sun Solaris on SPARC, Compaq Tru64 UNIX on Alpha, HP-UX on PA-RISC.

KLOE storage pool. Different policies for different types of data: raw and reconstructed events are kept on tape libraries, with big disk pools for data caching; related data are managed by a disk-based database system; analysis output is kept on disk pools.

Disk pools. Four categories of disk pools are present: each data acquisition node in the farm has its own small disk pool; computing nodes write their output to centralized, NFS-mounted disk pools; separate disk pools are used as a cache for the events on tape; analysis output is written to its own central, AFS-mounted disk pool.

Tape library. Several automated tape libraries are supported (at the moment the 5500-slot tape library is partitioned between two tape servers). They are accessed using commercial software: IBM ADSM with the current tape library.

KLOE software. Three distinct categories: DAQ (or online), written in ANSI C; reconstruction and analysis (or offline), written in FORTRAN inside A_C; Monte Carlo, written in FORTRAN. The interface to the Data Handling System must be compatible with all of them.

KLOE Data Handling System. Composed of four elements: Database System, Archiving System, Spy System, and KLOE Integrated Dataflow (KID).

KLOE Data Handling System. A mix of commercial and custom software: the dependency on commercial software is minimized by the layers of custom software; commercial software carries out all the vital functions; custom software mostly extends and coordinates the functionality of the commercial software.

KLOE Data Handling System. Based on a set of multi-threaded, non-privileged daemons and related libraries, distributed across several nodes. Communication is by means of TCP/IP sockets on high ports. This bypasses TCP/IP filtering, is flexible and independent of programming language and operating system, and needs no configuration on the client side.
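
The daemon style described above can be illustrated with a minimal sketch. It is not the KLOE code: the one-line text protocol, the class names and the helper functions are all hypothetical, and Python's standard socketserver module stands in for the original implementation. It only shows a multi-threaded server that binds to an unprivileged (high) port and a client that needs nothing but the host and port.

```python
import socket
import socketserver
import threading


class CommandHandler(socketserver.StreamRequestHandler):
    """Serve one client: read a single text line, send a single reply line."""

    def handle(self):
        line = self.rfile.readline().decode("utf-8").strip()
        # Hypothetical protocol: acknowledge the command in upper case.
        reply = "OK " + line.upper()
        self.wfile.write((reply + "\n").encode("utf-8"))


class ThreadedDaemon(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """One thread per connection; no root privileges are required."""

    allow_reuse_address = True
    daemon_threads = True


def start_daemon(host="127.0.0.1", port=0):
    """Start the server in a background thread.

    Port 0 asks the operating system for any free unprivileged port.
    """
    server = ThreadedDaemon((host, port), CommandHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_command(host, port, command):
    """Client side: plain TCP, so any language with sockets could do the same."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall((command + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()
```

A client would call send_command(host, port, "ping") with the address returned by server.server_address. Because the exchange is plain text over TCP, nothing has to be installed or configured on the client side.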


Database System. Two distinct database systems are used. Offline database system: based on HepDB; data stored as ZEBRA banks. Online database system: based on a relational DBMS; data are structured in fields; extended for distributed environments.

Online Database System. Data are stored in a relational DBMS (IBM DB2 Universal Database at the moment). Communication between the clients (user applications) and the RDBMS goes through a database daemon (DD): app <-> DD <-> RDBMS.

Database Daemon. The database daemon is the only link between the applications and the RDBMS: if the RDBMS is changed in the future, only the database daemon will need to be changed. Different kinds of commands are managed by the daemon: general SQL commands and KLOE-specific commands.

Database Daemon. Different kinds of commands are managed by the daemon. General SQL commands are passed directly to the RDBMS, for example: select run_nr from run_logger where status = 'OK'. KLOE-specific commands are managed by the daemon itself, and the RDBMS is used to retrieve and store the data needed by the daemon, for example: log that I am starting to process the file relative to run 3.
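
The two command paths can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for IBM DB2, the "SQL" and "KLOE" request prefixes, the log_start_processing command and the processing_log table are hypothetical, and only the run_logger table and its run_nr and status columns come from the slide.

```python
import sqlite3


class DatabaseDaemonCore:
    """Dispatch requests either straight to the RDBMS or to daemon logic."""

    def __init__(self):
        # Assumption: SQLite replaces the real RDBMS for this sketch.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE run_logger (run_nr INTEGER PRIMARY KEY, status TEXT)"
        )
        # Hypothetical table used by the daemon-managed command.
        self.db.execute("CREATE TABLE processing_log (run_nr INTEGER, client TEXT)")

    def handle(self, request):
        kind, _, payload = request.partition(" ")
        if kind == "SQL":
            # General SQL command: passed directly to the RDBMS.
            return self.db.execute(payload).fetchall()
        if kind == "KLOE":
            # KLOE-specific command: managed by the daemon itself.
            return self._handle_kloe(payload)
        raise ValueError(f"unknown request kind: {kind!r}")

    def _handle_kloe(self, payload):
        name, *args = payload.split()
        if name == "log_start_processing":
            client, run_nr = args[0], int(args[1])
            # Extra check that a raw SQL insert would not perform.
            known = self.db.execute(
                "SELECT 1 FROM run_logger WHERE run_nr = ?", (run_nr,)
            ).fetchone()
            if known is None:
                raise ValueError(f"unknown run {run_nr}")
            self.db.execute(
                "INSERT INTO processing_log VALUES (?, ?)", (run_nr, client)
            )
            return [("logged", run_nr)]
        raise ValueError(f"unknown KLOE command: {name!r}")
```

Because applications only ever see the request strings, replacing the RDBMS in this sketch would mean changing DatabaseDaemonCore and nothing else.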

Database Daemon. The use of KLOE-specific commands has several advantages: additional checks and restrictions are possible; data consistency management is centralized; fast central caches can be implemented. For example, the DAQ configuration cache reduces the typical access time from 4 s to 0.1 s.
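
A central cache of this kind can be pictured as a read-through cache held inside the daemon. The sketch below is an assumption about the general technique, not the KLOE implementation: the ConfigCache class and its loader callback are hypothetical, and the loader stands for whatever slow RDBMS query builds a configuration.

```python
class ConfigCache:
    """Read-through cache: query the slow backend only on a miss."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable doing the slow lookup
        self._entries = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = value
        return value

    def invalidate(self, key):
        """Drop an entry, for example after the daemon has written new data."""
        self._entries.pop(key, None)
```

Since every write also goes through the daemon in this design, the daemon is in a position to call invalidate at the right moment, which is one way centralized consistency management and caching can fit together.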

A light version. The RDBMS is used to ensure flexibility, reliability and performance, but it is demanding in terms of computing resources and management effort, and stand-alone environments often cannot afford it. An RDBMS-independent version of the database daemon is under development.

A light version. An RDBMS-independent version of the database daemon is under development. It is limited to the KLOE-specific commands and the most frequently used SQL commands, and is based on flat files containing a small portion of the data. It is not suitable for a production environment, but is enough for home use.


KLOE Archiving System. The event data expected to be managed by KLOE amount to 1 PB, so tape libraries are needed. With tape, data storage and retrieval are non-trivial and random access to data is very inefficient, so disk-based intermediate buffers are used.

KLOE Archiving System. Two types of intermediate buffers: DAQ, offline and Monte Carlo output is structured as YBOS files and written to the respective disk output areas; event data needed by offline as input are read from the archiving system disk cache.

KLOE Archiving System. Data need to be migrated from the output areas to the tape library as soon as possible (also taking efficiency concerns into account), and from the tape library to the disk cache when an application needs them (or, even better, a bit earlier). Migration is fully automated and transparent to the applications.

KLOE Archiving System. The Archiving System is made of four components: storage managers; disk space managers (for output areas and cache areas); archival director; cache manager. Communication is by means of TCP/IP sockets, coordinated by the online database. Daemons named in the diagram: archADSM, spacekeeper, filekeeper, archiver, retrieve.

Storage Managers. One for each logical tape library. Each allows queries about the tape library content, file archival, and file retrieval. Transaction oriented (if the underlying tape library software supports it).

Storage Managers. The only link between the tape library and the rest of the system, with an interface independent of the underlying archiving software. IBM ADSM is used with the current tape library; if another product is used in the future, only a specific storage manager will need to be developed.
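
The idea of a backend-independent storage manager interface can be sketched with an abstract base class. Everything named here is hypothetical: the method names list_files, archive and retrieve simply mirror the three operations listed on the slides, and the in-memory backend is a toy used in place of a real tape product such as ADSM.

```python
from abc import ABC, abstractmethod


class StorageManager(ABC):
    """Interface the rest of the system would use; one subclass per tape product."""

    @abstractmethod
    def list_files(self):
        """Answer a query about the tape library content."""

    @abstractmethod
    def archive(self, name, data):
        """Store a file in the library."""

    @abstractmethod
    def retrieve(self, name):
        """Fetch a file back from the library."""


class InMemoryStorageManager(StorageManager):
    """Toy backend for illustration only; it keeps the bytes in a dictionary."""

    def __init__(self):
        self._store = {}

    def list_files(self):
        return sorted(self._store)

    def archive(self, name, data):
        if name in self._store:
            raise FileExistsError(name)
        self._store[name] = bytes(data)

    def retrieve(self, name):
        return self._store[name]
```

Code written against StorageManager does not need to know which subclass it is talking to, which is the property the slide describes: supporting a different product means writing one new subclass.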

Disk Space Managers. One for each disk pool. They create and delete files; unused files get deleted to make space for new ones.
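
One possible reading of "unused files get deleted to make space for new ones" is a least-recently-used eviction policy, sketched below. The slides do not say which policy KLOE uses, so the LRU choice, the DiskSpaceManager class and its acquire and release methods are all assumptions made for illustration; only bookkeeping is modeled, no real files are touched.

```python
from collections import OrderedDict


class DiskSpaceManager:
    """Track files in one pool and evict unused ones, least recently used first."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._files = OrderedDict()  # name -> size, oldest use first
        self._in_use = set()

    def acquire(self, name):
        """Mark a file as in use so that it cannot be evicted."""
        self._files.move_to_end(name)
        self._in_use.add(name)

    def release(self, name):
        self._in_use.discard(name)

    def create(self, name, size_bytes):
        """Reserve space for a new file; return the names evicted to fit it."""
        if name in self._files:
            raise ValueError(f"{name} already exists")
        needed = self.used_bytes + size_bytes - self.capacity_bytes
        victims = []
        for candidate, size in self._files.items():
            if needed <= 0:
                break
            if candidate not in self._in_use:
                victims.append(candidate)
                needed -= size
        if needed > 0:
            # Nothing is deleted unless the new file can actually fit.
            raise RuntimeError("not enough unused files to free the space")
        for victim in victims:
            self.used_bytes -= self._files.pop(victim)
        self._files[name] = size_bytes
        self.used_bytes += size_bytes
        return victims
```

The victims are chosen before anything is removed, so a request that cannot be satisfied leaves the pool unchanged.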

Archival Director. Fully automated. Works in polling mode: from time to time it looks for files ready to be archived, and starts archiving only when enough data is available. Files are ordered and grouped to minimize the expected retrieve time. Several groups of files can be archived in parallel.
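
The grouping step of one polling cycle might look like the sketch below. The slides do not state the ordering criterion or the threshold, so sorting by run number and the min_group_bytes parameter are assumptions, as are the ReadyFile record and the plan_archive_groups function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadyFile:
    name: str
    run_nr: int
    size_bytes: int


def plan_archive_groups(ready_files, min_group_bytes):
    """Decide what to archive in this polling cycle.

    Files are ordered by run number and name (assumed criterion), then cut
    into groups of at least min_group_bytes. Returns (groups, waiting), where
    waiting holds files that stay on disk until enough data has accumulated.
    """
    ordered = sorted(ready_files, key=lambda f: (f.run_nr, f.name))
    groups = []
    current = []
    current_bytes = 0
    for ready in ordered:
        current.append(ready)
        current_bytes += ready.size_bytes
        if current_bytes >= min_group_bytes:
            groups.append(current)
            current = []
            current_bytes = 0
    return groups, current
```

A director loop would call this function from time to time with the files currently marked as ready. Each returned group is independent of the others, so the groups could be handed to separate archiving workers in parallel.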

Cache Manager. User driven: when a file is needed, the application asks the cache manager where it is located, and a retrieve is performed by the manager if needed. Several requests can be issued at the same time; the manager reorders them internally to minimize the tape mounts. Communication is by means of TCP/IP sockets.
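
Reordering requests to reduce tape mounts can be illustrated by grouping the pending files by tape. The sketch assumes a catalog that maps each file name to a tape label and a position on that tape; the catalog layout and the function names are hypothetical and the real manager may use a different strategy.

```python
def plan_retrieves(requested_files, catalog):
    """Group pending requests by tape so that each tape is mounted once.

    catalog maps file name -> (tape_label, position_on_tape).
    Returns a list of (tape_label, [file names in tape order]).
    """
    by_tape = {}
    # dict.fromkeys drops duplicate requests while keeping arrival order.
    for name in dict.fromkeys(requested_files):
        tape, position = catalog[name]
        by_tape.setdefault(tape, []).append((position, name))
    return [
        (tape, [name for _, name in sorted(entries)])
        for tape, entries in by_tape.items()
    ]


def count_mounts(file_sequence, catalog):
    """Count tape changes when files are read in the given order."""
    mounts = 0
    current = None
    for name in file_sequence:
        tape = catalog[name][0]
        if tape != current:
            mounts += 1
            current = tape
    return mounts
```

Comparing count_mounts on the arrival order and on the planned order shows the saving: requests that alternate between two tapes cost one mount per file if served as they arrive, but only one mount per tape after regrouping.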

KLOE Archival System (diagram). Components shown: archiver, archADSM, spacekeeper, filekeeper, retrieve, DB, tape library, disk pool. Connection types in the legend: NFS mount, local file system, TCP/IP socket.


Spy System. The KLOE data acquisition software allows the event data to be read out before they are written to disk. The mechanism that reads those data is called Spy. It is based on shared memory buffers: DAQ processes are piped using this mechanism, and the spy system reads data from the buffers without interfering with the DAQ.


KLOE Integrated Dataflow (KID). An integration library: database accesses and retrieve operations are hidden, and it offers a single point of access to all the services. Selection is URI based. With datarec:(run_nr=5000) and (stream='ksl'), KID reads the file list from the DB, asks the cache manager for the files, and passes the events from the files to the application. With spy:/buffer, KID opens a spy channel and passes the events to the application.
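
The single point of access can be pictured as a dispatcher keyed on the URI scheme. The sketch below is a guess at the general shape only: the two example URIs come from the slide, while parse_kid_uri, KidDispatcher and the opener callbacks are hypothetical names, and the real KID library is not written in Python.

```python
def parse_kid_uri(uri):
    """Split 'scheme:selector' at the first colon."""
    scheme, separator, selector = uri.partition(":")
    if not separator or not scheme:
        raise ValueError(f"not a KID-style URI: {uri!r}")
    return scheme, selector


class KidDispatcher:
    """Route each URI to the opener registered for its scheme."""

    def __init__(self):
        self._openers = {}

    def register(self, scheme, opener):
        # opener is a callable taking the selector and returning an event source.
        self._openers[scheme] = opener

    def open(self, uri):
        scheme, selector = parse_kid_uri(uri)
        opener = self._openers.get(scheme)
        if opener is None:
            raise ValueError(f"no service registered for scheme {scheme!r}")
        return opener(selector)
```

In this picture one opener would be registered under "datarec" (query the database, ask the cache manager for the files, read the events) and another under "spy" (attach to a buffer). The application only calls open with a URI and iterates over the events it gets back.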

Management effort. The entire system is managed by only a few people: 3 people (2 full time) are engaged in KLOE computing system management (including storage); 1 person is engaged in the development and management of the online database and the archiving system; 2 people spend a few percent of their time on the maintenance of the offline database.
