D. Duellmann - IT/DB LCG - POOL Project1 The LCG Pool Project and ROOT I/O Dirk Duellmann What is Pool? Component Breakdown Status and Plans.

Slides:



Advertisements
Similar presentations
Object Persistency & Data Handling Session C - Summary Object Persistency & Data Handling Session C - Summary Dirk Duellmann.
Advertisements

CERN LCG Overview & Scaling challenges David Smith For LCG Deployment Group CERN HEPiX 2003, Vancouver.
M.Frank LHCb/CERN POOL persistency: Status  Current understanding  Today’s model  Conclusions.
CERN - IT Department CH-1211 Genève 23 Switzerland t LCG Persistency Framework CORAL, POOL, COOL – Status and Outlook A. Valassi, R. Basset,
D. Düllmann - IT/DB LCG - POOL Project1 POOL Release Plan for 2003 Dirk Düllmann LCG Application Area Meeting, 5 th March 2003.
PAWN: A Novel Ingestion Workflow Technology for Digital Preservation
Magda – Manager for grid-based data Wensheng Deng Physics Applications Software group Brookhaven National Laboratory.
POOL Project Status GridPP 10 th Collaboration Meeting Radovan Chytracek CERN IT/DB, GridPP, LCG AA.
D. Duellmann, CERN Data Management at the LHC1 Data Management at CERN’s Large Hadron Collider (LHC) Dirk Düllmann CERN IT/DB, Switzerland
WP6: Grid Authorization Service Review meeting in Berlin, March 8 th 2004 Marcin Adamski Michał Chmielewski Sergiusz Fonrobert Jarek Nabrzyski Tomasz Nowocień.
SEAL V1 Status 12 February 2003 P. Mato / CERN Shared Environment for Applications at LHC.
LCG Persistency Workshop, June 5th 2002
SPI Software Process & Infrastructure EGEE France - 11 June 2004 Yannick Patois
Data Management Kelly Clynes Caitlin Minteer. Agenda Globus Toolkit Basic Data Management Systems Overview of Data Management Data Movement Grid FTP Reliable.
Scalable Systems Software Center Resource Management and Accounting Working Group Face-to-Face Meeting June 13-14, 2002.
Conditions DB in LHCb LCG Conditions DB Workshop 8-9 December 2003 P. Mato / CERN.
LCG Application Area Internal Review Persistency Framework - Project Overview Dirk Duellmann, CERN IT and
MAGDA Roger Jones UCL 16 th December RWL Jones, Lancaster University MAGDA  Main authors: Wensheng Deng, Torre Wenaus Wensheng DengTorre WenausWensheng.
Overview of the SAS® Management Console
NOVA Networked Object-based EnVironment for Analysis P. Nevski, A. Vaniachine, T. Wenaus NOVA is a project to develop distributed object oriented physics.
CHEP 2003 March 22-28, 2003 POOL Data Storage, Cache and Conversion Mechanism Motivation Data access Generic model Experience & Conclusions D.Düllmann,
Framework of Job Managing for MDC Reconstruction and Data Production Li Teng Zhang Yao Huang Xingtao SDU
Lesson Overview 3.1 Components of the DBMS 3.1 Components of the DBMS 3.2 Components of The Database Application 3.2 Components of The Database Application.
An RTAG View of Event Collections, and Early Implementations David Malon ATLAS Database Group LHC Persistence Workshop 5 June 2002.
The Replica Location Service The Globus Project™ And The DataGrid Project Copyright (c) 2002 University of Chicago and The University of Southern California.
POOL Status and Plans Dirk Düllmann IT-DB & LCG-POOL Application Area Meeting 10 th March 2004.
INTRODUCTION TO DBS Database: a collection of data describing the activities of one or more related organizations DBMS: software designed to assist in.
SEAL Core Libraries and Services CLHEP Workshop 28 January 2003 P. Mato / CERN Shared Environment for Applications at LHC.
David Adams ATLAS DIAL/ADA JDL and catalogs David Adams BNL December 4, 2003 ATLAS software workshop Production session CERN.
The POOL Persistency Framework POOL Project Review Introduction & Overview Dirk Düllmann, IT-DB & LCG-POOL LCG Application Area Internal Review October.
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
NOVA A Networked Object-Based EnVironment for Analysis “Framework Components for Distributed Computing” Pavel Nevski, Sasha Vanyashin, Torre Wenaus US.
AliEn AliEn at OSC The ALICE distributed computing environment by Bjørn S. Nilsen The Ohio State University.
CHEP 2004, Core Software Integration of POOL into three Experiment Software Frameworks Giacomo Govi CERN IT-DB & LCG-POOL K. Karr, D. Malon, A. Vaniachine.
G.Govi CERN/IT-DB 1 September 26, 2003 POOL Integration, Testing and Release Procedure Integration  Packages structure  External dependencies  Configuration.
Some Ideas for a Revised Requirement List Dirk Duellmann.
LCG Distributed Databases Deployment – Kickoff Workshop Dec Database Lookup Service Kuba Zajączkowski Chi-Wei Wang.
Andrea Valassi (CERN IT-DB)CHEP 2004 Poster Session (Thursday, 30 September 2004) 1 HARP DATA AND SOFTWARE MIGRATION FROM TO ORACLE Authors: A.Valassi,
D. Duellmann - IT/DB LCG - POOL Project1 The LCG Dictionary and POOL Dirk Duellmann.
Overview of C/C++ DB APIs Dirk Düllmann, IT-ADC Database Workshop for LHC developers 27 January, 2005.
1 A Scalable Distributed Data Management System for ATLAS David Cameron CERN CHEP 2006 Mumbai, India.
G.Govi CERN/IT-DB 1GridPP7 June30 - July 2, 2003 Data Storage with the POOL persistency framework Motivation Strategy Storage model Storage operation Summary.
ROOT Based CMS Framework Bill Tanenbaum US-CMS/Fermilab 14/October/2002.
Status of tests in the LCG 3D database testbed Eva Dafonte Pérez LCG Database Deployment and Persistency Workshop.
STAR Scheduler Gabriele Carcassi STAR Collaboration.
POOL & ARDA / EGEE POOL Plans for 2004 ARDA / EGEE integration Dirk Düllmann, IT-DB & LCG-POOL LCG workshop, 24 March 2004.
D. Duellmann - IT/DB LCG - POOL Project1 Internal Pool Release V0.2 Dirk Duellmann.
Applications Area Status Torre Wenaus, BNL/CERN PEB Meeting October 8, 2002.
D. Duellmann, IT-DB POOL Status1 POOL Persistency Framework - Status after a first year of development Dirk Düllmann, IT-DB.
D. Düllmann - IT/DB LCG - POOL Project1 POOL Release V1.0 Dirk Düllmann LCG Application Area Meeting, 14 th May 2003.
POOL Based CMS Framework Bill Tanenbaum US-CMS/Fermilab 04/June/2003.
Magda Distributed Data Manager Torre Wenaus BNL October 2001.
CERN IT Department CH-1211 Genève 23 Switzerland t DPM status and plans David Smith CERN, IT-DM-SGT Pre-GDB, Grid Storage Services 11 November.
Persistency Project (POOL) Status and Work Plan Proposal
LCG Applications Area Milestones
(on behalf of the POOL team)
By: S.S.Tomar Computer Center CAT, Indore, India On Behalf of
More Interfaces Workplan
Open Source distributed document DB for an enterprise
3D Application Tests Application test proposals
Database Readiness Workshop Intro & Goals
POOL: Component Overview and use of the File Catalog
POOL File Catalog: Design & Status
POOL persistency framework for LHC
Dirk Düllmann CERN Openlab storage workshop 17th March 2003
First Internal Pool Release 0.1
The POOL Persistency Framework
Grid Data Integration In the CMS Experiment
POOL Status & Release Plan for V0.4
Event Storage GAUDI - Data access/storage Framework related issues
Presentation transcript:

D. Duellmann - IT/DB LCG - POOL Project1 The LCG Pool Project and ROOT I/O Dirk Duellmann What is Pool? Component Breakdown Status and Plans

D. Duellmann - IT/DBLCG - POOL Project 2 What is Pool? Pool of persistent objects for LHC LCG Framework for Object Persistency Targeted at event data but not only event data -scalability, navigation, decoupling Hybrid technology approach -RDBMS for consistent meta data handling (eg transactions) -eg file catalogs, object collections, extensible meta data -prototype based on MySQL (and InnoDB in particular) -Object persistency via extensible streamer/converter system -based on LCG Reflection component (transient object dictionary) -prototype uses Root I/O (Tree and TNamed mode) Following the LCG component approach -components with well defined responsibilities -communicating via public component interfaces

D. Duellmann - IT/DBLCG - POOL Project 3 Pool as a LCG component Persistency is just one of many projects in the LCG Applications Area Sharing a common architecture and s/w process -as described the Blueprint and Persistency RTAG documents Persistency is important… -…but not important enough to allow for uncontrolled direct dependencies eg of experiment code on its implementation Common effort in which the experiments take over a major share of the responsibility for defining the overall and detailed architecture for development of Pool components During the first internal release contribution from several experiments and IT total ~5 FTE mainly at CERN Additional contributors preparing components for upcoming releases

D. Duellmann - IT/DBLCG - POOL Project 4 POOL Components

D. Duellmann - IT/DBLCG - POOL Project 5 POOL Cache Access Key CacheMgr Pointer SmartRef Cache Manager Persistency Manager Token Storage Technology Object Type Persistent Location KeyObjectToken ……… 2

D. Duellmann - IT/DBLCG - POOL Project 6 Storage Hierarchy Data Cache StorageMgr Disk Storage database Database Container Objects Object Pointer Persistent Address Technology DB name Cont.name Item ID

D. Duellmann - IT/DBLCG - POOL Project 7 POOL File Catalog

D. Duellmann - IT/DBLCG - POOL Project 8 Use Case: Farm Production Production manager may pre-register output files with the catalog (eg a “local” MySQL or XML catalog) File ID, physical filename job ID and optionally also logical filenames A production job runs and may continue to add files and catalog entries directly. During the production the catalog can be used to cleanup files (and their registration) from unsuccessful jobs based on the job ID. Once the data quality checks have been passed the production manager decides to publishes the production catalog fragment to the grid based catalog.

D. Duellmann - IT/DBLCG - POOL Project 9 File Catalog Scaling Tests update and lookup performances look fine scalability shown up to a few hundred boxes need to understand availability and back in the production context

D. Duellmann - IT/DBLCG - POOL Project 10 Use Case: Working in Isolation The user extracts a set of interesting files and a catalog fragment describing them from a (central) grid based catalog into a local (eg XML based) catalog. After disconnecting the user executes some unchanged jobs navigating through the extracted data. Any new output files are registered into the local catalog Once ready for data publishing and connected again the user submits selected catalog fragments into the grid based catalog.

D. Duellmann - IT/DBLCG - POOL Project 11 IFileCatalogFactory createCatalog( ) IFileCatalog XMLFactory createCatalog( ) MySQLFactory createCatalog( ) MySQL C++ API MySQLCatalogXMLCatalog Xerces-C++ API Client

D. Duellmann - IT/DBLCG - POOL Project 12 Functionality in the V0.1 release RootI/O Storage Service -two operation modes objects in a tree or objects named into directory structure -provide coherent access to the two very different root optimisation models with the user same code File Catalog -two implementation based on MySQL (networked) and XML (local file) -provide catalog C++ API & command line administration tools Refs and Cache -simple cache implementation -mainly acting as test bed for Ref classes – example for integration with experiment caches Reflection & Conversion -Full reflection interface -Some re-factoring of the interface, template and namespace support are being put in place

D. Duellmann - IT/DBLCG - POOL Project 13 Release Platform POOL V0.1 has been released on October 3 rd the cvs release tag is POOL_0_1_0 Single supported platform so far RedHat 7.2 using gcc Code should also work on rh61 and (partially) on win32/VC6 Require several external packages hosted by SPI see /afs/cern.ch/sw/lcg/external MySQL (4.0.1-alpha) MySQL++ (1.7.9) Root (3.03) Xerces-C (1.6.0) Supported build systems scram and (less complete) cmt

D. Duellmann - IT/DBLCG - POOL Project 14 Pool Release Schedule September - V0.1 - Basic Navigation all core components for navigation exist and interoperate -FileCatalog, Refs & Cache, Storage Svc, Reflection (limited) some remaining simplifications -Assume TObject on read/write – simplified conversion October - V0.2 – Collections first collection implementation integrated -support implicit and explicit collections on either RDBMS or RootIO persistency for foreign classes working -persistency for non-TObject classes without need for user code instrumentation November - V0.3 – Meta Data & Query (first public release) EDG/Globus FileCatalog integrated Meta data annotation and query added -for event, event collection and file based meta data

D. Duellmann - IT/DBLCG - POOL Project 15 Summary The LCG Pool project provides a hybrid store integrating object streaming like Root I/O with RDBMS technology for consistent meta data handling Transparent cross-file (and cross-technology) object navigation via C++ smart pointers Integration with Grid technology (eg EDG/Globus replica catalog) -network and grid decoupled working modes Strong emphasis on component decoupling and well defined communication/dependencies Pool is still young and will need more time to mature to production quality First internal release V0.1 has just been produced on schedule Public release is expected for (end-of) November First larger scale production attempt summer ‘03

See you next summer in the POOL!