The POOL Persistency Framework POOL Project Review Introduction & Overview Dirk Düllmann, IT-DB & LCG-POOL LCG Application Area Internal Review 20-22 October.

Slides:



Advertisements
Similar presentations
Physicist Interfaces Project an overview Physicist Interfaces Project an overview Jakub T. Moscicki CERN June 2003.
Advertisements

Data Management Expert Panel. RLS Globus-EDG Replica Location Service u Joint Design in the form of the Giggle architecture u Reference Implementation.
Database System Concepts and Architecture
The POOL Persistency Framework POOL Summary and Plans.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
D. Düllmann - IT/DB LCG - POOL Project1 POOL Release Plan for 2003 Dirk Düllmann LCG Application Area Meeting, 5 th March 2003.
Polaris Financial Technologies Welcomes the members of Hyderabad chapter for the 2nd event on 4 th July 14 held by PACE (The Testing Practice)
POOL Project Status GridPP 10 th Collaboration Meeting Radovan Chytracek CERN IT/DB, GridPP, LCG AA.
Professional Informatics & Quality Assurance Software Lifecycle Manager „Tools that are more a help than a hindrance”
D. Duellmann, CERN Data Management at the LHC1 Data Management at CERN’s Large Hadron Collider (LHC) Dirk Düllmann CERN IT/DB, Switzerland
SEAL V1 Status 12 February 2003 P. Mato / CERN Shared Environment for Applications at LHC.
ATLAS DQ2 Deletion Service D.A. Oleynik, A.S. Petrosyan, V. Garonne, S. Campana (on behalf of the ATLAS Collaboration)
Lecture On Introduction (DBMS) By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
EGEE is a project funded by the European Union under contract IST Testing processes Leanne Guy Testing activity manager JRA1 All hands meeting,
LCG Applications Area – Overview, Planning, Resources Torre Wenaus, BNL/CERN LCG Applications Area Manager LHCC Comprehensive Review.
LCG Application Area Internal Review Persistency Framework - Project Overview Dirk Duellmann, CERN IT and
CHEP 2003 March 22-28, 2003 POOL Data Storage, Cache and Conversion Mechanism Motivation Data access Generic model Experience & Conclusions D.Düllmann,
MINER A Software The Goals Software being developed have to be portable maintainable over the expected lifetime of the experiment extensible accessible.
An RTAG View of Event Collections, and Early Implementations David Malon ATLAS Database Group LHC Persistence Workshop 5 June 2002.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Operations Automation Team James Casey EGEE’08.
POOL Status and Plans Dirk Düllmann IT-DB & LCG-POOL Application Area Meeting 10 th March 2004.
Feedback from the POOL Project User Feedback from the POOL Project Dirk Düllmann, LCG-POOL LCG Application Area Internal Review October 2003.
LCG Generator Meeting, December 11 th 2003 Introduction to the LCG Generator Monthly Meeting.
SEAL Core Libraries and Services CLHEP Workshop 28 January 2003 P. Mato / CERN Shared Environment for Applications at LHC.
SEAL Project Core Libraries and Services 18 December 2002 P. Mato / CERN Shared Environment for Applications at LHC.
Grid User Interface for ATLAS & LHCb A more recent UK mini production used input data stored on RAL’s tape server, the requirements in JDL and the IC Resource.
CASTOR evolution Presentation to HEPiX 2003, Vancouver 20/10/2003 Jean-Damien Durand, CERN-IT.
GDB Meeting - 10 June 2003 ATLAS Offline Software David R. Quarrie Lawrence Berkeley National Laboratory
From Digital Objects to Content across eInfrastructures Content and Storage Management in gCube Pasquale Pagano CNR –ISTI on behalf of Heiko Schuldt Dept.
7/6/2004 CMS weekZhen Xie 1 POOL RDBMS abstraction layer status & plan Zhen Xie Princeton University.
STAR Collaboration, July 2004 Grid Collector Wei-Ming Zhang Kent State University John Wu, Alex Sim, Junmin Gu and Arie Shoshani Lawrence Berkeley National.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 – The Ganga Evolution Andrew Maier.
CHEP 2004, Core Software Integration of POOL into three Experiment Software Frameworks Giacomo Govi CERN IT-DB & LCG-POOL K. Karr, D. Malon, A. Vaniachine.
Software Engineering Overview DTI International Technology Service-Global Watch Mission “Mission to CERN in Distributed IT Applications” June 2004.
G.Govi CERN/IT-DB 1 September 26, 2003 POOL Integration, Testing and Release Procedure Integration  Packages structure  External dependencies  Configuration.
23/2/2000Status of GAUDI 1 P. Mato / CERN Computing meeting, LHCb Week 23 February 2000.
D. Duellmann - IT/DB LCG - POOL Project1 The LCG Pool Project and ROOT I/O Dirk Duellmann What is Pool? Component Breakdown Status and Plans.
Some Ideas for a Revised Requirement List Dirk Duellmann.
Oracle for Physics Services and Support Levels Maria Girone, IT-ADC 24 January 2005.
ATLAS Database Access Library Local Area LCG3D Meeting Fermilab, Batavia, USA October 21, 2004 Alexandre Vaniachine (ANL)
ATLAS-specific functionality in Ganga - Requirements for distributed analysis - ATLAS considerations - DIAL submission from Ganga - Graphical interfaces.
D. Duellmann - IT/DB LCG - POOL Project1 The LCG Dictionary and POOL Dirk Duellmann.
Overview of C/C++ DB APIs Dirk Düllmann, IT-ADC Database Workshop for LHC developers 27 January, 2005.
The SEAL Component Model Radovan Chytracek CERN IT/DB, LCG AA On behalf of LCG/SEAL team This work received support from Particle Physics and Astronomy.
CHEP 2004, POOL Development Status & Plans POOL Development Status and Plans K. Karr, D. Malon, A. Vaniachine (Argonne National Laboratory) R. Chytracek,
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
CORAL CORAL a software system for vendor-neutral access to relational databases Ioannis Papadopoulos, Radoval Chytracek, Dirk Düllmann, Giacomo Govi, Yulia.
Lecture On Introduction (DBMS) By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
Ian Bird Overview Board; CERN, 8 th March 2013 March 6, 2013
G.Govi CERN/IT-DB 1GridPP7 June30 - July 2, 2003 Data Storage with the POOL persistency framework Motivation Strategy Storage model Storage operation Summary.
Status of tests in the LCG 3D database testbed Eva Dafonte Pérez LCG Database Deployment and Persistency Workshop.
POOL & ARDA / EGEE POOL Plans for 2004 ARDA / EGEE integration Dirk Düllmann, IT-DB & LCG-POOL LCG workshop, 24 March 2004.
D. Duellmann - IT/DB LCG - POOL Project1 Internal Pool Release V0.2 Dirk Duellmann.
D. Duellmann, IT-DB POOL Status1 POOL Persistency Framework - Status after a first year of development Dirk Düllmann, IT-DB.
D. Düllmann - IT/DB LCG - POOL Project1 POOL Release V1.0 Dirk Düllmann LCG Application Area Meeting, 14 th May 2003.
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES The Common Solutions Strategy of the Experiment Support group.
POOL Based CMS Framework Bill Tanenbaum US-CMS/Fermilab 04/June/2003.
POOL Historical Notes POOL has been the most advanced and the most used AA project. Currently, excellent teamwork with experiments on new features and.
LCG Applications Area Milestones
(on behalf of the POOL team)
POOL: Component Overview and use of the File Catalog
POOL File Catalog: Design & Status
POOL persistency framework for LHC
Dirk Düllmann CERN Openlab storage workshop 17th March 2003
First Internal Pool Release 0.1
The POOL Persistency Framework
POOL/RLS Experience Current CMS Data Challenges shows clear problems wrt to the use of RLS Partially due to the normal “learning curve” on all sides in.
Grid Data Integration In the CMS Experiment
POOL Status & Release Plan for V0.4
SEAL Project Core Libraries and Services
Presentation transcript:

The POOL Persistency Framework POOL Project Review Introduction & Overview Dirk Düllmann, IT-DB & LCG-POOL LCG Application Area Internal Review October 2003

The POOL Persistency FrameworkD.Duellmann2 What is POOL? Project Goal: Develop a common Persistency Framework for physics applications at the LHCProject Goal: Develop a common Persistency Framework for physics applications at the LHC –P ool O f persistent O bjects for L HC Part of the LHC Computing Grid (LCG)Part of the LHC Computing Grid (LCG) –One of the first Application Area Projects Common effort between the LHC experiments and the CERN IT-DB groupCommon effort between the LHC experiments and the CERN IT-DB group –for defining its scope and architecture –for the development of its components

The POOL Persistency FrameworkD.Duellmann3 POOL Objectives To allow the multi-PB of experiment data and associated meta data to be stored in a distributed and Grid enabled fashionTo allow the multi-PB of experiment data and associated meta data to be stored in a distributed and Grid enabled fashion –various types of data of different volumes (event data, physics and detector simulation, detector data and bookkeeping data) Hybrid technology approach, combiningHybrid technology approach, combining –C++ object streaming technology, such as Root I/O, for the bulk data – transactional safe Relational Database (RDBMS) services, such as MySQL, for catalogs, collections and meta data In particular, it providesIn particular, it provides –Persistency for C++ transient objects –Transparent navigation from one object across file and technology boundaries Integrated with a external File Catalog to keep track of the file physical location, allowing files to be moved or replicated

The POOL Persistency FrameworkD.Duellmann4 POOL Timeline and Statistics POOL project started April 2002POOL project started April 2002 –Ramping up from 1.6 to ~10 FTE –Persistency Workshop in June 2002 –First internal release POOL V0.1 in October 2002 In one year of active development since thenIn one year of active development since then –10 public releases POOL V1.3.1 last Friday –Some 50 internal releases Often picked up by experiments to confirm fixes/new functionality Very useful to insure releases meet experiment expectations beforehand –Handled some 150 bug reports Savannah web portal proven helpful POOL followed from the beginning a rather aggressive schedule to meet the first production needs of the experiments.POOL followed from the beginning a rather aggressive schedule to meet the first production needs of the experiments.

The POOL Persistency FrameworkD.Duellmann5 POOL Milestones First “Public” Release - V0.3 December ’02First “Public” Release - V0.3 December ’02 –Navigation between files supported, catalog components integrated –LCG Dictionary moved to SEAL – and picked up from there Basic dictionary integration for elementary types First “Functionally Complete” Release - V1.0 June ’03First “Functionally Complete” Release - V1.0 June ’03 –LCG dictionary integration for most requested language features including STL containers –Consistent meta data support for file catalog and event collections (aka tag collections) –Integration with EDG-RLS pre-production service (rlstest.cern.ch) First “Production Release” - V1.1 July `03First “Production Release” - V1.1 July `03 –Added bare C++ pointer support, transient data members, update of streaming layer data, simplified (user) transaction model –Due to the large number of requests from integration activities still rather a functionality release than the planned consolidation release. –EDG-RLS production service (one catalog server per experiment) Project stayed close to release data estimatesProject stayed close to release data estimates –Maximum variance 2 weeks –Usually release within a few days around the predicted target date

The POOL Persistency FrameworkD.Duellmann6 POOL - Known Issues No adequate end-user documentation yetNo adequate end-user documentation yet –Preparing a user guide collecting the experience gained during the framework integrations –Expanding the set of example programs and prepare a hands-on tutorial Testing is not perfect..Testing is not perfect.. –.. but about 60 functional and integration tests are executed in an automated way each release cycle –Feature requests now often come as a complete test case from the experiment. Thanks! –Number of high priority feature requests have not always left the time to turn each reported bug into a new test Performance optimisation not yet fully addressedPerformance optimisation not yet fully addressed –Performance tests now exist for all components (addressed in June release) –…but not enough time for in depth performance optimisation in all areas - focus stayed on functionality longer than anticipated. –External design and code reviews setup for use of ROOT I/O and for Object cache Schema Evolution and Stability of File FormatSchema Evolution and Stability of File Format –Current strategy relies fully on ROOT I/O facilities The use of ROOT I/O as black box makes more generic schema evolution support non- trivial –POOL does not fully control the file format, but can help to detect unwanted format changes during regression testing

The POOL Persistency FrameworkD.Duellmann7 Component Architecture POOL is a component based systemPOOL is a component based system –follows the LCG Architecture Blueprint Provides a technology neutral APIProvides a technology neutral API –Abstract component C++ interfaces –Insulates the experiment framework user code from several concrete implementation details and technologies used today POOL user code is not dependent on implementation librariesPOOL user code is not dependent on implementation libraries –No link time dependency on implementation packages (e.g. MySQL, Root, Xerces-C..) –Backend component implementations are loaded at runtime via the SEAL plug-in infrastructure Three major domains, weakly coupled, interacting via abstract interfacesThree major domains, weakly coupled, interacting via abstract interfaces

The POOL Persistency FrameworkD.Duellmann8 POOL Component Breakdown RDBMS Storage Svc ?

The POOL Persistency FrameworkD.Duellmann9 Work Package Breakdown Storage ManagerStorage Manager –Streams transient C++ objects into/from disk storage –Resolves a logical object reference into a physical object –Uses Root I/O. A proof of concept with a RDBMS storage manager prototype underway File CatalogFile Catalog –Maintains consistent lists of accessible files (physical and logical names) together with their unique identifiers (FileID), which appear in the object representation in the persistent space –Resolves a logical file reference (FileID) into a physical file CollectionsCollections –Provides the tools to manage potentially (large) ensembles of objects stored via POOL persistence services Explicit: server-side selection of object from queryable collections Implicit: defined by physical containment of the objects

The POOL Persistency Framework POOL Work Package Presentations Storage ManagerStorage Manager File CatalogFile Catalog CollectionsCollections Integration & TestingIntegration & Testing