
1 CHEP 2004, POOL Development Status & Plans
POOL Development Status and Plans
K. Karr, D. Malon, A. Vaniachine (Argonne National Laboratory)
R. Chytracek, D. Duellmann, M. Frank, M. Girone, G. Govi, J. Moscicki, I. Papadopoulos, H. Schmuecker (CERN)
Z. Xie (Princeton University)
T. Barrass (University of Bristol)
C. Cioffi (University of Oxford)
W. Tanenbaum (Fermi National Accelerator Laboratory)
CHEP 2004, Interlaken, Switzerland

2 The LCG Persistency Framework (D. Duellmann, CERN)
The LCG persistency framework project consists of two parts
  – Common project with CERN IT and strong experiment involvement
POOL
  – Hybrid object persistency integrating object streaming (ROOT I/O) with relational database technology
  – Established baseline for three LHC experiments
  – Has been successfully integrated into the software frameworks of ATLAS, CMS and LHCb
      See also G. Govi’s talk (382)
  – Being successfully deployed in three large scale data challenges
      See also M. Girone’s talk (383)
Conditions Database
  – Conditions DB was moved into the scope of the LCG project
      To consolidate different independent developments
  – Should share storage of complex objects into ROOT I/O and RDBMS backend with POOL
      See the talks of A. Valassi (447) and A. Amorim (262) about this work

3 POOL Project Evolution
POOL is entering its third year of active development
  – During the last 2 years we managed to follow the proposed work plan and met the rather aggressive schedule to move POOL into experiment production
  – This year POOL has been proven in the LCG data challenges with volumes of ~400 TB
Changing from pure development mode to support, deployment and maintenance
  – Several developers moved their effort into experiment integration or back-end services
      This is a healthy move and ensures proper coupling between software and deployment!
      Affects the available development manpower
  – Task profile changing from design and debugging to user support and re-engineering
Need to maintain stable and focused manpower from CERN and the experiments
  – This close contact has made POOL a successful project
  – Both experiments and CERN have confirmed their commitment to the project

4 Development Focus This Year
Move to ROOT4 (POOL 2.0 line)
  – To take advantage of automatic schema evolution and simplified streaming of STL containers
      Need to ensure backward compatibility for POOL 1.x files
  – Currently undergoing validation by the experiments
      Will release two branches until POOL 2 is fully certified
File Catalog deployment issues
  – DC productions showed some weaknesses of grid catalog implementations
      Several new/enhanced catalogs coming up
      Changes in the experiment computing models need to be taken into account
  – POOL tries to generalise from specific implementations and provides an open interface to accommodate upcoming components
Collections
  – Several implementations of POOL collections exist
  – Collection cataloguing has been added in response to experiment requests
      Similar to file catalogs: re-use of catalog implementation and command-line tools
  – Experiment analysis models are still being concretized
  – Expect experience from concrete analysis challenges

5 Why a Relational Abstraction Layer (RAL)?
Goal: Vendor independence for the relational components of POOL, ConditionsDB and user code
  – Continuation of the component architecture as defined in the LCG Blueprint
  – File catalog, collections and object storage run against all available RDBMS plug-ins
To reduce code maintenance effort
  – All RDBMS client components can use all supported back-ends
  – Bug fixes can be applied once centrally
To minimise risk of vendor binding
  – Allows new RDBMS flavours to be added later or used in parallel; they are picked up by all RDBMS clients
  – RDBMS market is still in flux
To address the problem of distributing data in RDBMS of different flavours
  – Common mapping of application code to tables simplifies distribution of RDBMS data in a generic, application-independent way

6 Relational Access functionality
Database Schema Access and Manipulation
  – Describing existing and creating new tables
  – Support for primary keys, foreign keys and indices
      Formed by one or more table columns
Data Manipulation Language
  – Insertion, update and deletion of table rows
  – Bulk insertions to minimise database server roundtrips
Queries
  – Nested queries involving one or more tables
  – Ordering and limiting the result set
  – Control of client cache for the result set
  – Database cursors for scalable iteration through large query results

7 Domain Decomposition
Pure relational data management
  – Provide technology-neutral RDBMS connectivity
  – Encapsulate main differences, e.g. table creation options
  – Direct clients: File catalog, Collections and Object-relational mapping
Object-relational mapping and storage
  – Bridges the differences between relational and object world (object identity resolution, object associations)
  – Provide guided object storage
  – Direct client: POOL Relational Storage Service
POOL Relational Storage Service
  – Adapter implementing the POOL StorageSvc interfaces
  – Direct client: experiment framework

8 Software design
[Diagram] Legend: abstract interface, implementation, technology-dependent plug-in; arrows labelled “uses” and “implements”.
Components shown: Experiment framework; FileCatalog, Collection, StorageSvc; RelationalCatalog, RelationalCollection, RelationalStorageSvc; ObjectRelationalAccess; RelationalAccess; Seal reflection; MySQL, Oracle and SQLite plug-ins.
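The layering in this design (clients coded against an abstract interface, with technology-dependent plug-ins behind it) can be illustrated with a minimal C++ sketch. All class and function names below are hypothetical and are not the POOL interfaces; the only vendor difference modelled is how an auto-incrementing key column is declared.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Abstract interface seen by client code (hypothetical name, not a POOL class).
class RelationalBackend {
public:
    virtual ~RelationalBackend() = default;
    virtual std::string flavour() const = 0;
    // Encapsulates one vendor difference: declaring an auto-incrementing key column.
    virtual std::string autoIncrementColumn(const std::string& name) const = 0;
};

// Technology-dependent implementations.
class SQLiteBackend : public RelationalBackend {
public:
    std::string flavour() const override { return "sqlite"; }
    std::string autoIncrementColumn(const std::string& name) const override {
        return name + " INTEGER PRIMARY KEY AUTOINCREMENT";
    }
};

class MySQLBackend : public RelationalBackend {
public:
    std::string flavour() const override { return "mysql"; }
    std::string autoIncrementColumn(const std::string& name) const override {
        return name + " INT AUTO_INCREMENT PRIMARY KEY";
    }
};

// Registry that hands out back-ends by technology name.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<RelationalBackend>()>;

    void add(const std::string& technology, Factory factory) {
        factories_[technology] = std::move(factory);
    }

    // Throws std::invalid_argument when no plug-in is registered for the technology.
    std::unique_ptr<RelationalBackend> create(const std::string& technology) const {
        const auto it = factories_.find(technology);
        if (it == factories_.end()) {
            throw std::invalid_argument("no plug-in for technology: " + technology);
        }
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Builds a registry holding the two example back-ends.
inline BackendRegistry makeDefaultRegistry() {
    BackendRegistry registry;
    registry.add("sqlite", [] { return std::unique_ptr<RelationalBackend>(new SQLiteBackend); });
    registry.add("mysql", [] { return std::unique_ptr<RelationalBackend>(new MySQLBackend); });
    return registry;
}
```

A client would call `makeDefaultRegistry().create("sqlite")` and work only with the returned `RelationalBackend`, so adding a further back-end does not touch client code.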

9 Relational Access Layer Design
Interface and implementation design driven by software requirement document
  – Co-authored by main users and POOL developers
Simple key-value pair interface (AttributeList) used for the handling and the description of the relational data
Clean standard C++ interface
  – No special SQL types exposed for data elements
  – Type converter responsible for default and user-defined type conversion between C++ and SQL data types
  – Can take advantage of vendor-specific SQL type extensions
Exposed SQL fragments are used only in SQL WHERE clauses
  – Most non-standard SQL extensions (e.g. in create table) are well encapsulated
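To make the key-value idea concrete, here is a small C++17 sketch of a row described as name/value pairs holding plain C++ types, plus a default C++-to-SQL type conversion. It is only an illustration: `RowAttributes`, `Value` and `sqlTypeName` are hypothetical names and do not reproduce the POOL AttributeList API, and the SQL type names are assumed defaults.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <variant>

// Values are plain C++ types; no SQL-specific types appear in the interface.
using Value = std::variant<int, double, std::string>;

// Hypothetical key-value description of one table row (not the POOL AttributeList).
class RowAttributes {
public:
    // Set (or overwrite) the value stored under a column name.
    void set(const std::string& name, Value value) { values_[name] = std::move(value); }

    // Typed read: throws std::out_of_range for an unknown name and
    // std::bad_variant_access when the stored type differs from T.
    template <typename T>
    const T& get(const std::string& name) const {
        return std::get<T>(values_.at(name));
    }

    std::size_t size() const { return values_.size(); }
    const std::map<std::string, Value>& values() const { return values_; }

private:
    std::map<std::string, Value> values_;
};

// Default type conversion from the C++ type of a value to an SQL type name.
// The names are assumptions; a real converter would be vendor aware.
inline std::string sqlTypeName(const Value& value) {
    struct Visitor {
        std::string operator()(int) const { return "INTEGER"; }
        std::string operator()(double) const { return "DOUBLE PRECISION"; }
        std::string operator()(const std::string&) const { return "VARCHAR(255)"; }
    };
    return std::visit(Visitor{}, value);
}
```

Keeping the conversion in one function mirrors the type-converter role on the slide: client code only ever handles `int`, `double` and `std::string`.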

10 RDBMS plug-ins in POOL
Oracle 9i/10g
  – Based on OCI
  – Supports Oracle instant client
  – Fully supports the POOL RAL interfaces
  – Available for the Linux platforms (win32 will follow)
SQLite
  – A light-weight embeddable SQL database engine
  – File-based (zero configuration, administration)
  – Available for the Linux and Win32 platforms
MySQL
  – Implementation based on the MyODBC driver
  – Prototype released with POOL 1.8

11 First clients of RAL
RelationalFileCatalog
  – Scheduled to replace the more specific MySQLFileCatalog in POOL
  – Validated the functionality and semantics of the interfaces
  – Tested against Oracle and SQLite
Several experiment projects ongoing
  – CMS online and meta data
  – ATLAS detector geometry
First response largely positive
  – Real validation will need some time

12 Object to Relational Mapping
How to map classes ↔ tables?
  – Both C++ and SQL allow the data layout to be described
  – But with very different constraints/aims: no single unique mapping
Need for fast object navigation and a unique object identity (persistent address)
  – requires unique index for addressable objects
  – part of mapping definition
POOL stores mapping with the object data
  – need to store mapping versions

13 A Mapping Example
    class A {
      int x;
      float y;
      std::vector<double> v;  // element type not preserved in the transcript; double assumed
      class B {
        int i;
        std::string s;
      } b;
    };

14 A Mapping Example
[Diagram] Table T_A with columns ID, X, Y, B_I, B_S: one row per object of class A (the sample rows hold “Hello” and “Hi” in B_S).
Table T_A_V with columns ID, POS, V: one row per element of the vector v.
A p.k./f.k. constraint links the ID columns of T_A and T_A_V.
This is only one of the possible mappings!
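The mapping of class A onto the tables T_A and T_A_V can be sketched in a few lines of C++. This is an in-memory illustration only, not POOL code: the row structs, `store` and `load` are hypothetical, the vector element type `double` and the 1-based POS numbering are assumptions, and `A` is written as a struct so that the sketch can read its members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The example class, written as a struct so the sketch can access its members.
struct A {
    int x;
    float y;
    std::vector<double> v;  // element type assumed
    struct B {
        int i;
        std::string s;
    } b;
};

// One row of T_A: scalar members and the embedded object b are stored inline.
struct RowTA {
    int id;  // primary key
    int x;
    float y;
    int b_i;
    std::string b_s;
};

// One row of T_A_V: a single element of A::v.
struct RowTAV {
    int id;   // foreign key referring to RowTA::id
    int pos;  // position inside the vector, 1-based (assumed)
    double v;
};

struct Tables {
    std::vector<RowTA> t_a;
    std::vector<RowTAV> t_a_v;
};

// Write one object: one row into T_A and one row per vector element into T_A_V.
void store(Tables& tables, int id, const A& a) {
    tables.t_a.push_back(RowTA{id, a.x, a.y, a.b.i, a.b.s});
    for (std::size_t k = 0; k < a.v.size(); ++k) {
        tables.t_a_v.push_back(RowTAV{id, static_cast<int>(k) + 1, a.v[k]});
    }
}

// Read an object back by its id; the vector is rebuilt from the POS column.
A load(const Tables& tables, int id) {
    A a{};
    for (const RowTA& row : tables.t_a) {
        if (row.id == id) {
            a.x = row.x;
            a.y = row.y;
            a.b.i = row.b_i;
            a.b.s = row.b_s;
        }
    }
    for (const RowTAV& row : tables.t_a_v) {
        if (row.id != id) continue;
        const std::size_t pos = static_cast<std::size_t>(row.pos);
        if (a.v.size() < pos) a.v.resize(pos);
        a.v[pos - 1] = row.v;
    }
    return a;
}
```

Storing the vector in a second table keyed by (ID, POS) is what makes the object id double as the join key; flattening `b` into T_A is an equally arbitrary choice, which is the point of "only one of the possible mappings".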

15 Mapping Elements
A complete mapping consists of
  – A mapping version per object
  – A hierarchical tree of mapping elements per version
Each mapping element contains
  – Element type (“Object”, “Primitive”, “Array”, “POOL reference”, “Pointer”)
  – Database table and column names
  – C++ member name and type
  – Lower level associated mapping elements
POOL stores these persistently in 3 (hidden) relational tables
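The element tree described here can be modelled with a small recursive data structure. The C++17 sketch below is a guess at a reasonable shape, not the POOL data model: the struct layout, the helper functions and the example mapping for class A (table names T_A and T_A_V, element type `double` for the vector) are all assumptions for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// The five element types listed on the slide.
enum class ElementType { Object, Primitive, Array, PoolReference, Pointer };

// One node of the mapping tree (hypothetical layout).
struct MappingElement {
    ElementType type;
    std::string tableName;
    std::string columnName;  // empty for elements that do not map to a single column
    std::string memberName;  // C++ member name, empty for the root
    std::string memberType;  // C++ type
    std::vector<MappingElement> children;  // lower level mapping elements
};

// A mapping is a version label plus the root of its element tree.
struct MappingVersion {
    std::string version;
    MappingElement root;
};

// Helper building a leaf element for a primitive member.
MappingElement primitive(std::string table, std::string column,
                         std::string member, std::string type) {
    return MappingElement{ElementType::Primitive, std::move(table), std::move(column),
                          std::move(member), std::move(type), {}};
}

// Walk the tree and collect the primitive columns of every table.
void collectColumns(const MappingElement& element,
                    std::map<std::string, std::vector<std::string>>& columnsByTable) {
    if (element.type == ElementType::Primitive) {
        columnsByTable[element.tableName].push_back(element.columnName);
    }
    for (const MappingElement& child : element.children) {
        collectColumns(child, columnsByTable);
    }
}

// Example tree for a class A with members x, y, v and an embedded object b.
MappingVersion makeExampleMapping() {
    MappingElement b{ElementType::Object, "T_A", "", "b", "A::B", {}};
    b.children.push_back(primitive("T_A", "B_I", "i", "int"));
    b.children.push_back(primitive("T_A", "B_S", "s", "std::string"));

    MappingElement v{ElementType::Array, "T_A_V", "", "v", "std::vector<double>", {}};
    v.children.push_back(primitive("T_A_V", "V", "", "double"));

    MappingElement root{ElementType::Object, "T_A", "", "", "A", {}};
    root.children.push_back(primitive("T_A", "X", "x", "int"));
    root.children.push_back(primitive("T_A", "Y", "y", "float"));
    root.children.push_back(v);
    root.children.push_back(b);
    return MappingVersion{"1.0", root};
}
```

Walking the tree with `collectColumns` yields the column list per table, which is the information a storage layer would need to build its INSERT and SELECT statements.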

16 Generating a Mapping
Two use cases need to be supported
  1) Starting from existing table schema and data
      Give access to RDBMS data with minimal changes to existing data
      POOL generates default header and mapping from the DB schema
  2) Starting from existing C++ header file
      Implement existing class with minimal changes to user C++ code
      POOL generates default DB schema and mapping from the LCG dictionary entry
In both cases the user can override a default mapping via an XML steering file
      Select the C++ classes which are mapped
      Override default mapping rules (e.g. member names and types)
      Define the mapping version
  – Mapping then gets “materialized”, e.g. stored in the database with a command line tool
  – Need to support copies and
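The second use case (default schema from a C++ class, with user overrides) can be sketched as follows. This is not the POOL generator: the naming rules (table `T_` plus class name, upper-case column names, an `ID` key column), the SQL type names and the override map standing in for the XML steering file are all assumptions chosen to match the example tables.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal stand-in for a dictionary entry: member name and C++ type.
struct Member {
    std::string name;
    std::string cppType;
};

// Assumed default C++ -> SQL type rules.
std::string defaultSqlType(const std::string& cppType) {
    static const std::map<std::string, std::string> rules = {
        {"int", "INTEGER"},
        {"float", "REAL"},
        {"double", "DOUBLE PRECISION"},
        {"std::string", "VARCHAR(255)"}};
    const auto it = rules.find(cppType);
    if (it == rules.end()) {
        throw std::invalid_argument("no default SQL type for " + cppType);
    }
    return it->second;
}

std::string toUpper(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return text;
}

// Generate a default CREATE TABLE statement. columnOverrides maps a member
// name to a user-chosen column name, playing the role of the steering file.
std::string defaultCreateTable(const std::string& className,
                               const std::vector<Member>& members,
                               const std::map<std::string, std::string>& columnOverrides = {}) {
    std::string sql = "CREATE TABLE T_" + toUpper(className) + " (ID INTEGER PRIMARY KEY";
    for (const Member& member : members) {
        const auto over = columnOverrides.find(member.name);
        const std::string column =
            over != columnOverrides.end() ? over->second : toUpper(member.name);
        sql += ", " + column + " " + defaultSqlType(member.cppType);
    }
    sql += ")";
    return sql;
}
```

For members `x` (int) and `y` (float) of class `A`, the default rules give a table `T_A` with columns `ID`, `X` and `Y`; passing `{{"y", "Y_COORD"}}` as overrides renames only the second column.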

17 POOL Summary
The LCG POOL project provides a hybrid store integrating object streaming (ROOT I/O) with RDBMS technology (Oracle/MySQL/SQLite)
  – POOL has been integrated into the LHC experiments’ software frameworks and is in use for the pre-production activities in CMS
  – Successfully deployed as baseline persistency mechanism for CMS, ATLAS and LHCb at the scale of ~400 TB
POOL continues the LCG component approach by abstracting relational database access in a vendor-neutral way
  – POOL Relational Abstraction has been released and is being picked up by several experiments
  – Minimised risk of vendor binding, simplified maintenance and data distribution are the main motivations
POOL as a project is (slowly) migrating to a support and maintenance phase
  – Need to keep remaining manpower focused in order to finish remaining developments and to provide relevant support to user community

18 Connecting to a database
Connection string format:
  – technology_protocol://hostName:portNumber/databaseOrSchemaName:sidNumber
Examples:
  – oracle://dbhost/user
  – mysql://dbhost:1105/dbname
  – sqlite_http://dbhost/directory/dbfile.db
  – sqlite_file:/absolute_dbfile_path.db
  – No authentication parameters
Connection string specifying only the data, not the access mechanism
  – The RelationalService deduces which plugin to use from the connection URI
  – Current convention: loading of the module named “POOL/RelationalPlugins/technology”
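How a plug-in name could be deduced from such a connection string is sketched below in C++. The parsing rules are inferred from the format and examples on the slide only; `ConnectionInfo`, `parseConnection` and `pluginModuleName` are hypothetical names, not the RelationalService API, and the host/port/database part is deliberately left unparsed.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Parts of a connection string that matter for plug-in selection.
struct ConnectionInfo {
    std::string technology;  // e.g. "oracle", "mysql", "sqlite"
    std::string protocol;    // e.g. "file", "http"; empty when not given
    std::string location;    // everything after the first ':'
};

// Split "technology_protocol:location" into its parts.
// Throws std::invalid_argument when the technology prefix is missing.
ConnectionInfo parseConnection(const std::string& uri) {
    const std::size_t colon = uri.find(':');
    if (colon == std::string::npos || colon == 0) {
        throw std::invalid_argument("missing technology prefix: " + uri);
    }
    const std::string scheme = uri.substr(0, colon);
    const std::size_t underscore = scheme.find('_');

    ConnectionInfo info;
    info.technology = scheme.substr(0, underscore);  // whole scheme when no '_' is present
    if (underscore != std::string::npos) {
        info.protocol = scheme.substr(underscore + 1);
    }
    info.location = uri.substr(colon + 1);
    return info;
}

// Module name following the convention quoted on the slide.
std::string pluginModuleName(const ConnectionInfo& info) {
    return "POOL/RelationalPlugins/" + info.technology;
}
```

With these rules `sqlite_file:/absolute_dbfile_path.db` and `sqlite_http://dbhost/directory/dbfile.db` both select the same `sqlite` module and differ only in the protocol field, which matches the idea that the string names the data rather than the access mechanism.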

