Irina Sourikova Brookhaven National Laboratory for the PHENIX collaboration Migrating PHENIX databases from object to relational model.


1 Irina Sourikova Brookhaven National Laboratory for the PHENIX collaboration Migrating PHENIX databases from object to relational model

2 Introduction (Sept 27, CHEP’04, Interlaken, CH; Irina Sourikova)
- PHENIX is one of two large experiments at RHIC and produces hundreds of TB of data per year
- In four years of running, PHENIX accumulated tens of GB of calibration (conditions) data, which used to be archived in an Objectivity database
- For a variety of reasons (licensing and compiler issues among them) the decision was made to change the underlying storage technology and use an open-source RDB instead of a proprietary OODB
- Main constraints: avoid any downtime for production, and provide backward compatibility by migrating old Objectivity-based calibration data to the RDB of choice

3 Where do we store data if not in Objy?
- One option is to store metadata in an RDB and data in flat files (STAR)
- Another option is to store calibrations in BLOBs (Binary Large Objects). PHOBOS keeps its calibration data in BLOBs in Oracle
- Data consistency, data replication and performance considerations led us to the decision to store calibration data, not only metadata, in the database
- PostgreSQL was chosen as the RDBMS

4 What’s involved in the database transition
- Design a relational schema that supports our data and the queries on it
- Migrate large amounts of old Objectivity-based data to a new DB. That requires I/O from objects in memory to tables in the RDB
- Preserve the existing API by providing a new implementation

5 Calibration data
- Our calibrations differ widely in shape and size but have the same structure: they are arrays (“banks”) of individual channels
- Example: a lookup table for slewing corrections for a PMT in the ZDC can be a channel
- A bank is a unit of information which is stored and retrieved based on validity ranges. For example, all PMTs in the ZDC form a bank
[Figure: PHENIX ZDC]
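The bank and channel structure described on this slide can be sketched in a few lines of Python. This is only an illustration: the class and field names, the use of integer timestamps for the validity range, and the number of channels in the example are all assumptions, not the PHENIX class layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Channel:
    """One calibration channel, e.g. a slewing-correction lookup table for one PMT."""
    values: List[float]


@dataclass
class Bank:
    """An array of channels that is stored and retrieved as a single unit."""
    name: str
    valid_from: int    # start of validity range (assumed: integer timestamp)
    valid_until: int   # end of validity range, exclusive (assumed convention)
    channels: List[Channel] = field(default_factory=list)

    def is_valid_at(self, t: int) -> bool:
        # A bank applies to any time inside its half-open validity range.
        return self.valid_from <= t < self.valid_until


# Hypothetical example: a bank holding one lookup table per PMT.
zdc_bank = Bank(
    name="zdc.slewing",
    valid_from=1000,
    valid_until=2000,
    channels=[Channel([0.0, 0.5 * i]) for i in range(4)],
)
```

Retrieval by validity range then reduces to finding the bank for which `is_valid_at(t)` holds for the requested time.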

6 Relational schema
- Most direct approach: map channel data members to columns in a table
- Makes data transparent, suitable for Web display
Why this doesn’t work for us:
- RDBs limit the number of columns; the large size of some channels makes this approach problematic
- One possibility would be to use the PostgreSQL array type to store a bank, but the array implementation is not optimized for large arrays. Moreover, other RDBs do not support an array type
- I/O is still a problem

7 BLOBs
- Another approach is to store calibration banks in BLOBs (Binary Large Objects) and calibration metadata as simple types
- Solves the I/O problem: ROOT I/O can be used to serialize banks into BLOBs, and RDBC (ROOT DataBase Connectivity) to send BLOBs to the database
- Makes rewriting of the calibration DB interface easy
- Allows fast index-based calibration retrieval
- The only thing we lose is “transparency”, i.e. Web display
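To make the idea of turning a bank into a BLOB concrete, here is a minimal sketch that packs a bank (a list of channels, each a list of floats) into a byte string and back. It uses Python's `struct` module purely as a stand-in for ROOT I/O; the byte layout and the function names are assumptions made for illustration and have nothing to do with the ROOT streamer format.

```python
import struct
from typing import List


def bank_to_blob(channels: List[List[float]]) -> bytes:
    """Serialize a bank into one opaque byte string (assumed layout)."""
    parts = [struct.pack("<I", len(channels))]          # number of channels
    for ch in channels:
        parts.append(struct.pack("<I", len(ch)))        # values in this channel
        parts.append(struct.pack(f"<{len(ch)}d", *ch))  # the values as doubles
    return b"".join(parts)


def blob_to_bank(blob: bytes) -> List[List[float]]:
    """Inverse of bank_to_blob: rebuild the list of channels from bytes."""
    offset = 0
    (n_channels,) = struct.unpack_from("<I", blob, offset)
    offset += 4
    channels = []
    for _ in range(n_channels):
        (n,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        channels.append(list(struct.unpack_from(f"<{n}d", blob, offset)))
        offset += 8 * n
    return channels


bank = [[0.0, 1.5], [2.25], []]
blob = bank_to_blob(bank)
restored = blob_to_bank(blob)
```

The database only ever sees `blob`, which is why channels of very different shape and size fit the same table, and also why the contents are no longer directly visible to SQL or a Web display.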

8 Final relational schema
- Decided to proceed with BLOBs
- Each Objy db mapped into a relational db table
- All tables have the same schema:
  - Each object in an Objy container mapped to a row in a table
  - Each calibration header data member mapped to a column in a table
[Table layout: Metadata | BLOBptr]
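A table of this shape, with metadata columns plus one BLOB column and an index for validity-range lookup, might look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere; the table name, column names, and the rule of preferring the most recently inserted bank are assumptions, since the slides do not list the actual header data members.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL server

# One table per former Objy database; every table shares this layout.
conn.execute(
    """
    CREATE TABLE calib_zdc (
        bank_id     INTEGER NOT NULL,  -- hypothetical header field
        valid_from  INTEGER NOT NULL,  -- start of validity range
        valid_until INTEGER NOT NULL,  -- end of validity range (exclusive)
        inserted_at INTEGER NOT NULL,  -- hypothetical header field
        payload     BLOB    NOT NULL   -- serialized bank
    )
    """
)
# The index lets lookups by bank and validity range avoid a full scan.
conn.execute(
    "CREATE INDEX calib_zdc_validity ON calib_zdc (bank_id, valid_from, valid_until)"
)


def store(bank_id: int, valid_from: int, valid_until: int,
          inserted_at: int, payload: bytes) -> None:
    with conn:  # commit this single row as its own transaction
        conn.execute(
            "INSERT INTO calib_zdc VALUES (?, ?, ?, ?, ?)",
            (bank_id, valid_from, valid_until, inserted_at, payload),
        )


def fetch(bank_id: int, t: int) -> Optional[bytes]:
    """Return the newest payload valid at time t, or None if there is none."""
    row = conn.execute(
        "SELECT payload FROM calib_zdc "
        "WHERE bank_id = ? AND valid_from <= ? AND ? < valid_until "
        "ORDER BY inserted_at DESC LIMIT 1",
        (bank_id, t, t),
    ).fetchone()
    return row[0] if row else None


store(1, 1000, 2000, 10, b"old")
store(1, 1500, 2500, 20, b"new")
```

With these two rows, a request at time 1200 matches only the first bank, while a request at time 1700 matches both and returns the one inserted later.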

9 Software layers
- A couple of months were spent on installing and testing new software
- After fixing a few bugs, adopted the following:
  - RDBC: talks to RDBs from ROOT
  - libodbc++: C++ library for accessing RDBs, runs on top of ODBC, simplifies the code
  - unixODBC: free ODBC interface
  - psqlodbc: official PostgreSQL ODBC driver
[Layer diagram: User application, PhenixDB API, RDBC, libodbc++, unixODBC, psqlodbc, DB]

10 New calibration API implementation
- The top calibration abstract base class inherits from TObject in order to use the RDBC method SetObject(int, TObject *)
- A ClassDef macro was added to calibration headers to equip calibration classes with streamers
- The new calibration DB API was made ODBC-compliant to ease possible future technology changes
- Data migration code was generated by a Perl script that takes the Objy db name and calibration class name as arguments
- One new method was introduced to benefit from the finer commit granularity available in RDBs

11 Old data transfer
- A clone of the Objy federation was made and its schema evolved to reflect a change in the inheritance schema (all calibration classes got TObject as a parent)
- A CVS branch was created for code development against the new replica Objy federation
- About 13 GB of old data were transferred from Objy to Postgres, which took a few days

12 Validating new framework
- Validating the new framework took a lot of time due to very active code development and Objectivity updates
- Non-atomic CVS operations (tagging the code) added to the complexity of comparing reconstruction output in the old and new frameworks
- After byte-by-byte comparisons, Postgres-based calibrations are now used in production
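A byte-by-byte comparison of two outputs can be sketched as follows. This is a generic illustration, not the validation tool used by PHENIX: the function name and chunk size are assumptions, and the example compares in-memory streams where a real check would open the two output files in binary mode.

```python
import io
from typing import BinaryIO, Optional


def first_difference(a: BinaryIO, b: BinaryIO,
                     chunk_size: int = 65536) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical.

    If one stream is a strict prefix of the other, the returned offset is
    the length of the shorter stream.
    """
    offset = 0
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                if x != y:
                    return offset + i
            # Common part matches, so one stream simply ended earlier.
            return offset + min(len(chunk_a), len(chunk_b))
        if not chunk_a:  # both streams exhausted with no difference found
            return None
        offset += len(chunk_a)


same = first_difference(io.BytesIO(b"abcd"), io.BytesIO(b"abcd"))
diff = first_difference(io.BytesIO(b"abcd"), io.BytesIO(b"abXd"), chunk_size=2)
```

Reporting the offset rather than a plain yes/no makes it easier to trace a mismatch back to the quantity that changed.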

13 Database replication
- Thanks to PostgreSQL source code availability and ease of administration, it was not very hard to install local database servers at 6 off-site institutions and make them slave databases
- This was possible without synchronizing compiler versions or paying license fees
- PHENIX can run reconstruction and simulations at more sites than before

14 Summary
- Objectivity/DB has not been used in PHENIX production since July 2004
- The transition from Objectivity to Postgres was relatively transparent to the Collaboration and took about 1 year of 1 FTE
- Newly adopted software saved code development time, but now we must pay a maintenance price
- Web display with BLOBs requires more work, but is possible
- Many thanks to Laurent Aphecetche, Saskia Mioduszewski, Chris Pinkenburg and Martin Purschke for their help

