Irina Sourikova Brookhaven National Laboratory for the PHENIX collaboration Migrating PHENIX databases from object to relational model.

Presentation transcript:

Sept 27 CHEP’04 Interlaken, CH Irina Sourikova
Introduction
• PHENIX is one of two large experiments at RHIC and produces hundreds of TB of data per year
• In four years of running, PHENIX accumulated tens of GB of calibration (condition) data, which used to be archived in an Objectivity database
• For a variety of reasons (licensing and compiler issues among them), the decision was made to change the underlying storage technology and use an open source RDB instead of a proprietary OODB
• Main constraints: avoid any downtime for production, and provide backward compatibility by migrating old Objectivity-based calibration data to the RDB of choice

Where do we store data if not in Objy?
• One option is to store metadata in an RDB and data in flat files (STAR)
• Another option is to store calibrations in BLOBs (Binary Large Objects). PHOBOS keeps its calibration data in BLOBs in Oracle
• Data consistency, data replication and performance considerations led us to store the calibration data itself, not only the metadata, in the database
• PostgreSQL was chosen as the RDBMS

What’s involved in the database transition
• Design a relational schema that supports our data and queries on it
• Migrate large amounts of old Objectivity-based data to a new DB. That requires I/O from objects in memory to tables in RDB
• Preserve the existing API by providing a new implementation

Calibration data
• Our calibrations differ widely in shape and size but have the same structure - they are arrays (“banks”) of individual channels
• Example: a lookup table for slewing corrections for a PMT in ZDC can be a channel
• A bank is a unit of information which is stored and retrieved based on validity ranges. For example, all PMTs in ZDC form a bank
[Figure: PHENIX ZDC]
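The bank and channel structure on this slide can be sketched as follows. This is a minimal Python illustration, not PHENIX code: the names `Channel`, `Bank`, `find_bank`, the integer timestamps, and the half-open validity interval are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Channel:
    """One calibration channel, e.g. a lookup table for one PMT (hypothetical layout)."""
    values: List[float]


@dataclass
class Bank:
    """An array of channels that is stored and retrieved as one unit."""
    name: str
    valid_from: int    # start of validity (assumed integer timestamp)
    valid_until: int   # end of validity, exclusive (assumed convention)
    channels: List[Channel] = field(default_factory=list)

    def is_valid_at(self, t: int) -> bool:
        return self.valid_from <= t < self.valid_until


def find_bank(banks: List[Bank], name: str, t: int) -> Bank:
    """Return the first bank with the given name whose validity range covers t."""
    for bank in banks:
        if bank.name == name and bank.is_valid_at(t):
            return bank
    raise LookupError(f"no {name} bank valid at {t}")


# Two banks for the same detector with adjacent validity ranges.
zdc_early = Bank("zdc", 0, 100, [Channel([0.1, 0.2]), Channel([0.3, 0.4])])
zdc_late = Bank("zdc", 100, 200, [Channel([0.5, 0.6]), Channel([0.7, 0.8])])
selected = find_bank([zdc_early, zdc_late], "zdc", 150)
```

The point of the sketch is that the bank, not the channel, is the unit of lookup: a query names a bank and a time, and gets back every channel in it.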

Relational schema
• Most direct approach - map channel data members to columns in a table
• Makes data transparent, suitable for Web display
Why this doesn’t work for us:
• RDBs have a limit on the number of columns; the large size of some channels makes this approach problematic
• One possibility could be to use the PostgreSQL array type to store a bank, but the array implementation is not optimized for big array sizes. Moreover, other RDBs do not support an array type
• I/O is still a problem

BLOBs
• Another approach is to store calibration banks in BLOBs (Binary Large Objects) and calibration metadata as simple types
• Solves the I/O problem - ROOT I/O can be used to serialize banks into BLOBs, and RDBC (ROOT DataBase Connectivity) to send BLOBs to the database
• Makes rewriting of the calibration DB interface easy
• Allows fast index-based calibration retrieval
• The only thing we lose is “transparency”, Web display
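To illustrate what serializing a bank into a BLOB means, here is a minimal Python sketch. It does not use ROOT I/O or RDBC: the standard `struct` module stands in for the ROOT streamers, and the byte layout (little-endian counts followed by doubles) and the function names are assumptions made for the example.

```python
import struct
from typing import List


def serialize_bank(channels: List[List[float]]) -> bytes:
    """Pack a bank into one byte string.

    Assumed layout: channel count, then for each channel its length
    followed by that many little-endian doubles.
    """
    parts = [struct.pack("<I", len(channels))]
    for values in channels:
        parts.append(struct.pack("<I", len(values)))
        parts.append(struct.pack(f"<{len(values)}d", *values))
    return b"".join(parts)


def deserialize_bank(blob: bytes) -> List[List[float]]:
    """Inverse of serialize_bank."""
    offset = 0
    (n_channels,) = struct.unpack_from("<I", blob, offset)
    offset += 4
    channels = []
    for _ in range(n_channels):
        (n_values,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        values = list(struct.unpack_from(f"<{n_values}d", blob, offset))
        offset += 8 * n_values
        channels.append(values)
    return channels


# Channels of different sizes fit in the same BLOB format.
bank = [[0.5, 1.25, 2.0], [3.0]]
blob = serialize_bank(bank)
restored = deserialize_bank(blob)
```

Because channels of any shape collapse into one opaque byte string, the database no longer needs a column per data member, which is also why the content stops being readable from SQL.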

Final relational schema
• Decided to proceed with BLOBs
• Each Objy db mapped into a relational db table
• Each object in an Objy container mapped to a row in a table
• Each calibration header data member mapped to a column in a table
• All tables have the same schema:
[Diagram labels: Metadata, BLOBptr]
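A table of this shape, with header fields as ordinary columns and the bank as a BLOB, might look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs without a server; the table name `zdc_calib`, the column names, and the "latest insertion wins" rule are assumptions for illustration, not the PHENIX schema.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")

# One table per former Objy database (table and column names are hypothetical).
# Header data members become plain columns; the serialized bank goes in a BLOB.
conn.execute(
    """
    CREATE TABLE zdc_calib (
        bank_id     INTEGER NOT NULL,
        valid_from  INTEGER NOT NULL,
        valid_until INTEGER NOT NULL,
        inserted_at INTEGER NOT NULL,
        payload     BLOB    NOT NULL
    )
    """
)
# An index on the metadata columns lets lookups avoid scanning the BLOBs.
conn.execute(
    "CREATE INDEX zdc_calib_validity ON zdc_calib (bank_id, valid_from, valid_until)"
)

rows = [
    (1, 0, 100, 10, b"first"),
    (1, 100, 200, 20, b"second"),
]
conn.executemany("INSERT INTO zdc_calib VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()


def fetch_payload(bank_id: int, t: int) -> Optional[bytes]:
    """Return the most recently inserted payload valid at time t, or None."""
    row = conn.execute(
        """
        SELECT payload FROM zdc_calib
        WHERE bank_id = ? AND valid_from <= ? AND ? < valid_until
        ORDER BY inserted_at DESC
        LIMIT 1
        """,
        (bank_id, t, t),
    ).fetchone()
    return None if row is None else row[0]
```

Calling `fetch_payload(1, 150)` selects the row whose validity range covers 150 using only the metadata columns; the BLOB is read once the row has been chosen.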

Software layers
• A couple of months were spent on installing and testing new software
• After fixing a few bugs, we adopted the following:
• RDBC - talks to RDBs from ROOT
• libodbc++ - C++ library for accessing RDBs, runs on top of ODBC, simplifies the code
• unixODBC - free ODBC interface
• psqlodbc - official PostgreSQL ODBC driver
[Diagram labels: User application, PhenixDB API, RDBC, libodbc++, unixODBC, psqlodbc, DB]

New calibration API implementation
• The top calibration abstract base class inherits from TObject in order to use the RDBC method SetObject(int, TObject *)
• A ClassDef macro was added to calibration headers to equip calibration classes with streamers
• The new calibration DB API was made ODBC-compliant to ease possible future technology changes
• Data migration code was generated by a Perl script that takes the Objy db name and the calibration class name as arguments
• One new method was introduced to benefit from the finer commit granularity available in RDBs
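The idea of keeping one interface while swapping the storage behind it, and of exposing commits explicitly, can be sketched as below. This is a hedged Python illustration only: `CalibrationStore`, `RelationalStore`, and the `commit`/`rollback` methods are hypothetical names, the slide does not say what the new method is called, and `sqlite3` stands in for PostgreSQL accessed through ODBC.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class CalibrationStore(ABC):
    """Hypothetical storage-neutral interface that user code programs against."""

    @abstractmethod
    def store(self, bank_id: int, valid_from: int, valid_until: int,
              payload: bytes) -> None: ...

    @abstractmethod
    def fetch(self, bank_id: int, t: int) -> Optional[bytes]: ...


class RelationalStore(CalibrationStore):
    """One possible backend; callers decide when a batch of writes becomes durable."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE calib (bank_id INTEGER, valid_from INTEGER,"
            " valid_until INTEGER, payload BLOB)"
        )
        self._conn.commit()

    def store(self, bank_id: int, valid_from: int, valid_until: int,
              payload: bytes) -> None:
        self._conn.execute(
            "INSERT INTO calib VALUES (?, ?, ?, ?)",
            (bank_id, valid_from, valid_until, payload),
        )

    def fetch(self, bank_id: int, t: int) -> Optional[bytes]:
        row = self._conn.execute(
            "SELECT payload FROM calib"
            " WHERE bank_id = ? AND valid_from <= ? AND ? < valid_until"
            " ORDER BY rowid DESC LIMIT 1",
            (bank_id, t, t),
        ).fetchone()
        return None if row is None else row[0]

    def commit(self) -> None:
        """Make all writes since the last commit permanent."""
        self._conn.commit()

    def rollback(self) -> None:
        """Discard all writes since the last commit."""
        self._conn.rollback()


db = RelationalStore()
db.store(1, 0, 100, b"good")
db.commit()                      # first bank is now durable
db.store(1, 100, 200, b"unfinished")
db.rollback()                    # second bank is discarded, first one is unaffected
```

Code written against `CalibrationStore` does not change when the backend does, and the explicit commit lets a caller group several bank writes into one unit or discard a partial batch.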

Old data transfer
• A clone of the Objy federation was made and its schema evolved to reflect a change in the inheritance schema (all calibration classes got TObject as a parent)
• A CVS branch was created for code development against the new replica Objy federation
• About 13 GB of old data were transferred from Objy to Postgres, which took a few days

Validating new framework
• Validating the new framework took a lot of time due to very active code development and Objectivity updates
• Non-atomic CVS operations (tagging the code) added to the complexity of comparing reconstruction output in old and new frameworks
• After byte-by-byte comparisons, Postgres-based calibrations are now used in production
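A byte-by-byte comparison of two sets of outputs can be sketched as follows. This is an illustrative Python helper, not the validation code used by PHENIX: the function name `compare_stores`, the `(name, validity start)` key, and the in-memory dictionaries are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (bank name, validity start): hypothetical identifier


def compare_stores(old: Dict[Key, bytes], new: Dict[Key, bytes]) -> List[str]:
    """Report every key whose bytes differ or that exists on one side only."""
    problems = []
    for key in sorted(old.keys() | new.keys()):
        if key not in new:
            problems.append(f"{key}: missing in new store")
        elif key not in old:
            problems.append(f"{key}: missing in old store")
        elif old[key] != new[key]:
            a, b = old[key], new[key]
            # Position of the first differing byte; if one payload is a prefix
            # of the other, report the length of the shorter one.
            first_diff = next(
                (i for i, (x, y) in enumerate(zip(a, b)) if x != y),
                min(len(a), len(b)),
            )
            problems.append(f"{key}: differs at byte {first_diff}")
    return problems


old_store = {("zdc", 0): b"abcd", ("zdc", 100): b"wxyz", ("bbc", 0): b"1"}
new_store = {("zdc", 0): b"abcd", ("zdc", 100): b"wxYz"}
report = compare_stores(old_store, new_store)
```

An empty report is the acceptance criterion; anything else points at the exact entry and byte to inspect.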

Database replication
• Thanks to PostgreSQL source code availability and ease of administration, it was not very hard to install local database servers at 6 off-site institutions and make them slave databases
• This was possible without synchronizing compiler versions or paying license fees
• PHENIX can run reconstruction and simulations at more sites than before

Summary
• Objectivity/DB has not been used in PHENIX production since July 2004
• The transition from Objectivity to Postgres was relatively transparent to the Collaboration and took about 1 year of 1 FTE
• Newly adopted software saved code development time, but now we must pay a maintenance price
• Web display with BLOBs requires more work, but is possible
• Many thanks to Laurent Aphecetche, Saskia Mioduszewski, Chris Pinkenburg and Martin Purschke for their help