Heterogeneous Database Replication Gianni Pucciani LCG Database Deployment and Persistency Workshop CERN 17-19 October 2005 A.Domenici

Slides:



Advertisements
Similar presentations
Introduction to Heterogeneous Data Replication Spring COMMON 1999 Richard Sinn IBM Santa Teresa Lab.
Advertisements

Data Management Expert Panel. RLS Globus-EDG Replica Location Service u Joint Design in the form of the Giggle architecture u Reference Implementation.
19/06/2002WP4 Workshop - CERN WP4 - Monitoring Progress report
High Availability Group 08: Võ Đức Vĩnh Nguyễn Quang Vũ
Using DSVM to Implement a Distributed File System Ramon Lawrence Dept. of Computer Science
1 Databases in ALICE L.Betev LCG Database Deployment and Persistency Workshop Geneva, October 17, 2005.
Approaches to EJB Replication. Overview J2EE architecture –EJB, components, services Replication –Clustering, container, application Conclusions –Advantages.
EU-GRID Work Program Massimo Sgaravatto – INFN Padova Cristina Vistoli – INFN Cnaf as INFN members of the EU-GRID technical team.
On Replication July 2006 Yin Chen. What is? Why need? Types? Investigation of existing technologies –IBM SQL replication –Sybase replication –Oracle replication.
Distributed components
Distributed Heterogeneous Data Warehouse For Grid Analysis
Technical Architectures
GRID DATA MANAGEMENT PILOT (GDMP) Asad Samar (Caltech) ACAT 2000, Fermilab October , 2000.
1 ITC242 – Introduction to Data Communications Week 12 Topic 18 Chapter 19 Network Management.
DISTRIBUTED DATABASE. Centralized & Distributed Database  Single site database – centralized database –A database is located at a single site or distributed.
Chapter 9 : Distributed Database.
Overview Distributed vs. decentralized Why distributed databases
Chapter 11 – Database-Oriented Middleware & EAI Database access is the key element to EAI, especially data-level EAI. Database oriented middleware is not.
Distributed Databases
Overview of Database Access in.Net Josh Bowen CIS 764-FS2008.
1 Web Database Processing. Web Database Applications Static Report Publishing a report is prepared from a database application and exported to HTML DB.
● Problem statement ● Proposed solution ● Proposed product ● Product Features ● Web Service ● Delegation ● Revocation ● Report Generation ● XACML 3.0.
Experiences Deploying Xrootd at RAL Chris Brew (RAL)
Multiple Cases Access Utilities1 Access & ODBC Managing and Using ODBC Connections P.O. Box 6142 Laguna Niguel, CA
Design and Implementation of a Module to Synchronize Databases Amit Hingher Reviewers: Prof. Dr. rer. nat. habil. Andreas Heuer Prof. Dr.-Ing. Hartmut.
ATLAS DQ2 Deletion Service D.A. Oleynik, A.S. Petrosyan, V. Garonne, S. Campana (on behalf of the ATLAS Collaboration)
Don Quijote Data Management for the ATLAS Automatic Production System Miguel Branco – CERN ATC
04/18/2005Yan Huang - CSCI5330 Database Implementation – Distributed Database Systems Distributed Database Systems.
RLS Tier-1 Deployment James Casey, PPARC-LCG Fellow, CERN 10 th GridPP Meeting, CERN, 3 rd June 2004.
RELATIONAL FAULT TOLERANT INTERFACE TO HETEROGENEOUS DISTRIBUTED DATABASES Prof. Osama Abulnaja Afraa Khalifah
Backdrop Particle Paintings created by artist Tom Kemp September Grid Information and Monitoring System using XML-RPC and Instant.
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 12 Distributed Database Management Systems.
CSS/417 Introduction to Database Management Systems Workshop 4.
INFSO-RI Enabling Grids for E-sciencE Distributed Metadata with the AMGA Metadata Catalog Nuno Santos, Birger Koblitz 20 June 2006.
FailSafe SGI’s High Availability Solution Mayank Vasa MTS, Linux FailSafe Gatekeeper
INFNGrid Constanza Project: Status Report A.Domenici, F.Donno, L.Iannone, G.Pucciani, H.Stockinger CNAF, 6 December 2004 WP3-WP5 FIRB meeting.
Adaptable Consistency Control for Distributed File Systems Simon Cuce Monash University Dept. of Computer Science and Software.
Hands-On Microsoft Windows Server Implementing Microsoft Internet Information Services Microsoft Internet Information Services (IIS) –Software included.
3 Copyright © 2009, Oracle. All rights reserved. Accessing Non-Oracle Sources.
The Replica Location Service The Globus Project™ And The DataGrid Project Copyright (c) 2002 University of Chicago and The University of Southern California.
Transaction-based Grid Data Replication Using OGSA-DAI Presented by Yin Chen February 2007.
INTRODUCTION TO DBS Database: a collection of data describing the activities of one or more related organizations DBMS: software designed to assist in.
Replica Consistency in a Data Grid1 IX International Workshop on Advanced Computing and Analysis Techniques in Physics Research December 1-5, 2003 High.
Experiment Management System CSE 423 Aaron Kloc Jordan Harstad Robert Sorensen Robert Trevino Nicolas Tjioe Status Report Presentation Industry Mentor:
EGEE User Forum Data Management session Development of gLite Web Service Based Security Components for the ATLAS Metadata Interface Thomas Doherty GridPP.
DATABASE CONNECTIVITY TO MYSQL. Introduction =>A real life application needs to manipulate data stored in a Database. =>A database is a collection of.
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
A Data Stream Publish/Subscribe Architecture with Self-adapting Queries Alasdair J G Gray and Werner Nutt School of Mathematical and Computer Sciences,
Oracle to MySQL synchronization Gianni Pucciani CERN, University of Pisa.
ViaSQL Technical Overview. Viaserv, Inc. 2 ViaSQL Support for S/390 n Originally a VSE product n OS/390 version released in 1999 n Identical features.
HyperKVS Group Meeting Oracle Streams Dr. Volker Kuhr.
Pavel Nevski DDM Workshop BNL, September 27, 2006 JOB DEFINITION as a part of Production.
CNAF Database Service Barbara Martelli CNAF-INFN Elisabetta Vilucchi CNAF-INFN Simone Dalla Fina INFN-Padua.
EJB Replication Graham, Iman, Santosh, Mark Newcastle University.
1 A Scalable Distributed Data Management System for ATLAS David Cameron CERN CHEP 2006 Mumbai, India.
Status of tests in the LCG 3D database testbed Eva Dafonte Pérez LCG Database Deployment and Persistency Workshop.
Grid Activities in CMS Asad Samar (Caltech) PPDG meeting, Argonne July 13-14, 2000.
Database Issues Peter Chochula 7 th DCS Workshop, June 16, 2003.
Distributed File System. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
Replicazione e QoS nella gestione di database grid-oriented Barbara Martelli INFN - CNAF.
An Overview of Data Warehousing and OLAP Technology
VO Box discussion ATLAS NIKHEF January, 2006 Miguel Branco -
INFSO-RI Enabling Grids for E-sciencE Padova site report Massimo Sgaravatto On behalf of the JRA1 IT-CZ Padova group.
Jean-Philippe Baud, IT-GD, CERN November 2007
Accessing the Database Server: ODBC, OLE DB, and ADO
Gianni Pucciani GD group meeting CERN, 25 November 2008
LCG 3D Distributed Deployment of Databases
Self Healing and Dynamic Construction Framework:
Grid Data Integration In the CMS Experiment
AIMS Equipment & Automation monitoring solution
Presentation transcript:

Heterogeneous Database Replication Gianni Pucciani LCG Database Deployment and Persistency Workshop CERN October 2005 A.Domenici F.Donno L.Iannone G.Pucciani H.Stockinger

Introduction Oracle Heterogeneous Connectivity. Streams based heterogeneous replication. The Constanza project. RCS architecture. RCS for Oracle to MySQL replication. Conclusions. 2 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

Tests on Oracle Heterogeneous Connectivity and Streams based replication Laura Iannone's thesis available at: Oracle solutions to the Heterogeneous Connectivity problem: Transparent Gateways: only provides gateways for specific non-Oracle platforms like Sybase, MS SQL, Informix etc. MySQL is NOT supported. Generic Connectivity: more limited functionality than Transparent Gateways, connection only to local Oracle Database server, no distributed transactions. Both provide the ability to transparently access data in non- Oracle databases from an Oracle environment. 3 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

Oracle Heterogeneous Connectivity The Heterogeneous Connectivity process: ● Heterogeneous service: integrated in the Oracle server. ● Agent: provides connectivity to non-Oracle systems. Transparent Gateways and Generic Connectivity are two types of agents. 4 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

Oracle Generic Connectivity test Oracle machine located at CNAF (SL, Oracle ) MySQL machine at INFN Pisa (RH9, MySQL 4.1.9). - Setting up Oracle Heterogeneous Services. - Setting up ODBC Driver. - Setting up Generic Connectivity Agent. Eventually, we were able to successfully update the remote MySQL database from the Oracle SQL*Plus console. 5 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

Oracle to MySQL data sharing with Streams 1/2 In some Oracle documents we found that Streams can apply changes to a non-Oracle system via Transparent Gateways or Generic Connectivity. - We set up an Oracle Streams environment with the apply process linked to the remote MySQL database. - We created a simple table on both the Oracle and MySQL database. - With the Streams processes activated we tried to make some DML changes to the Oracle table. 6 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

Oracle to MySQL data sharing with Streams 2/2 Results: altough we can access a remote MySQL database from an Oracle server, the Streams tool does not work with MySQL. So, we need to look for different solutions to implement Oracle to MySQL synchronisation. The capture process correctly captured changes but the apply process aborted. Oracle support has been contacted and they told us that Streams needs to have an Oracle Transparent Gateway for the Oracle to non-Oracle data sharing. 7 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

The CONStanza project Developed at INFN, funded by the Italian FIRB project Site: Main goals: developing a Replica Consistency Service (RCS) that maintains consistency of writable replicated datasets (files and databases) in a Grid environment. Current focus on: heterogeneous database replication of VOMS DB from Oracle to MySQL simple but reliable protocol for update propagation (single master). 8 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: general features ● Designed as flexible as possible to support multiple update propagation protocols and update mechanisms. ● Prototype implemented in C++ with gSOAP as communication framework. ● Other used/to be used tools are: Autotools, doxygen, CppUnit with mockpp, CGSI. ● Using a configuration file we can set most RCS properties. 9 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: particular features ● Dynamic replica subscription. ● Update requests can be both blocking and non- blocking. ● Quorum constraint on update requests. ● Fault tolerant communication. ● For DB replicas: limited SQL dialect/data types translation. 10 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS Architecture GRCS, entry point for user requests (bottleneck and single point of failure issues will be addressed if needed). LRCSs, distributed servers “close” to replicas. 11 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: architecture for DBs Two specialised components have been added to extract/apply updates to databases: Log Watcher DB Updater 12 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: current status ● The current architecture works with homogeneous MySQL databases. ● It monitors the master DB log file, extracts last updates and propagates them to the RCS registered secondary replicas. Updates are then applied to each secondary replica. ● If a replica is temporarily unreachable, a tentative update can be repeated later on. 13 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: current scenario VOMS1 LogWatcher Extract Log LogVOMS001 GRCS LRCS Notify GRCS VOMS2 VOMS3 Update Replica LogVOMS000 LogVOMS001 DBUpdater GridFTP Apply Update VOMS1 VOMS2 VOMS3 RCS Update Master 14 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: fault tolerant links ● GRCS  LRCSs: when an LRCS is not reachable, the update operation involved is stored in a queue to be executed later on, periodically or upon an LRCS request. ● Log Watcher  GRCS and DB Updater  DB links will use a similar system. 15 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: first results Preliminary performance tests done using a master DB located in Pisa and secondary DBs at CERN (50ms RTT) and SLAC (165ms RTT). 16 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

RCS: Oracle to MySQL replication ● The LogWatcher component can be specialised to work with Oracle as it works with MySQL using the Oracle LogMiner utility and the OCCI interface (work in progress). ● The difficult problem of SQL dialect translation will be addressed for special cases (starting with VOMS use case). A general SQL statement translation module is almost impossible to build. Solutions can be: – Reducing the use of non-standard SQL statetements. – Agreeing on a common application-level interface for log storage/exctraction/application. 17 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )

Conclusions ● Oracle Streams does not work with MySQL. ● Can a Transparent Gateway driver be done for MySQL as well? ● CONStanza RCS. ● Using Log Miner RCS should be able to synchronise an Oracle DB with MySQL DBs. ● The problem of non-standard SQL dialects must be addressed. 18 Heterogeneous Database Replication – CERN LCG 3D Workshop ( )