Emulsion Database Design Status Report Cristiano Bozza European Emulsion Group LNGS, May 2003 Updated DB Schema Distributed DB Implementation DB Client.

Presentation transcript:

Emulsion Database Design Status Report
Cristiano Bozza
European Emulsion Group
LNGS, May 2003

Updated DB Schema
Distributed DB Implementation
DB Client Technologies
DB Client Libraries
Conclusions

Extends Opera Internal Note #38: “Database Architecture for the European Emulsion Scanning System”

Updated DB Schema

Overall structure unaltered, only fine tuning of data types and table columns.
To assess performance, tuning must be done in real-life conditions.
Not only benchmarks, but real use of the DB for test exposure data.

Updated DB Schema
DB size estimate
Assumptions for minimum size:
Few data from CS (< 100 tracks per sheet), so they are negligible
Intercalibration through 4 track maps per sheet, 100 tracks / map
All sheets in a brick are intercalibrated (conservative)
1 candidate is followed on 28 sheets on average (conservative)
1 “interesting point” per event (vtx, decays, e.m. showers, ...) (depends on Physics as well as instrumental needs)
events (OPERA half data set + tests, technical runs, ...)
3-pass scanning (vtx location, selection, precision measurement)
100 cosmics / background / fake microtracks in each zone

Updated DB Schema
DB size estimate
Assumptions for maximum size:
Few data from CS (< 100 tracks per sheet), so they are negligible
Intercalibration through 4 track maps per sheet, 100 tracks / map
All sheets in a brick are intercalibrated (conservative)
10 candidates are followed on 28 sheets on average (conservative)
3 “interesting points” per event (vtx, decays, e.m. showers, ...) (depends on Physics as well as instrumental needs)
events (OPERA full data set + tests, technical runs, ...)
3-pass scanning (vtx location, selection, precision measurement)
100 cosmics / background / fake microtracks in each zone

Updated DB Schema
DB size estimate
Results (depends on Physics as well as instrumental needs):
Assuming 100 bytes / track + aligned reconstructions, we have 0.11 to 2.3 TB.
Including safety factors for possible underestimations, the DB size should stay within DBSize < 2.5 TB.
To estimate the network bandwidth needed, we assume that the full dataset is read 10 times during 5 years of data taking: 0.055 to 1.2 Mbit/s ON AVERAGE.
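
The bandwidth figure above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming decimal terabytes (1 TB = 10^12 bytes) and 365-day years; neither convention is stated on the slide, and the function name is only illustrative.

```python
# Check of the average-bandwidth estimate quoted on the slide.
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day years


def average_bandwidth_mbit_s(db_size_tb, full_reads=10, years=5):
    """Average bandwidth (Mbit/s) needed to read the whole DB
    `full_reads` times over `years` years of data taking."""
    total_bits = db_size_tb * 1e12 * 8 * full_reads  # assumption: 1 TB = 10^12 bytes
    return total_bits / (years * SECONDS_PER_YEAR) / 1e6


low = average_bandwidth_mbit_s(0.11)   # minimum DB size estimate
high = average_bandwidth_mbit_s(2.3)   # maximum DB size estimate
print(f"{low:.3f} to {high:.2f} Mbit/s")
```

Under these assumptions the result is about 0.056 to 1.17 Mbit/s, consistent with the 0.055 to 1.2 Mbit/s quoted above.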

Distributed DB Implementation
[Diagram: Core DB Group, Scanning DB, Workstation]

Distributed DB Implementation

Core DB Group:
Centralized location
Full OPERA emulsion data set
Multimaster replication (each machine has one copy)

Scanning DB:
One machine per scanning site (+ optional backup)
Copy of the subset of locally produced scanning data
Materialized views (each machine stores only locally produced data)
The full dataset is still accessible in a transparent way
Minimum network traffic
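
The split between a full core data set and site-local subsets can be pictured with a toy model. The sketch below is only an illustration: it uses Python's built-in sqlite3 in place of Oracle, it does not use Oracle replication or materialized views, and the table layout, the site names other than Salerno, and all values are made up.

```python
import sqlite3

SCHEMA = "CREATE TABLE tracks (id INTEGER PRIMARY KEY, site TEXT, grains INTEGER)"

# Toy "core" DB holding the full data set (all values are made up).
core = sqlite3.connect(":memory:")
core.execute(SCHEMA)
core.executemany(
    "INSERT INTO tracks (site, grains) VALUES (?, ?)",
    [("Salerno", 25), ("SiteB", 31), ("Salerno", 28), ("SiteC", 22)],
)


def build_site_snapshot(core_db, site):
    """Copy only the rows produced at `site` into a separate local DB."""
    local = sqlite3.connect(":memory:")
    local.execute(SCHEMA)
    rows = core_db.execute(
        "SELECT id, site, grains FROM tracks WHERE site = ?", (site,)
    ).fetchall()
    local.executemany("INSERT INTO tracks VALUES (?, ?, ?)", rows)
    return local


def fetch_track(track_id, local_db, core_db):
    """Look in the local snapshot first; fall back to the core DB if absent."""
    query = "SELECT id, site, grains FROM tracks WHERE id = ?"
    row = local_db.execute(query, (track_id,)).fetchone()
    if row is not None:
        return row, "local"
    return core_db.execute(query, (track_id,)).fetchone(), "core"


salerno = build_site_snapshot(core, "Salerno")
local_count = salerno.execute("SELECT COUNT(*) FROM tracks").fetchone()[0]
print(local_count)                    # rows stored at the scanning site
print(fetch_track(1, salerno, core))  # served from the local snapshot
print(fetch_track(2, salerno, core))  # not local, so served by the core DB
```

The caller of `fetch_track` never needs to know where a row lives, which is the sense in which the full dataset stays accessible transparently while only remote rows generate network traffic.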

Distributed DB Implementation
Pilot DB farm

Core DB Group simulation:
Two Dell servers with high speed (1 Gbit/s) Ethernet connection

Scanning DB simulation:
One Dell server machine with a materialized view of Salerno data
Normal 100 Mbit/s LAN connection

DB Client Technologies
Several client connection technologies are being explored.
Goal: highest possible data availability for Oracle data.
Windows: ODBC, OLE DB, Oracle ODP.NET, ADO, ADO.NET
Linux: Oracle OCI C++ libraries, Perl, Tcl/Tk, Python, GNOME-DB, J2EE
Mono (.NET) Oracle provider

DB Client Technologies
Oracle and Windows:
Working with DBs is common practice in the Windows community.
Oracle provides both the DB server and several client tools / libraries.
Oracle 9i Database Server consists of 3 installation disks (2.4 GB).
We have run several performance tests to choose the best access method. Since our code will run under .NET, the ADO.NET layer was common to all tests.
ODBC: general DB library.
OLE DB: 50% faster than ODBC.
Oracle Data Provider (ODP.NET): slightly slower (~10%?) than OLE DB.
OCI C++: low-level API, much harder to use than the other methods.
OK!

DB Client Technologies
Oracle and Linux:
Oracle is fully committed to supporting the Linux operating system. Indeed, Oracle was the first commercial database available on Linux.
All key Oracle products are available on Linux, including Oracle 9i Database, Application Server, Collaboration Suite, Developer Suite and E-Business Suite.
Oracle 9i Database consists of 3 cpio archive files (~1.4 GB).
We installed it on a Red Hat 7.3 Linux distribution.

DB Client Technologies

Oracle and Linux: OCI C++
Oracle Call Interface (OCI) is the Oracle software that allows access to the database from an external application. In principle, you can use this C-based API to build a database application from scratch.

Oracle and Linux: Perl
Perl is probably the most famous open source language. It is an interpreted scripting language, easy to learn and extremely quick to develop in. It is widely used in Internet and database applications.

DB Client Technologies
Oracle and Linux: Perl
[Diagram: Perl 5 script, Perl DBI, DBD::Oracle, Oracle OCI, Oracle DB]
Perl database applications are based on the DBI module, which has an object-oriented architecture. This module requires a database-specific driver (DBD::Oracle) to connect to the database.

DB Client Technologies
Oracle and Linux: Perl
A sample Perl script to access data from Oracle:

    use strict;
    use warnings;
    use DBI;

    # Connect to the DB (data source, user, password)
    my $dbh = DBI->connect("dbi:Oracle:operadb", "operausr", "operapwd")
        or die $DBI::errstr;

    # SQL query
    my $sql = qq{SELECT Grains, SlopeX, SlopeY FROM MIPBaseTracks};

    # Prepare and execute the SQL query
    my $sth = $dbh->prepare($sql);
    $sth->execute();

    # Loop over the rows returned by the query
    while (my ($grains, $slopex, $slopey) = $sth->fetchrow_array) {
        # Here you can do whatever you want with the data
    }

    $dbh->disconnect;

DB Client Technologies
Oracle and Linux: Tcl/Tk
Tcl is an excellent scripting language created in 1987 by John Ousterhout. In 1988 he started to develop a graphical toolkit called Tk. Since then, Tcl/Tk has been one of the most popular languages in the Open Source community.
Oratcl is the module that allows the connection to an Oracle database.
The connection to the database can be tested interactively using the Tcl shell.

DB Client Technologies
Oracle and Linux: Tcl/Tk
A sample Tcl/Tk session to access data from Oracle:

    tclsh
    % package require Oratcl
    % set handle [oralogon …]
    % set cursor [oraopen $handle]
    % orasql $cursor {select Grains,SlopeX,SlopeY from MIPBaseTracks}
    % orafetch $cursor
    ……………
    % oraclose $cursor
    % oralogoff $handle
    % exit

DB Client Technologies
Oracle and Linux: Perl/Tk
Tk is wrapped inside Perl, so you can combine Perl's speed with Tk's graphics capabilities.
Oraexplain is the Perl/Tk module used to connect a program to an Oracle database.
Oraexplain scripts are more complex because they involve graphical elements like windows, buttons and frames. You can use them to develop login windows or more complex graphical applications.

DB Client Technologies
Oracle and Linux: Python
Python is an open source, object-oriented scripting language created by Guido van Rossum. It is used for any kind of application: GUI, XML, e-mail, ...
DCOracle is the module used to connect a Python script to an Oracle database.
With Python you can define classes and re-use them in other scripts.
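
The slides show no Python script, so the following is a minimal sketch of the same query used in the Perl and Tcl samples, wrapped in a reusable class. It assumes Python's built-in sqlite3 module as a stand-in for an Oracle driver, because the DCOracle API is not shown in the slides; the `BaseTrack` class, the helper function and the table contents are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class BaseTrack:
    """Hypothetical reusable class holding one row of MIPBaseTracks."""
    grains: int
    slope_x: float
    slope_y: float


def load_base_tracks(conn):
    """Run the query from the Perl/Tcl samples and wrap each row in a BaseTrack."""
    cursor = conn.cursor()
    cursor.execute("SELECT Grains, SlopeX, SlopeY FROM MIPBaseTracks")
    return [BaseTrack(*row) for row in cursor]


# sqlite3 stands in for the Oracle connection; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MIPBaseTracks (Grains INTEGER, SlopeX REAL, SlopeY REAL)")
conn.executemany(
    "INSERT INTO MIPBaseTracks VALUES (?, ?, ?)",
    [(25, 0.01, -0.02), (30, 0.12, 0.05)],
)

tracks = load_base_tracks(conn)
for track in tracks:
    print(track.grains, track.slope_x, track.slope_y)
conn.close()
```

With a real Oracle driver only the connection line would be expected to change, since the cursor-and-loop pattern is the same one the Perl DBI sample follows.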

DB Client Technologies

Oracle and Linux: GNOME-DB
The GNOME-DB project aims to provide a free unified data access architecture to the GNOME project. GNOME-DB is useful for any application that accesses persistent data (not only databases), since it now contains a pretty good data management API.
We are currently testing the GNOME-DB libraries.

Oracle and Windows/Linux: Mono
The ADO.NET Data Provider for Oracle databases works on Windows and Linux with Oracle 8i and higher versions.
We are currently testing the Mono ADO.NET libraries.

Tests in progress: Web applications with Apache, JSP, PHP, Java, JDBC

DB Client Libraries
We are developing a full set of .NET / Mono classes specific to Opera DB:

OperaDb
OperaDb::Connection
OperaDb::Transaction
OperaDb::Scanning
OperaDb::Scanning::Batch
OperaDb::Scanning::LinkedZone
OperaDb::Scanning::MIPBaseTrack
OperaDb::Scanning::MIPIndexedEmulsionTrack
OperaDb::TotalScan
OperaDb::TotalScan::Layer
OperaDb::TotalScan::Segment
OperaDb::TotalScan::Track
OperaDb::TotalScan::Volume
OperaDb::ComputingInfrastructure
OperaDb::ComputingInfrastructure::Machine
OperaDb::ComputingInfrastructure::ProgramSettings
OperaDb::ComputingInfrastructure::User
OperaDb::ComputingInfrastructure::UserPermissions
OperaDb::ComputingInfrastructure::Site

(On the original slide, namespaces are shown in black and classes in dark red.)
Those who do not want to use SQL can program the DB in C++, C#, VB, FORTRAN, and so on, using these classes.
More classes to come...

DB Client Libraries
Code samples

Sample #1: How to store a scanning zone into Opera DB

    C++: idZone = LinkedZone::Save(pMyZone, idBatch, rawDataPath, startTime, endTime, dbConn, dbTrans);
    C#:  idZone = LinkedZone.Save(MyZone, idBatch, rawDataPath, startTime, endTime, dbConn, dbTrans);

Sample #2: How to retrieve a volume reconstruction from Opera DB

    C++: Volume *pVol = new Volume(dbConn, dbTrans, idVolume, true);
    C#:  Volume Vol = new Volume(dbConn, dbTrans, idVolume, true);

Sample #3: How to convert a scanning zone from Opera DB to a ROOT file

    C++: SySal::Root::Opera::LinkedZone::Save(new LinkedZone(dbConn, dbTrans, idZone), filePath);
    C#:  SySal.Root.Opera.LinkedZone.Save(new LinkedZone(dbConn, dbTrans, idZone), filePath);
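
To illustrate how a call like `LinkedZone.Save` can hide the SQL from the caller, here is a minimal sketch in Python. It is not the OperaDb implementation: the `LinkedZone` class below is a heavily simplified, hypothetical stand-in, the `zones` table and its columns are invented for the example, the file path is a placeholder, and sqlite3 replaces the Oracle connection and transaction objects.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class LinkedZone:
    """Hypothetical, simplified stand-in for a scanning zone record."""
    id_batch: int
    raw_data_path: str

    @staticmethod
    def save(zone, conn):
        """Store the zone and return its new ID; the caller writes no SQL."""
        cur = conn.execute(
            "INSERT INTO zones (id_batch, raw_data_path) VALUES (?, ?)",
            (zone.id_batch, zone.raw_data_path),
        )
        return cur.lastrowid

    @staticmethod
    def load(id_zone, conn):
        """Rebuild a zone object from its ID."""
        row = conn.execute(
            "SELECT id_batch, raw_data_path FROM zones WHERE id = ?", (id_zone,)
        ).fetchone()
        return LinkedZone(*row)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zones (id INTEGER PRIMARY KEY, id_batch INTEGER, raw_data_path TEXT)"
)

# The `with` block plays the role of the transaction object:
# it commits on success and rolls back if an exception is raised.
with conn:
    id_zone = LinkedZone.save(LinkedZone(7, "/data/zone_0001.raw"), conn)

restored = LinkedZone.load(id_zone, conn)
print(id_zone, restored)
```

Returning the generated ID from `save`, as the slide's samples do with `idZone`, lets later calls refer to the stored zone without the caller ever composing a query.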

Conclusions
The DB architecture proposed some months ago is working.
Oracle 9iDS looks like a very good choice (easy to implement, maintain, and develop; widely supported).
Even people who are not familiar with SQL can easily work with the proposed structure of the emulsion DB using the interface libraries.
All OSs of interest are supported by Oracle and by our interface libraries.
Conversion to ROOT data is trivial (1 line of code).
...everything fine up to now!
Interface libraries for OperaDB are almost (95%) complete.