Scaling up from local DB to distributed DB
Cristiano Bozza, European Emulsion Group
Nagoya, Jan 2004
Presented by Giuseppe Grella


Report on DB development activities

The development of the European Scanning DB continues steadily, involving several groups and resources. DB server: Oracle 9iDS.

Salerno: 2 dual-CPU servers, 3rd coming; OS: Windows 2003 Server; Oracle 9iDS 32-bit version
Lyon: 1 server; OS: Solaris; Oracle 9iDS 64-bit version
Bari (NEW!): 1 PC, 1 dual-CPU server coming; OS: Windows XP, to become Windows 2003 Server; Oracle 9iDS 32-bit version

Report on DB development activities

- All the groups share the same DB schema.
- The .NET access library is part of the SySal development (it already works and is being upgraded and debugged).
- Java and C++ libraries for OPERA needs are under development (at some point in space-time, people working on the official SW should meet people working on this DB to check possible interplay).
- Stable design: just 1 field added in 8 months across more than 30 tables.
- Largest development effort now underway: replication tests and performance estimation.

Replication tests

[Diagram: current replication topology connecting Lyon, Salerno #1, Salerno #2 and Bari]

Replication policy: MultiMaster Replication (will switch to Materialized Views when more sites come up).

Replication tests

Replication has been set up and verified in 2-site subconfigurations (only pairs of DB servers were connected at any one time). Everything OK!

Next test: check with a fully interconnected network, all 4 sites together.

[Diagram: Lyon, Salerno #1, Salerno #2, Bari]
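To make the step from pairwise tests to a fully interconnected network concrete, the short sketch below lists every pair among the four sites named on the slide. Treating each pair as one replication link is an assumption about the full-mesh configuration; the slides do not say which pairs were actually exercised in the 2-site tests.

```python
from itertools import combinations

# Site names come from the slide. Treating every unordered pair as one
# replication link is an assumption about the fully interconnected setup.
SITES = ["Lyon", "Salerno #1", "Salerno #2", "Bari"]


def pairwise_links(sites):
    """Return every unordered pair of sites, i.e. the links of a full mesh."""
    return list(combinations(sites, 2))


links = pairwise_links(SITES)
for site_a, site_b in links:
    print(f"{site_a} <-> {site_b}")
print(len(links), "links in total")
```

With n sites a full mesh has n(n-1)/2 links, so the number of links to set up and verify grows quickly as sites are added.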

Performance estimation: foreword

Most difficult situation: large bunches of INSERTs
- Optimize indices
- Optimize data formatting
- Optimize network communication
- Optimize access library
- Optimize SQL
OperaDb library v.2 is 8 times faster than v.1 (in certain operations)... but the code is becoming increasingly complicated.

Data retrieval tests (SELECT) are easy to perform
- Almost no parameters to optimize

Performance estimation INSERT bunch (server and client on the same machine)

Performance estimation

PRELIMINARY tests to estimate Oracle performance for OPERA

An interesting test: load 57 aligned scanning zones from the 57 plates of the spring-packed brick (Frascati, Oct 2003).

Test details:
Data sample: base tracks, microtracks
Query: parameterized bunch of INSERTs wrapped in a single transaction
DB server: Salerno #1 server (Win2003)
  2 Pentium-IV Xeon CPUs at 2.4 GHz
  1 GB RAM
  3×65 GB SCSI HDD (RAID-5 by HW RAID controller)
  1 Gbps Ethernet card, switches to 100 Mbps on external LAN
DB client: Salerno file server (Win2000)
  2 Pentium-III CPUs at 933 MHz
  512 MB RAM
  3×72 GB SCSI HDD (RAID-5 by Windows SW)
  100 Mbps Ethernet card
Network traffic: almost zero (tested on Sundays)
Benchmarking tool: Windows Performance Counter
DB access SW: OperaDb v.2 (ODP.NET)
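The query pattern named in the test details, a parameterized bunch of INSERTs wrapped in a single transaction, can be sketched as follows. This is not the OperaDb or ODP.NET code: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table name, column names and sample values are hypothetical.

```python
import sqlite3

# Hypothetical table and columns, used only for illustration;
# they are not taken from the actual OPERA DB schema.
CREATE_SQL = (
    "CREATE TABLE base_tracks "
    "(zone_id INTEGER, pos_x REAL, pos_y REAL, slope_x REAL, slope_y REAL)"
)
INSERT_SQL = (
    "INSERT INTO base_tracks (zone_id, pos_x, pos_y, slope_x, slope_y) "
    "VALUES (?, ?, ?, ?, ?)"
)


def insert_bunch(conn, rows):
    """Insert all rows with one parameterized statement in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(INSERT_SQL, rows)
    return len(rows)


# sqlite3 (in memory) stands in for the Oracle server used in the real test.
conn = sqlite3.connect(":memory:")
conn.execute(CREATE_SQL)

# Made-up sample: 100 tracks for each of 57 zones.
rows = [
    (zone, 10.0 * i, 20.0 * i, 0.01, -0.02)
    for zone in range(1, 58)
    for i in range(100)
]
inserted = insert_bunch(conn, rows)
stored = conn.execute("SELECT COUNT(*) FROM base_tracks").fetchone()[0]
print(inserted, stored)
```

The statement text is fixed and only the bound values change from row to row, and the whole bunch is committed once at the end rather than once per row.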

Performance estimation: snapshots of the start

[Snapshots: client machine and DB server machine, each with the start of the test marked "Start!!!"]

Performance estimation: during the query

[Snapshot: DB server]
- The network load is low
- “Bursts” of INSERTs

Performance estimation: during the query

[Snapshot: DB server machine]
- When caches are full, disks are a bottleneck

Performance estimation

Results of the test:
Available bandwidth: 100 Mbps
Data sample: base tracks, microtracks
Data size on disk (SySal files): 41 MB in TLG files
Data size on disk (Oracle data file): 122 MB
Total duration: 0:25:50 (0:25:16 excluding client program code)
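The quoted figures allow a rough throughput estimate, sketched below. Two assumptions are made here and are not in the slides: 1 MB is taken as 10^6 bytes, and the Oracle data file size is used as a proxy for the volume of data actually written during the test.

```python
# Figures quoted on the slide.
tlg_size_mb = 41           # SySal TLG files on disk
oracle_size_mb = 122       # Oracle data file on disk
duration_s = 25 * 60 + 50  # total duration 0:25:50
bandwidth_mbps = 100       # available bandwidth

# How much the data grows when moved from TLG files into Oracle.
expansion = oracle_size_mb / tlg_size_mb

# Average insertion rate, using the Oracle file size as a proxy for the
# written volume (assumption) and 1 MB = 10^6 bytes (assumption).
rate_mb_per_s = oracle_size_mb / duration_s
rate_mbps = rate_mb_per_s * 8

# Share of the 100 Mbps link this rate would occupy.
link_fraction = rate_mbps / bandwidth_mbps

print(f"expansion factor: {expansion:.2f}")
print(f"average rate: {rate_mb_per_s:.3f} MB/s = {rate_mbps:.2f} Mbps")
print(f"fraction of available bandwidth: {link_fraction:.2%}")
```

Under these assumptions the data grows by roughly a factor of 3 in Oracle, and the average rate stays below 1 Mbps, well under 1% of the 100 Mbps link, which is consistent with the low network load seen during the query.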

Performance estimation

Replication performance: the insertion of these data naturally forced a replication onto the Bari DB server.

Results of the replication:
Available bandwidth: unknown, < 2 Mbps
Data sample: base tracks, microtracks
Total duration: the sample was checked after 1 hour and was found already completely replicated on the Bari DB server

Further tests are needed to improve the precision. However, the replication speed is expected to be no lower than the SQL INSERT speed.
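The single check after 1 hour only bounds the replication speed. The sketch below works out those bounds, assuming (this is not stated in the slides) that the replicated volume is comparable to the 122 MB Oracle data file from the INSERT test, with 1 MB taken as 10^6 bytes.

```python
# Figures from the slides; payload_mb is a proxy (assumption), since the
# volume actually sent to Bari is not reported.
payload_mb = 122                # Oracle data file size from the INSERT test
check_after_s = 3600            # sample found fully replicated after 1 hour
max_bandwidth_mbps = 2          # slide quotes "< 2 Mbps"
insert_duration_s = 25 * 60 + 50  # INSERT test duration 0:25:50

payload_mbit = payload_mb * 8   # assuming 1 MB = 10^6 bytes

# Lower bound on the replication rate implied by the 1-hour check.
min_rate_mbps = payload_mbit / check_after_s

# Shortest possible transfer time if the full 2 Mbps were usable.
fastest_possible_s = payload_mbit / max_bandwidth_mbps

# Average rate of the INSERT test, for comparison.
insert_rate_mbps = payload_mbit / insert_duration_s

print(f"replication rate is at least {min_rate_mbps:.2f} Mbps")
print(f"transfer cannot take less than {fastest_possible_s:.0f} s")
print(f"INSERT test averaged {insert_rate_mbps:.2f} Mbps")
```

Under these assumptions the 1-hour check guarantees only about 0.27 Mbps, below the roughly 0.63 Mbps of the INSERT test, so it cannot by itself confirm that replication keeps pace with insertion. A check made within about 26 minutes of the insertion would be needed, which fits the remark that further tests are needed.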

Performance estimation

Final considerations after preliminary performance tests:
- Processor occupancy and disk activity are relatively low
- The network performance could be improved
- Find the best access method (DB access performance has been found to depend dramatically on the library used)
- Improve TCP packet usage by the access library
- INSERT server (only if really needed)

Oracle is again confirmed to fit OPERA needs.

Conclusions

- No big difficulty met so far in developing the European Emulsion Scanning DB for OPERA
- Oracle proves to be a good multi-platform DB server
- Easy development: people can concentrate on creating tools instead of solving problems
- Performance is already fair, but there are hints that it can be improved with a little more work