Slide 1: Super Scaling PROOF to Very Large Clusters
Maarten Ballintijn, Kris Gulbrandsen, Gunther Roland / MIT
Rene Brun, Fons Rademakers / CERN
Philippe Canal / FNAL
CHEP 2004, September 2004

Slide 2: Outline
PROOF Overview, Benchmark Package, Benchmark results, Other developments, Future plans

Slide 3: Outline
PROOF Overview, Benchmark Package, Benchmark results, Other developments, Future plans

Slide 4: PROOF – Parallel ROOT Facility
- Interactive analysis of very large sets of ROOT data files on a cluster of computers
- Exploits the inherent parallelism in event data
- Main design goals: transparency, scalability, adaptability (a minimal session sketch follows below)
- On the Grid: extends from a local cluster to a wide-area virtual cluster, or a cluster of clusters
- A collaboration between the ROOT group at CERN and the MIT Heavy Ion Group
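For illustration, a minimal interactive PROOF session of that era might look as follows; the master host name, file URLs and tree name are placeholders and are not taken from the talk:

   $ root
   root [0] gROOT->Proof("proofmaster.example.org")      // connect to the PROOF master (placeholder host)
   root [1] TDSet *d = new TDSet("TTree", "EventTree")    // describe the dataset by tree name (assumed name)
   root [2] d->Add("root://node1.example.org//data/event_1.root")
   root [3] d->Add("root://node2.example.org//data/event_2.root")
   root [4] d->Process("EventTree_Proc.C")                // the same selector also runs locally via TTree::Process()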

Slide 5: PROOF, continued
- Multi-tier architecture
- Optimized for data locality
- WAN ready and Grid compatible
(diagram: the user connects across the Internet to the master, which controls the slaves)

Slide 6: PROOF – Architecture
- Data access strategies: local data first, but also rootd, rfio, SAN/NAS
- Transparency: input objects are copied from the client; output objects are merged and returned to the client
- Scalability and adaptability: packet size varies with the specific workload, slave performance and dynamic load (see the sketch below); heterogeneous servers; migration to multi-site configurations
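The packet-size adaptation can be pictured as a pull model: each slave asks the master for its next packet, and the master sizes it from that slave's recent processing rate. The following is only an illustrative sketch of that idea, with invented class names and a made-up target-time heuristic; it is not the actual TPacketizer code.

   // Illustrative sketch only -- not the real TPacketizer implementation.
   // A packet is sized so that it takes roughly kTargetTime seconds on the
   // requesting slave, based on the rate it achieved on earlier packets.
   #include <algorithm>
   #include <map>
   #include <string>

   struct SlaveStats {
      double eventsDone = 0;   // events processed so far
      double timeSpent  = 0;   // wall-clock seconds spent processing
   };

   class ToyPacketizer {
   public:
      long long NextPacketSize(const std::string &slave)
      {
         const double    kTargetTime = 10.0;                // ~10 s per packet (assumed heuristic)
         const long long kMinSize = 100, kMaxSize = 100000; // clamp to sane bounds

         SlaveStats &s = fStats[slave];
         double rate = (s.timeSpent > 0) ? s.eventsDone / s.timeSpent : 1000.0; // events/s
         long long n = static_cast<long long>(rate * kTargetTime);
         return std::max(kMinSize, std::min(kMaxSize, n));
      }

      void Update(const std::string &slave, long long events, double seconds)
      {
         fStats[slave].eventsDone += events;
         fStats[slave].timeSpent  += seconds;
      }

   private:
      std::map<std::string, SlaveStats> fStats;   // per-slave bookkeeping
   };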

Slide 7: Outline
PROOF Overview; Benchmark Package (Dataset generation, Benchmark TSelector, Statistics and Event Trace); Benchmark results; Other developments; Future plans

Slide 8: Dataset generation
- Uses the ROOT "Event" example class
- A script to create the PAR file is provided
- Data are generated on all nodes that run slaves; the slaves write their data files in parallel
- Location, size and number of files can be specified

   % make_event_par.sh
   % root
   root [0] gROOT->Proof()
   root [1] .x make_event_trees.C("/tmp/data", 100000, 4)
   root [2] .L make_tdset.C
   root [3] TDSet *d = make_tdset()

Slide 9: Benchmark TSelector
Three selectors are used:
- EventTree_NoProc.C – empty Process() function, reads no data
- EventTree_Proc.C – reads all the data and fills a histogram (in this test effectively only 35% of the data is read)
- EventTree_ProcOpt.C – reads a fraction of the data (20%) and fills a histogram (see the fragments below)
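The essential difference between the last two selectors is what they read per entry inside Process(). The fragments below sketch the two styles; the branch and member names are illustrative, and the surrounding class code (written in the MakeSelector style) is omitted.

   // Sketch only: full-read versus partial-read inside TSelector::Process().
   Bool_t EventTree_Proc::Process(Long64_t entry)
   {
      fChain->GetTree()->GetEntry(entry);   // read the complete event
      fHist->Fill(fEvent->GetNtrack());     // fHist, fEvent: selector members (illustrative)
      return kTRUE;
   }

   Bool_t EventTree_ProcOpt::Process(Long64_t entry)
   {
      b_fNtrack->GetEntry(entry);           // read only the branches that are actually needed
      fHist->Fill(fNtrack);                 // b_fNtrack, fNtrack: illustrative branch/leaf members
      return kTRUE;
   }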

Slide 10: Statistics and Event Trace
- Global histograms monitor the master: number of packets, number of events, processing time and get-packet latency, per slave
- They can be viewed using the standard feedback mechanism (see the snippet below)
- Trace tree: a detailed log of events during the query, for the master only or for master and slaves
- A detailed list of the recorded events follows
- Implemented using standard ROOT classes and PROOF facilities
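For illustration, feedback objects are requested by name on the session and drawn as they arrive with TDrawFeedback (listed later under "Other developments"). The histogram name and master host below are placeholders, and the exact calls reflect the API of that era only approximately:

   $ root
   root [0] gROOT->Proof("proofmaster.example.org")
   root [1] gProof->AddFeedback("PROOF_PacketsHist")   // placeholder name of a monitoring histogram
   root [2] TDrawFeedback fb(gProof)                   // updates the feedback objects while the query runs
   root [3] d->Process("EventTree_Proc.C")             // d: the TDSet built earlier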

Slide 11: Events recorded in the Trace
- Each event carries a timestamp and identifies the recording slave or master
- Begin and end of query
- Begin and end of file
- Packet details and processing time
- File open statistics (slaves)
- File read statistics (slaves)
- New event types are easy to add (a hypothetical layout is sketched below)
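Because the trace is an ordinary ROOT TTree, adding an event type amounts to filling another entry. Below is a hypothetical layout with invented branch names; it is not the actual PROOF trace format.

   // Hypothetical trace-tree layout (branch names invented for illustration).
   #include "TFile.h"
   #include "TTree.h"
   #include <cstring>

   void make_trace_tree()
   {
      TFile f("trace.root", "RECREATE");
      TTree t("PROOF_Trace", "query event trace");

      Double_t timestamp;        // seconds since the start of the query
      Int_t    slave;            // slave ordinal, -1 for the master
      Char_t   type[32];         // e.g. "PacketDone", "FileOpen"
      Double_t proctime;         // processing time, where applicable

      t.Branch("timestamp", &timestamp, "timestamp/D");
      t.Branch("slave",     &slave,     "slave/I");
      t.Branch("type",      type,       "type/C");
      t.Branch("proctime",  &proctime,  "proctime/D");

      // record one example event
      timestamp = 12.34; slave = 3; strcpy(type, "PacketDone"); proctime = 9.8;
      t.Fill();

      t.Write();
   }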

Slide 12: Outline
PROOF Overview, Benchmark Package, Benchmark results, Other developments, Future plans

Slide 13: Benchmark Results
- CDF cluster at Fermilab: 160 nodes, initial tests
- Pharm, the PHOBOS private cluster, 24 nodes:
  - 6 nodes, dual 730 MHz P3
  - 6 nodes, dual 930 MHz P3
  - 12 nodes, dual 1.8 GHz P4
- Dataset: one 100 MB file of events per slave

Slide 14: Results on Pharm (figure)

Slide 15: Results on Pharm, continued (figure)

Slide 16: Local and remote file open (figure: local vs remote)

Slide 17: Slave I/O Performance (figure)

Slide 18: Benchmark Results
- Phobos-RCF, the central facility at BNL, 370 nodes in total:
  - 75 nodes, dual 3.05 GHz P4, IDE disks
  - 99 nodes, dual 2.4 GHz P4, IDE disks
  - 18 nodes, dual 1.4 GHz P3, IDE disks
- Dataset: one 100 MB file of events per slave

Slide 19: PHOBOS RCF LAN Layout (diagram)

Slide 20: Results on Phobos-RCF (figure)

Slide 21: Looking at the problem (figure)

Slide 22: Processing time distributions (figure)

Slide 23: Processing time, detailed (figure)

Slide 24: Request packet from Master (figure)

Slide 25: Benchmark Conclusions
- The benchmark and measurement facility has proven to be a very useful tool
- Do not use NFS-based home directories
- LAN topology is important
- LAN speed is important
- More testing is required to pinpoint the sporadic long latencies

Slide 26: Outline
PROOF Overview, Benchmark Package, Benchmark results, Other developments, Future plans

Slide 27: Other developments
- Packetizer fixes and a new development version
- Parallel PROOF startup
- TDrawFeedback
- TParameter utility class (usage sketch below)
- TCondor improvements
- Authentication improvements
- Introduction of Long64_t
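As an illustration of the TParameter utility class: a named value can be placed on the input list on the client and read back inside the selector. The parameter name below is invented for the example, and the calls are only an approximation of that era's API.

   // Client side: ship a named value to the slaves with the query input list.
   // proof: the PROOF session handle (e.g. gProof).
   proof->AddInput(new TParameter<Long64_t>("MaxEventsPerFile", 100000));

   // Selector side (e.g. in SlaveBegin()): fInput holds the objects shipped by the client.
   TParameter<Long64_t> *p =
      dynamic_cast<TParameter<Long64_t>*>(fInput->FindObject("MaxEventsPerFile"));
   Long64_t maxEvents = p ? p->GetVal() : -1;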

Slide 28: Outline
PROOF Overview, Benchmark Package, Benchmark results, Other developments, Future plans

Slide 29: Future plans
- Understand and solve the LAN latency problem
- In the prototype stage: TProof::Draw(), multi-level master configuration
- Documentation: HowTo, benchmarking
- PEAC, the PROOF Grid scheduler

Slide 30: The End. Questions?

Slide 31: Parallel Script Execution (backup)
Diagram: a local PC running ROOT, connected over the network to a remote PROOF cluster with one master proofserv and slave proofservs on node1 to node4; each slave reads its local *.root files through TFile, remote files are reached via TNetFile. The cluster is described by proof.conf:

   # proof.conf
   slave node1
   slave node2
   slave node3
   slave node4

The same analysis script ana.C (results on stdout / as objects) can be run three ways:

   # locally as a script
   $ root
   root [0] .x ana.C

   # locally on a tree
   $ root
   root [0] tree->Process("ana.C")

   # in parallel on the PROOF cluster
   $ root
   root [0] gROOT->Proof("remote")
   root [1] dset->Process("ana.C")

Slide 32: Simplified message flow (Client, Master, Slaves)
- Client → Master: SendFile, Process(dset, sel, inp, num, first)
- Master → Slave(s): GetEntries, Process(dset, sel, inp, num, first)
- Slave(s) → Master: GetPacket (repeatedly while processing)
- Master → Client: ReturnResults(out, log)

Slide 33: TSelector control flow (a skeleton sketch follows below)
- Client (TProof): Begin() is called, then the input objects are sent to the slaves
- Slave(s): SlaveBegin(), then Process() for every assigned entry, then SlaveTerminate(); the output objects are returned to the master
- Client (TProof): Terminate() is called once the merged output objects are back
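For reference, a selector following this control flow has roughly the shape below. This is only a sketch with illustrative class and member names; real selectors are typically generated with TTree::MakeSelector() and then filled in.

   // Sketch of a TSelector skeleton (illustrative, not generated code).
   #include "TSelector.h"
   #include "TTree.h"
   #include "TH1F.h"

   class MySelector : public TSelector {
   public:
      TTree *fChain;   // tree or chain being processed
      TH1F  *fHist;    // output histogram

      MySelector() : fChain(0), fHist(0) {}
      virtual Int_t Version() const { return 2; }   // use the Process(Long64_t) interface

      virtual void Begin(TTree *) { /* client only, before the query starts */ }

      virtual void SlaveBegin(TTree *) {
         // on every slave: create output objects and register them for merging
         fHist = new TH1F("h_ntrack", "tracks per event", 100, 0, 1000);
         fOutput->Add(fHist);
      }

      virtual void   Init(TTree *tree) { fChain = tree; /* set branch addresses here */ }
      virtual Bool_t Notify() { return kTRUE; }      // called whenever a new file is opened

      virtual Bool_t Process(Long64_t entry) {
         // on a slave, for every entry handed out by the packetizer
         fChain->GetTree()->GetEntry(entry);
         // ... fill fHist from the event data ...
         return kTRUE;
      }

      virtual void SlaveTerminate() { /* per-slave cleanup */ }

      virtual void Terminate() {
         // client only: fOutput now holds the merged objects from all slaves
         TH1F *h = (TH1F *) fOutput->FindObject("h_ntrack");
         if (h) h->Draw();
      }

      ClassDef(MySelector, 0);
   };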

Slide 34: PEAC System Overview (diagram)

Slide 35: Active Files during Query (figure)

Slide 36: Pharm Slave I/O (figure)

Slide 38: Active Files during Query (figure)