PROOF - Parallel ROOT Facility
Fons Rademakers
CERN School of Computing (CSC), September 2002

Bring the KB to the PB, not the PB to the KB.

PROOF
- Collaboration between the core ROOT group at CERN and the MIT Heavy Ion Group (Fons Rademakers, Maarten Ballintijn)
- Part of, and based on, the ROOT framework: makes heavy use of the ROOT networking and other infrastructure classes
- Currently no external technologies

Parallel ROOT Facility
The PROOF system allows:
- parallel analysis of trees in a set of files
- parallel analysis of objects in a set of files
- parallel execution of scripts on clusters of heterogeneous machines
Its design goals are transparency, scalability and adaptability.
A prototype was developed in 1997 as a proof of concept; the full version is nearing completion now.

Parallel Script Execution

[Diagram: a ROOT client on a local PC connected to a remote PROOF cluster, consisting of a master server and slave servers on node1-node4; data files (*.root) are accessed through TFile/TNetFile and stdout/objects flow back to the client.]

The cluster configuration file lists the slave nodes:

# proof.conf
slave node1
slave node2
slave node3
slave node4

Client sessions shown on the slide, from plain local execution to a parallel PROOF query:

$ root
root [0] .x ana.C

$ root
root [0] .x ana.C
root [1] gROOT->Proof("remote")

$ root
root [0] tree.Process("ana.C")
root [1] gROOT->Proof("remote")
root [2] chain.Process("ana.C")
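For concreteness, a minimal analysis macro along the lines of the ana.C used above might look as follows. This is only a sketch: the file name "events.root" and the branch name "pt" are assumptions, and the tree name "AOD" is taken from the later slides of this talk.

// ana.C -- hypothetical minimal analysis macro; names are illustrative only
#include <TFile.h>
#include <TTree.h>

void ana()
{
   TFile *f = TFile::Open("events.root");   // assumed input file name
   if (!f) return;
   TTree *t = (TTree *) f->Get("AOD");      // tree name as used on later slides
   if (!t) return;
   t->Draw("pt");                           // "pt" is a hypothetical branch
}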

Data Access Strategies
- Each slave gets assigned, as much as possible, packets representing data in files local to that slave (a conceptual sketch follows)
- If no (more) local data is available, remote data is fetched via rootd or rfio (needs a good LAN, e.g. Gigabit Ethernet)
- In the case of SAN/NAS storage a simple round-robin strategy is used
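The sketch below is a conceptual illustration of such locality-aware packet assignment, not the actual PROOF packetizer code; all class and member names here are invented.

// Conceptual sketch of locality-aware packet assignment (names invented,
// this is not the real PROOF packetizer API).
#include <deque>
#include <map>
#include <string>

struct Packet {
   std::string file;     // file containing the entries
   long long   first;    // first entry of the packet
   long long   num;      // number of entries in the packet
};

class SimplePacketizer {
   std::map<std::string, std::deque<Packet> > fLocal;  // packets keyed by the host owning the file
   std::deque<Packet>                          fRemote; // packets on shared (SAN/NAS) storage
public:
   void Add(const std::string &host, const Packet &p) {
      if (host.empty()) fRemote.push_back(p);           // no preferred host: shared storage
      else              fLocal[host].push_back(p);
   }
   // Called when a slave asks for work ("pull" architecture).
   bool NextPacket(const std::string &slaveHost, Packet &out) {
      std::map<std::string, std::deque<Packet> >::iterator it = fLocal.find(slaveHost);
      if (it != fLocal.end() && !it->second.empty()) {   // 1) prefer data local to the slave
         out = it->second.front(); it->second.pop_front(); return true;
      }
      if (!fRemote.empty()) {                            // 2) shared storage: served in order to
         out = fRemote.front(); fRemote.pop_front();     //    whichever slave asks next
         return true;
      }
      for (it = fLocal.begin(); it != fLocal.end(); ++it)
         if (!it->second.empty()) {                      // 3) finally take another slave's packets
            out = it->second.front(); it->second.pop_front(); return true;
         }
      return false;                                      // no work left
   }
};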

PROOF Transparency
- On demand, any objects created in the client are made available to the PROOF servers
- All objects created on the PROOF slaves are returned to the client
- The master server will try to add ("merge") the partial objects coming from the different slaves before sending them to the client (sketched below)
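As an illustration of this merging step, assuming the partial results are histograms, the master could combine them roughly as below. TList, TIter and TH1::Add are standard ROOT interfaces; the function itself is a simplified stand-in for the internal PROOF merging logic, not its actual code.

// Simplified illustration of merging partial histograms on the master.
#include <TList.h>
#include <TH1.h>

TH1 *MergePartialHistograms(TList *partials)
{
   if (!partials || partials->GetSize() == 0) return 0;
   TH1 *sum = (TH1 *) partials->First()->Clone("merged");  // start from the first slave's result
   TIter next(partials);
   next();                                                 // skip the first entry, already cloned
   while (TH1 *h = (TH1 *) next())
      sum->Add(h);                                         // add the other slaves' partial results
   return sum;                                             // this is what gets sent to the client
}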

PROOF Scalability
- Scalability in parallel systems is determined by the amount of communication overhead (Amdahl's law)
- Varying the packet size allows one to tune the system: the larger the packets, the less communication is needed and the better the scalability
- Disadvantage: large packets make the system less adaptive to varying conditions on the slaves
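For reference, Amdahl's law bounds the speedup obtainable with N slaves when a fraction p of the work can be parallelized (p and N are generic symbols here, not quantities measured for PROOF):

S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1 - p}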

PROOF Adaptability
- Adaptability means being able to adapt to varying conditions (load, disk activity) on the slaves
- With a "pull" architecture the slaves determine their own processing rate, while the master controls the amount of work handed out (see NextPacket in the sketch after the Data Access Strategies slide)
- Disadvantage: too fine-grained packet size tuning hurts scalability

PROOF Error Handling
- Handling the death of PROOF servers:
  - death of the master is fatal; the client needs to reconnect
  - on the death of a slave, the master can resubmit the packets of the dead slave to the other slaves (sketched below)
- Handling of Ctrl-C: an out-of-band (OOB) message is sent to the master and forwarded to the slaves, causing a soft or hard interrupt
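The snippet below sketches the resubmission idea; the containers and names are invented for illustration and this is not PROOF code.

// Conceptual sketch of packet resubmission when a slave dies.
#include <deque>
#include <map>
#include <string>

struct Packet { std::string file; long long first, num; };

// Packets handed out but not yet confirmed as processed, keyed by slave name.
static std::map<std::string, std::deque<Packet> > gInFlight;
// Packets still waiting to be assigned.
static std::deque<Packet> gPending;

// Called by the master when it detects the death of a slave: the packets that
// slave still had in flight go back into the pending queue, from where they
// will be pulled by the surviving slaves.
void OnSlaveDeath(const std::string &slave)
{
   std::deque<Packet> &lost = gInFlight[slave];
   gPending.insert(gPending.end(), lost.begin(), lost.end());
   gInFlight.erase(slave);
}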

PROOF Authentication
- PROOF supports secure and insecure authentication mechanisms
- Insecure: mangled password sent over the network
- Secure: SRP, the Secure Remote Password protocol (Stanford Univ.), based on public key technology; Kerberos 5
- Soon: Globus authentication

PROOF Grid Interface
- PROOF can use a Grid Resource Broker to detect which nodes in a cluster can be used in the parallel session
- PROOF can use a Grid File Catalogue and Replication Manager to map LFNs to chains of PFNs
- PROOF can use Grid Monitoring Services
- Access will be via an abstract Grid interface (see the TGrid class below)

Running a PROOF Job

// Analyze generic data sets in parallel
gROOT->Proof();
TDSet *objset = new TDSet("MyEvent", "*", "/events");
objset->Add("lfn://alien.cern.ch/alice/prod2002/file1");
...
objset->Add(set2003);
objset->Process("myselector.C++");

// Analyze TChains in parallel
gROOT->Proof();
TChain *chain = new TChain("AOD");
chain->Add("lfn://alien.cern.ch/alice/prod2002/file1");
...
chain->Process("myselector.C");
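The myselector.C referred to above is a TSelector-style script. A minimal skeleton is sketched below, using the TSelector interface as documented in later ROOT releases (Begin/SlaveBegin/Process/Terminate); the exact interface available in the 2002 release may differ, and the histogram name "hpt" and branch "pt" are purely illustrative.

// myselector.C -- sketch of a PROOF selector skeleton (illustrative only).
#include <TSelector.h>
#include <TTree.h>
#include <TH1F.h>

class myselector : public TSelector {
public:
   TTree *fChain;   // the tree/chain being processed
   TH1F  *fHpt;     // example output histogram

   myselector() : fChain(0), fHpt(0) { }

   Int_t Version() const { return 2; }   // request the Process(Long64_t) interface

   void Init(TTree *tree) { fChain = tree; }

   void SlaveBegin(TTree *) {
      // Runs on each slave: create the partial output objects.
      fHpt = new TH1F("hpt", "p_{T} distribution", 100, 0., 10.);
      fOutput->Add(fHpt);               // objects in fOutput are merged by the master
   }

   Bool_t Process(Long64_t entry) {
      // Runs on the slaves for each entry of the assigned packets.
      // Read the entry (e.g. via fChain->GetEntry(entry)) and fill the histograms here.
      return kTRUE;
   }

   void Terminate() {
      // Runs on the client after merging: draw or save the objects in fOutput.
   }

   ClassDef(myselector, 0);
};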

Different PROOF Scenarios – Static, Stand-Alone
This scheme assumes:
- no third-party grid tools
- a remote cluster containing the data files of interest
- PROOF binaries and libraries installed on the cluster
- PROOF daemon startup via (x)inetd
- per-user or per-group authentication set up by the cluster owner
- a static, basic PROOF config file
In this scheme the user knows his data sets are on the specified cluster. From his client he initiates a PROOF session on the cluster. The master server reads the config file and fires up as many slaves as described there. The user issues queries to analyse the data in parallel and enjoys near real-time response on large queries.
Pros: easy to set up.
Cons: not flexible under changing cluster configurations, resource availability, authentication, etc.

Different PROOF Scenarios – Dynamic, PROOF in Control
This scheme assumes:
- a grid resource broker, file catalog, meta-data catalog and possibly a replication manager
- PROOF binaries and libraries installed on the cluster
- PROOF daemon startup via (x)inetd
- grid authentication
In this scheme the user queries a meta-data catalog to obtain the set of required files (LFNs); the system then asks the resource broker where best to run depending on that set of LFNs and initiates a PROOF session on the designated cluster. On the cluster the slaves are created by querying the (local) resource broker, and the LFNs are converted to PFNs. The query is then performed.
Pros: uses grid tools for resource and data discovery; grid authentication.
Cons: requires preinstalled PROOF daemons; the user must be authorized to access the resources.

Different PROOF Scenarios – Dynamic, AliEn in Control
This scheme assumes:
- AliEn as resource broker and grid environment (taking care of authentication, possibly via Globus)
- the AliEn file catalog, meta-data catalog and replication manager
In this scheme the user queries a meta-data catalog to obtain the set of required files (LFNs) and then hands over the PROOF master/slave creation to AliEn via an AliEn job. AliEn finds the best resources, copies the PROOF executables and starts the PROOF master; the master then connects back to the ROOT client on a specified port (the callback port was passed as an argument to the AliEn job). In turn the slave servers are started via the same mechanism. Once the connections have been set up, the system proceeds as in the previous scenario.
Pros: uses AliEn for resource and data discovery; no pre-installation of PROOF binaries; can run on any AliEn-supported cluster; fully dynamic.
Cons: no guaranteed immediate response due to the absence of dedicated "interactive" queues.

Different PROOF Scenarios – Dynamic, Condor in Control
This scheme assumes:
- Condor as resource broker and grid environment (taking care of authentication, possibly via Globus)
- a grid file catalog, meta-data catalog and replication manager
This scheme is basically the same as the previous AliEn-based one, except that in the Condor environment Condor manages free resources: as soon as a slave node is reclaimed by its owner, Condor kills or suspends the slave job. Before either of those events Condor sends a signal to the master so that it can restart the slave somewhere else and/or re-schedule the work of that slave on the other slaves.
Pros: uses grid tools for resource and data discovery; no pre-installation of PROOF binaries; can run on any Condor pool; no specific authentication; fully dynamic.
Cons: no guaranteed immediate response due to the absence of dedicated "interactive" queues; slaves can come and go.

TGrid Class – Abstract Interface to AliEn

class TGrid : public TObject {
public:
   virtual Int_t        AddFile(const char *lfn, const char *pfn) = 0;
   virtual Int_t        DeleteFile(const char *lfn) = 0;
   virtual TGridResult *GetPhysicalFileNames(const char *lfn) = 0;
   virtual Int_t        AddAttribute(const char *lfn, const char *attrname,
                                     const char *attrval) = 0;
   virtual Int_t        DeleteAttribute(const char *lfn, const char *attrname) = 0;
   virtual TGridResult *GetAttributes(const char *lfn) = 0;
   virtual void         Close(Option_t *option = "") = 0;
   virtual TGridResult *Query(const char *query) = 0;

   static TGrid *Connect(const char *grid, const char *uid = 0, const char *pw = 0);

   ClassDef(TGrid,0)   // ABC defining interface to GRID services
};

Running PROOF Using AliEn

TGrid *alien = TGrid::Connect("alien");
TGridResult *res;
res = alien->Query("lfn:///alice/simulation/ /V0.6*.root");
TDSet *treeset = new TDSet("TTree", "AOD");
treeset->Add(res);
gROOT->Proof(res);   // use files in result set to find remote nodes
treeset->Process("myselector.C");
// plot/save objects produced in myselector.C
...