PROOF Status and Perspectives G. GANIS CERN / LCG VII ROOT Users workshop, CERN, March 2007.


1 PROOF Status and Perspectives G. GANIS CERN / LCG VII ROOT Users workshop, CERN, March 2007

2 Outline
27/03/2007, G. Ganis, ROOT Users Workshop
- (Very) quick introduction
- What’s new since ROOT05
- Current developments and plans

3 PROOF in a slide
- PROOF: a dynamic approach to end-user HEP analysis on distributed systems, exploiting the intrinsic parallelism of HEP data (see Backup slides)
[Diagram: PROOF-enabled facility. The client sends commands and scripts to the top master and receives a list of output objects (histograms, …); the top master drives sub-masters, one per geographical domain, each with its own workers and MSS.]

4 PROOF aspects / issues
- Connection layer
  - Xrootd, Authentication, Error handling
- Software distribution
  - Optimized package / class handling
- Data access
  - Optimized distribution of data on worker nodes
- Classification / handling of the results
  - Query result manager
- Resource sharing among users
  - Client gets one ROOT session on each machine
  - Scheduling

5 What’s new since ROOT05
- Connection layer based on XROOTD
- Coordinator functionality
- Full implementation of “interactive batch” model
- Dataset management
- Packetizer improvements
- Progress in uploading / enabling additional software
- Restructuring of the PROOF modules
- Progress in the integration with experiment software
- PROOF Wiki pages
- ALICE experience at the CAF (see J.F. Grosse-Oetringhaus talk)

6 Coordinator functionality
- Independent channel to control the cluster
  - Global view
  - Independent access to information (e.g. log files)
- Needed for full implementation of “interactive batch”
- Not directly achievable with proofd
  - Daemon instance “disappearing” into proofserv
  - Session lifetime same as client connection lifetime
  - Parent proofd not aware of its children
- Natural candidate: XROOTD
  - Lightweight, industrial-strength networking and protocol handler

7 New connection layer based on XROOTD
- New PROOF-related protocol: XrdProofdProtocol (XPD)
  - XPD launches and controls PROOF sessions (proofserv)
- Client connection (XrdProofConn) based on XrdClient
  - Concept of physical (per client) / logical (per session) connection
  - Asynchronous reading via dedicated thread
    - Messages read as soon as available and added to a queue
    - Sets up a control interrupt network independent of OOB
- Cleaner security system
  - Physical connection authenticated
  - Associated logical connections inherit the “token”
- Client disconnection / reconnection handled naturally

8 XPD role
- XrdProofdProtocol: client gateway to proofserv
- XrdXrootdProtocol: client gateway to files
[Diagram: two parallel stacks. PROOF farm: client, XPD (links, XrdProofdProtocol, static area, MT stuff), proofserv, worker servers. File server: client, XROOTD (links, XrdXrootdProtocol, MT stuff), files.]

9 XPD communication layer
[Diagram: PROOF farm. Clients, the master and workers 1 … n are connected by XRD links (TXSocket; endpoints labelled xc and XS). XrdProofd fork()s proofserv on the master and proofslave on each worker.]

10 Stateless connection and “Interactive batch”
- “Interactive batch”: flexible submission system keeping the advantages of interactivity and batch
  - If a query is taking too long, the user can abort it, stop it and retrieve the results, or leave it running on the system and come back later to browse / retrieve / archive the results
- Ingredients
  - Non-blocking running mode (v5.04.00, ROOT05)
  - Query result management (v5.04.00, ROOT05)
  - Stateless client connection (v5.08.00)
  - Ctrl-Z functionality (soon)

11 Exploiting the coordinator: client side
- Not yet fully exploited: new functionality added regularly
- Examples:
  - Log retrieval
    - TProofLog contains log files as TMacro and implements display, grep, save, … functionality
        root[] TProofLog *pl = TProof::Mgr("user@master")->GetSessionLogs()
        root[] pl->Grep("violation")
  - Session reset
    - Cleanup of user’s entry in the coordinator
    - Only way out when something bad happens
        root[] TProof::Reset("user@master")

12 Exploiting the coordinator: server side
- Static control of resource usage
  - Max number of users
  - Max number of workers per user
- Access, usage control
  - Role of server
  - List of users allowed to connect
- Define ROOT versions available on the cluster
  - Extendable to packages
- …

13 Dataset uploader
- Optimized distribution of data files on the farm using XROOTD functionality
  - By direct upload
  - By staging out from mass storage
- Direct upload
  - Sources: local directory, list of URLs
  - XROOTD/OLBD pool ensures optimal distribution
  - No special configuration (except for clean-up)
- Using a stager
  - Requires XROOTD configuration
  - e.g. CASTOR for ALICE @ CAF

14 Dataset manager
- Data-sets are identified by name
- Data-sets can be retrieved by name to automatically create TDSet’s
    root[0] TProof *proof = TProof::Open("master");
    root[1] proof->UploadDataSet("MCppH", "/data1/mc/ppH_*");
    Uploading file:///data1/mc/ppH_01.root to \
       root://poolurl//poolpath/ppH_01.root
    [TFile::Cp] Total 20.34 MB |===============| 100.00 % [6.9 MB/s]
    root[2] proof->ShowDataSets();
    Existing Datasets:
    MCppH
    root[] TDSet *dset = new TDSet(proof->GetDataSet("MCppH"));

15 Dataset manager
- Metadata stored in sandbox on the master
  - New sub-directory /dataset
- Concept of private / public data-sets
  - User’s private definitions
    - readable / writable by owner only
  - User’s public definitions
    - readable by anybody
  - Global public definitions
    - Workgroup- / experiment-wide (e.g. 2008 runs)
    - readable by anybody (group restrictions?)
    - writable by privileged account

16 Packetizer improvements
- Packetizer’s goal: optimize work distribution to process queries as fast as possible
- Standard TPacketizer’s strategy
  - first process local files, then try to process remote data
- End-of-query bottleneck
[Plot: active workers versus processing time]

17 New strategy: TAdaptivePacketizer
- Predict processing time of local files for each worker
- Keep assigning remote files from the start of the query to workers expected to finish faster
- Processing time improved by up to 50%
[Plots, same scale, OLD versus NEW: processing rate for all packets; remote packets]
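
The adaptive idea on this slide can be illustrated with a minimal sketch. It is not the TAdaptivePacketizer code: the `WorkerLoad` structure, the function names and the use of a single measured rate per worker are assumptions made for illustration. Each worker's finishing time is predicted from its pending data and its rate, and every remote file goes to the worker currently expected to finish first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-worker bookkeeping; not the TAdaptivePacketizer data model.
struct WorkerLoad {
    double pendingBytes;     // data still to be processed by this worker
    double rateBytesPerSec;  // measured processing rate of this worker
};

// Predicted time (seconds) for a worker to finish its pending data.
double predictedFinish(const WorkerLoad &w) {
    assert(w.rateBytesPerSec > 0.0);
    return w.pendingBytes / w.rateBytesPerSec;
}

// Index of the worker expected to finish first (lowest index wins ties).
std::size_t pickWorkerForRemoteFile(const std::vector<WorkerLoad> &workers) {
    assert(!workers.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < workers.size(); ++i) {
        if (predictedFinish(workers[i]) < predictedFinish(workers[best])) best = i;
    }
    return best;
}

// Assign one remote file: the chosen worker's pending data grows by the file
// size, so the next prediction takes the new assignment into account.
std::size_t assignRemoteFile(std::vector<WorkerLoad> &workers, double fileBytes) {
    const std::size_t chosen = pickWorkerForRemoteFile(workers);
    workers[chosen].pendingBytes += fileBytes;
    return chosen;
}
```

The sketch ignores which node actually hosts the remote file and the cost of reading over the network; a real packetizer would have to weigh both.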

18 Progress in using additional software
- Package enabling
  - Separated behaviour client / cluster
  - Real-time feedback during build
- Load mechanism extended to single class / macro
    root[] TProof *proof = TProof::Open("master")
    root[] proof->Load("MyClass.C")
- Selectors / macros / classes binaries are now cached
  - Decreases initialization time
- API to modify include / library paths on the workers
  - Use packages globally available on the cluster

19 Restructuring of PROOF modules
- Reduce dependencies
  - Better control size of executables (proofserv)
  - Faster worker startup
- First step:
  - Get rid of TVirtualProof and PROOF dependencies in ‘tree’
  - All PROOF in ‘proof’, ‘proofx’, ‘proofd’
  - Still ‘proofserv’ needs a lot of libs
- 2nd step (current situation):
  - Separate out TProofPlayer, TPacketizer, … in ‘proofplayer’ (new libProofPlayer, v5.15.04)
  - proofserv size on workers reduced by a factor of ~2 at startup

20 Further optimization of PROOF libs
- Differentiate setups on client and cluster
- Client:
  - Needs graphics
  - May not need all experiment software
  - TSelector: compile only Begin() and Terminate()
- Servers:
  - Need all experiment software
  - Do not need graphics
  - TSelector: do not compile Begin() and Terminate()
- Client and Server versions of basic libs

21 Additional improvements (incomplete)
- GUI controller
  - Integration of the data set manager
  - Integration of the new features of package manager
  - Improved session / query history bookkeeping
- Improved user-friendliness of parameter setting
    root[] TProof *proof = TProof::Open("master")
    root[] proof->SetParameter("factor", 1.1)
- Automatic support for dynamic environment setting
  - proofserv is a script launching proofserv.exe
  - Envs to define the context in which to run
  - Useful for experiment-specific settings (see later) and/or for debugging purposes (e.g. run valgrind on worker …)

22 Integration with experiment software
- Finding, using the experiment software
  - Environment settings, libraries loading
- Implementing the analysis algorithms
  - TSelector framework
    - Structured analysis and automated interaction with trees (chains) (+)
    - Tightly coupled with the tree (-)
      - New analysis implies new selector
      - Change in the tree definition implies a new selector
    - May conflict with existing experiment technologies
  - Add new layer to hide details irrelevant for the end-user

23 Setting the environment
- Experiment software available on nodes
- Additional dedicated software handled by the PROOF package manager
  - Allows user to run her/his own modifications
- The experiment environment can be set
  - Statically (e.g. ALICE)
    - before starting xrootd (inherited by proofserv)
  - Dynamically (e.g. CMS)
    - evaluating a user defined script in front of proofserv
    - Allows selecting different versions at run time

24 Dynamic environment setting: CMS
- CMS needs to run SCRAM before proofserv
- PROOF_INITCMD contains the path of a script (NEW)
    TProof::AddEnvVar("PROOF_INITCMD", "~maartenb/proj/cms/CMSSW_1_1_1/setup_proof.sh")
- The script initializes the CMS environment using SCRAM
    #!/bin/sh
    # Export the architecture
    export SCRAM_ARCH=slc3_ia32_gcc323
    # Init CMS defaults
    cd ~maartenb/proj/cms/CMSSW_1_1_1
    . /app/cms/cmsset_default.sh
    # Init runtime environment
    scramv1 runtime -sh > /tmp/dummy
    cat /tmp/dummy

25 Examples of implementing analysis algorithms
- ALICE:
  - Generic AliSelector hiding details
  - User’s selector derives from AliSelector
  - Access to ESD event by member fESD
  - Alternative technology using tasks
    - See J.F. Grosse-Oetringhaus talk
- TAM technology @ PHOBOS
  - Based on modularized tasks
  - Separate analysis tasks from interaction with tree
  - See C. Reed at ROOT05

26 Analysis algorithms in CMS
- CMSSW: provides EDAnalyzer for analysis
- Algorithms with a well defined interface can be used with both technologies (EDAnalyzer and TSelector)
    class MyAnalysisAlgorithm {
      void process( const edm::Event & );
      void postProcess( TList & );
      void terminate( TList & );
    };
- Used in a TSelector templated framework TFWLiteSelector
- Selector libraries distributed as PAR file
    // Load framework library
    gSystem->Load("libFWCoreFWLite");
    // Load TSelector library
    gSystem->Load("libPhysicsToolsParallelAnalysis");
    TSelector *mysel = new TFWLiteSelector

27 Current developments and plans
- Scheduling
- Consolidation, error handling
  - Improved, but there are still cases when we lose control of the session
- Processing error report
  - Associate to a query an object detailing what went wrong (e.g. data set elements not analyzed) and why
- Non-input-file-driven analysis
  - Current processing is based on tree or object files
- Local multi-core desktop optimization
  - No daemons, UNIX sockets (no master?)
- GUI: integration in a more general GUI ROOT controller

28 PROOF exploiting multi-cores
- ALICE search for π0’s
  - 4 GB simulated data
- Instantaneous rates (evt/s, MB/s)
- Clear advantage of quad core
  - Additional computing power fully exploited
- Demo at Intel Quad-Core Launch – Nov 2006

29 PROOF: scheduling multi-users
- Fair resource sharing
  - System scheduler not enough if N users >= ~ N workers / 2
- Enforce priority policies
- Two approaches
  - Quota-based worker level load balancing
    - Simple and solid implementation, no central unit
    - Group quotas defined in the configuration file
  - Central scheduler
    - Per-query decisions based on cluster load, resources needed by the query, user history and priorities
    - Generic interface to external schedulers planned
      - MAUI, LSF, …

30 Quota-based worker level load balancing
- Lower priority processes slowdown
  - sleep before next packet request
- Sleeping time proportional to the used CPU time
  - factor depends on # users and the quotas
- Example: userA, quota 2/3; userB, quota 1/3
  - After T seconds:
    - CPU(A) = T/2, CPU(B) = T/2
    - Sleep B for T/2 seconds
  - After T + T/2 seconds:
    - CPU(A) = T/2 + T/2 = T = 2 * CPU(B), with CPU(B) = T/2
- General case of N users brings a tri-diagonal linear system
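
The two-user example can be written as a small function. This is a sketch under simplifying assumptions that go beyond the slide: exactly two users share one worker, each gets half of the CPU while both are active, and user A gets the whole CPU while B sleeps. The name `sleepTimeForB` is hypothetical, and the N-user case mentioned on the slide is not covered.

```cpp
#include <cassert>

// Sleep time (seconds) imposed on the lower-quota user B after both users
// ran concurrently for `elapsed` seconds, each using half of the CPU.
// While B sleeps, A gets the whole CPU, so after the sleep
//   CPU(A) = elapsed/2 + sleep,   CPU(B) = elapsed/2.
// Requiring CPU(A) / CPU(B) = quotaA / quotaB gives the value returned below,
// which is proportional to the CPU time B has used.
double sleepTimeForB(double elapsed, double quotaA, double quotaB) {
    assert(elapsed >= 0.0 && quotaB > 0.0 && quotaA >= quotaB);
    const double cpuB = elapsed / 2.0;
    return cpuB * (quotaA / quotaB - 1.0);
}
```

With quotas 2/3 and 1/3 the factor is 1, so B sleeps for T/2 after T seconds, as on the slide; with equal quotas the sleep time is zero.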

31 Quota-based worker level load balancing
- Group quotas defined in the xrootd configuration file
    xpd.group tpc usra,usrb
    xpd.grpparam tpc quota:70%
- Factors recalculated by the master XPD each time a user starts or ends processing
- Only active users considered
  - A low priority user will get 100% of resources when alone
- Under Linux, SCHED_RR system scheduling is enforced for the processes
  - The default, dynamic, SCHED_OTHER scheme defeats the whole idea, as sleeping processes get higher priority at restart

32 Demo
- Same sample analysis (h1, slightly slowed down) repeated 20 times
- 2 users
  - gganis: reserved quota 70%
  - ganis: taking what is left
- Histograms show processing rate in MB/s

33 Demo

34 Central scheduling
- Entity running on master XPD, loaded as plug-in
  - Abstract interface XrdProofSched defined
    class XrdProofSched {
      …
    public:
      virtual int GetWorkers(XrdProofServProxy *xps, std::list &wrks)=0;
      …
    };
- Input:
  - Query info (via XrdProofServProxy -> proofserv)
  - Cluster status via OLBD control network
  - Policy
- Output:
  - List of workers to continue with
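
To make the plug-in idea concrete, here is a self-contained toy sketch of a scheduler behind an abstract interface. None of the types below belong to XrdProofd: `ToyWorker`, `ToyQuery`, `ToySched` and `LeastLoadedSched` are hypothetical stand-ins, and the "least loaded first" policy is only one conceivable choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-ins, NOT the XrdProofd classes.
struct ToyWorker { std::string host; double load; };  // load in [0, 1]
struct ToyQuery  { std::size_t maxWorkers; };         // resources requested

// Abstract policy interface: input is query info and cluster status,
// output is the list of workers the query may continue with.
class ToySched {
public:
    virtual ~ToySched() = default;
    // Fills `wrks` and returns the number of workers granted.
    virtual int GetWorkers(const ToyQuery &query,
                           const std::vector<ToyWorker> &cluster,
                           std::list<ToyWorker> &wrks) = 0;
};

// Example policy: grant the least-loaded workers, skipping overloaded ones.
class LeastLoadedSched : public ToySched {
public:
    explicit LeastLoadedSched(double maxLoad) : fMaxLoad(maxLoad) {
        assert(maxLoad >= 0.0);
    }
    int GetWorkers(const ToyQuery &query,
                   const std::vector<ToyWorker> &cluster,
                   std::list<ToyWorker> &wrks) override {
        std::vector<ToyWorker> sorted(cluster);
        std::sort(sorted.begin(), sorted.end(),
                  [](const ToyWorker &a, const ToyWorker &b) { return a.load < b.load; });
        wrks.clear();
        for (const ToyWorker &w : sorted) {
            // Stop when the request is satisfied or only overloaded workers remain.
            if (wrks.size() >= query.maxWorkers || w.load > fMaxLoad) break;
            wrks.push_back(w);
        }
        return static_cast<int>(wrks.size());
    }
private:
    double fMaxLoad;  // workers above this load are not handed out
};
```

Keeping the policy behind a pure virtual method is what allows different schedulers, including wrappers around external ones, to be swapped without touching the caller.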

35 Central scheduling
- Schematic view
[Diagram labels: Client, Master, Scheduler, TProof, TProofPlayer (session), TPacketizer (query), Dataset Lookup, XPD, PLB (olbd)]
- Needed ingredients:
  - Full exploitation of the OLBD network
  - Come&Go functionality for workers
  - …

36 Summary
- Several improvements in PROOF since ROOT05
  - Coordinator functionality
  - Data set manager
  - Resource control
- ALICE is stress testing the system in an LHC environment using a test-CAF at CERN
  - a lot of useful feedback
- Efforts now concentrated on
  - Further consolidation and optimization
  - Scheduling
- PROOF is steadily improving: getting ready for LHC data

37 Credits
- PROOF team
  - M. Ballintijn, B. Bellenot, L. Franco, G.G., J. Iwaszkiewicz, F. Rademakers
- J.F. Grosse-Oetringhaus, A. Peters (ALICE)
- A. Hanushevsky (SLAC)

38 Backup
- See also presentations at previous ROOT workshops and at CHEPxx

39 The ROOT data model: Trees & Selectors
[Diagram: a chain of trees (event, branch, leaf; entries 1, 2, …, n, …, last) is looped over by a selector, which reads only the needed parts of each event.]
- Begin(): create histos, …; define output list
- Process(): preselection; if OK, analysis
- Terminate(): final analysis (fitting, …) on the output list
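
The three-phase pattern can be mocked in plain C++ without ROOT. The sketch below is not the TSelector class: `ToyEvent`, `ToySelector`, `runSelector` and the cut on `pt` are invented for illustration, and the "output list" is reduced to two counters.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event record; a real tree exposes branches and leaves.
struct ToyEvent { double pt; };

// Mock of the Begin / Process / Terminate pattern (not ROOT's TSelector).
class ToySelector {
public:
    // Begin: set up the outputs before the event loop.
    void Begin() { fSelected = 0; fSumPt = 0.0; }

    // Process: called once per event; preselection first, then analysis.
    bool Process(const ToyEvent &ev) {
        if (ev.pt < 1.0) return false;  // preselection (hypothetical cut)
        ++fSelected;                    // analysis: accumulate outputs
        fSumPt += ev.pt;
        return true;
    }

    // Terminate: final analysis on the accumulated outputs (here: the mean).
    double Terminate() const {
        return fSelected ? fSumPt / static_cast<double>(fSelected) : 0.0;
    }

    std::size_t Selected() const { return fSelected; }

private:
    std::size_t fSelected = 0;
    double fSumPt = 0.0;
};

// The event loop that the framework runs on behalf of the user.
double runSelector(ToySelector &sel, const std::vector<ToyEvent> &events) {
    sel.Begin();
    for (const ToyEvent &ev : events) sel.Process(ev);
    return sel.Terminate();
}
```

Because the user code never owns the loop, a framework is free to split the events among several workers and merge the outputs afterwards.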

40 Motivation for PROOF
- Provide an alternative, dynamic, approach to end-user HEP analysis on distributed systems
- Typical HEP analysis is a continuous refinement cycle
  [Cycle: implement algorithm, run over data set, make improvements]
- Data sets are collections of independent events
  - Large (e.g. ALICE ESD+AOD: ~350 TB / year)
  - Spread over many disks and mass storage systems
- Exploiting intrinsic parallelism is the only way to analyze the data in reasonable times

41 The PROOF approach
[Diagram: a PROOF query (data file list, myAna.C) is sent to the MASTER of the PROOF farm; other boxes are the catalog, the storage and the scheduler; feedbacks and final (merged) outputs are returned.]
- farm perceived as extension of local PC
- same syntax as in local session
- more dynamic use of resources
- real time feedback
- automated splitting and merging

42 PROOF design goals
- Transparency
  - Minimal impact on the ROOT user habits
- Scalability
  - Full exploitation of the available resources
- Adaptability
  - Cope transparently with heterogeneous environments
- Preserve real-time interaction and feedback
- Intended for
  - Central Analysis Facilities
  - Departmental workgroup computing facilities (Tier-2’s)
  - Multi-core / multi-disk desktops

43 PROOF dynamic load balancing
- Pull architecture guarantees scalability
- Adapts to variations in performance
[Diagram: Master serving Worker 1 … Worker N]
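
A pull architecture can be illustrated with a small simulation, sketched here under assumptions not taken from the slides: packets all have the same size, each worker has a fixed time per packet, and the function name `simulatePull` is hypothetical. Workers request the next packet as soon as they are free, so faster workers end up with more packets without the master planning anything in advance.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulate a pull-based master. secondsPerPacket[i] is the (hypothetical)
// time worker i needs for one packet. Returns the number of packets that
// each worker ends up processing.
std::vector<int> simulatePull(int nPackets, const std::vector<double> &secondsPerPacket) {
    assert(!secondsPerPacket.empty());
    std::vector<double> freeAt(secondsPerPacket.size(), 0.0);  // when each worker is next idle
    std::vector<int> done(secondsPerPacket.size(), 0);
    for (int p = 0; p < nPackets; ++p) {
        // The next request comes from the worker that becomes idle first
        // (lowest index wins ties).
        std::size_t next = 0;
        for (std::size_t i = 1; i < freeAt.size(); ++i) {
            if (freeAt[i] < freeAt[next]) next = i;
        }
        freeAt[next] += secondsPerPacket[next];
        ++done[next];
    }
    return done;
}
```

In this toy model a worker that is twice as fast receives about twice as many packets, which is the sense in which the scheme adapts to variations in performance.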

44 PROOF intrinsic scalability
- Strictly concurrent user jobs at CAF (100% CPU used)
  - In-memory data
  - Dual Xeon, 2.8 GHz
- CMS analysis
  - 1 master, 80 workers
  - Dual Xeon 3.2 GHz
  - Local data: 1.4 GB / node
  - Non-Blocking GB Ethernet
  - I. Gonzales, Cantabria
[Plot legend: 1 user, 2 users, 4 users, 8 users]

45 PROOF essentials: what can be done?
- Ideally everything made of independent tasks
- Currently available:
  - Processing of trees
  - Processing of independent objects in a file
- Tree processing and drawing functionality complete
LOCAL
    // Create a chain of trees
    root[0] TChain *c = CreateMyChain.C;
    // MySelec is a TSelector
    root[1] c->Process("MySelec.C+");
PROOF
    // Create a chain of trees
    root[0] TChain *c = CreateMyChain.C;
    // Start PROOF and tell the chain to use it
    root[1] TProof::Open("masterURL");
    root[2] c->SetProof()
    // Process goes via PROOF
    root[3] c->Process("MySelec.C+");

46 The PROOF target (Backup). Short analysis using local resources, e.g. end-analysis calculations, visualization. Long analysis jobs with well-defined algorithms (e.g. production of personal trees). Medium-term jobs, e.g. analysis design and development, also using non-local resources. Goals: optimize response for short / medium jobs; perceive medium as short.

47 PROOF: additional remarks (Backup). Intrinsic serial overhead is small: requires a reasonable connection between a (sub-)master and its workers. Hardware considerations: I/O-bound analysis (frequent in HEP) is often limited by hard drive access, so N small disks are much better than 1 big one; a good amount of RAM is needed for efficient data caching. Data access is The Issue: optimize for data locality, when possible; low-latency access to mass storage.

48 PROOF: data access issues (Backup). Low latency in data access is essential for high performance; this is not only a PROOF issue. File opening overhead: minimized using asynchronous open techniques. Data retrieval: caching and pre-fetching of the data segments to be analyzed, recently introduced in ROOT for TTree. Techniques improving network performance (e.g. InfiniBand) or file access (e.g. memory-based file serving, PetaCache) should be evaluated.

49 PROOF: PAR archive files (Backup). Allow the client to add software to be used in the analysis. Simple structure: package/ (source / binary files); package/PROOF-INF/BUILD.sh (how to build the package (makefile)); package/PROOF-INF/SETUP.C (how to enable the package (load, dependencies)). A PAR is a gzip'ed tar-ball of the package tree. Versioning support being added.
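As a rough illustration of the layout listed on this slide, the standalone sketch below checks whether a list of archive member paths contains the two PROOF-INF control files. It uses no ROOT or PROOF API. The function name `looksLikePar` and the sample package name `mypkg` are hypothetical, and treating both control files as mandatory is an assumption made for the example rather than a statement about what PROOF itself requires.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns true if 'members' (paths as they would appear in the unpacked
// tar-ball) contains both control files named on the slide, located under
// the top-level directory 'package'.
bool looksLikePar(const std::vector<std::string>& members, const std::string& package) {
    auto has = [&](const std::string& rel) {
        return std::find(members.begin(), members.end(), package + "/" + rel) != members.end();
    };
    // BUILD.sh: how to build the package; SETUP.C: how to enable it
    return has("PROOF-INF/BUILD.sh") && has("PROOF-INF/SETUP.C");
}
```

A member list such as `mypkg/Event.cxx`, `mypkg/PROOF-INF/BUILD.sh`, `mypkg/PROOF-INF/SETUP.C` passes the check for package `mypkg`; dropping either control file makes it fail.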

50 PROOF essentials: monitoring (Backup). Internal: file access rates, packet latencies, processing time, etc.; basic set of histograms available at tunable frequency; client temporary output objects can also be retrieved; possibility of a detailed tree for further analysis. MonALISA-based: each host reports CPU, memory, swap, network; each worker reports CPU, memory, evt/s, IO vs. network rate; see pcalimonitor.cern.ch:8889. [Figure: network traffic between nodes]

51 PROOF GUI controller (Backup). Allows full on-click control: define a new session; submit a query, execute a command; query editor (create / pick up a chain, choose selectors); online monitoring of feedback histograms; browse folders with results of query; retrieve, delete, archive functionality.

