1 Super Scaling PROOF to Very Large Clusters
Maarten Ballintijn, Kris Gulbrandsen, Gunther Roland / MIT
Rene Brun, Fons Rademakers / CERN
Philippe Canal / FNAL
CHEP 2004, September 2004

2 Outline
- PROOF Overview
- Benchmark Package
- Benchmark results
- Other developments
- Future plans

3 Outline
- PROOF Overview
- Benchmark Package
- Benchmark results
- Other developments
- Future plans

4 PROOF - Parallel ROOT Facility
- Interactive analysis of very large sets of ROOT data files on a cluster of computers
- Exploits the inherent parallelism in event data
- Main design goals: transparency, scalability, adaptability
- On the Grid: extends from the local cluster to a wide-area virtual cluster, or a cluster of clusters
- A collaboration between the ROOT group at CERN and the MIT Heavy Ion Group

5 PROOF, continued
- Multi-tier architecture
- Optimized for data locality
- WAN-ready and Grid-compatible
[Diagram: the User connects over the Internet to a Master, which drives the Slaves]

6 PROOF - Architecture
- Data access strategies
  - Local data first; also rootd, rfio, SAN/NAS
- Transparency
  - Input objects are copied from the client
  - Output objects are merged and returned to the client
- Scalability and adaptability
  - Packet size is varied per slave to match the specific workload, slave performance, and dynamic load (see the sketch below)
  - Heterogeneous servers
  - Migration to multi-site configurations
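
The adaptive packet size is the key load-balancing mechanism. Below is a minimal sketch of the idea only, not PROOF's actual packetizer code; the function name and the target-time heuristic are assumptions:

  #include "Rtypes.h"   // Long64_t, Double_t

  // Illustrative heuristic: size the next packet so a slave stays busy
  // for roughly targetTime seconds, based on its measured event rate.
  // Fast slaves get larger packets; slow or loaded slaves get smaller
  // ones, which keeps the cluster balanced near the end of a query.
  Long64_t NextPacketSize(Double_t eventsPerSec, Double_t targetTime,
                          Long64_t minEvents, Long64_t maxEvents)
  {
     Long64_t n = (Long64_t)(eventsPerSec * targetTime);
     if (n < minEvents) n = minEvents;
     if (n > maxEvents) n = maxEvents;
     return n;
  }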

7 Outline
- PROOF Overview
- Benchmark Package
  - Dataset generation
  - Benchmark TSelector
  - Statistics and Event Trace
- Benchmark results
- Other developments
- Future plans

8 Dataset generation
- Uses the ROOT "Event" example class
- A script to create the PAR file is provided
- Data is generated on all nodes that run slaves
- The slaves generate the data files in parallel
- Location, size, and number of files can be specified

% make_event_par.sh
% root
root [0] gROOT->Proof()
root [1] .x make_event_trees.C("/tmp/data",100000,4)
root [2] .L make_tdset.C
root [3] TDSet *d = make_tdset()
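
The make_tdset.C script collects the generated per-slave files into a TDSet, which is what later gets processed in parallel. A minimal sketch of what such a script could look like; the host names, file paths, and the tree name "EventTree" are assumptions:

  // make_tdset.C (sketch) -- the actual script ships with the
  // benchmark package; names and paths here are illustrative.
  TDSet *make_tdset()
  {
     TDSet *d = new TDSet("TTree", "EventTree");   // tree name assumed
     d->Add("root://node1//tmp/data/event_tree_1.root");
     d->Add("root://node2//tmp/data/event_tree_2.root");
     // ... one entry per generated file
     return d;
  }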

9 Benchmark TSelector
- Three selectors are used:
  - EventTree_NoProc.C: empty Process() function, reads no data
  - EventTree_Proc.C: reads all data and fills a histogram (in this test only 35% of the data is actually read)
  - EventTree_ProcOpt.C: reads a fraction of the data (20%) and fills a histogram (see the sketch below)
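
The three selectors differ only in how much of each event they read. A sketch of the optimized pattern, reading a single branch instead of the whole event; the branch, member, and histogram names are assumptions, not the benchmark's actual code:

  // In Init():       fChain->SetBranchAddress("fNtrack", &fNtrack,
  //                                           &fNtrackBranch);
  // In SlaveBegin(): fHist = new TH1F("hNtrack","Tracks",100,0,600);
  //                  fOutput->Add(fHist);
  Bool_t EventTree_ProcOpt::Process(Long64_t entry)
  {
     fNtrackBranch->GetEntry(entry);   // read only this branch
     fHist->Fill(fNtrack);             // filled by the branch read
     return kTRUE;
  }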

10 Statistics and Event Trace
- Global histograms for monitoring on the master:
  - Number of packets, number of events, processing time, and get-packet latency, each per slave
  - Can be viewed using the standard feedback mechanism (see the example below)
- Trace tree: a detailed log of events during the query
  - Master only, or master and slaves
  - A detailed list of the recorded events follows
- Implemented using standard ROOT classes and PROOF facilities
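
Viewing feedback during a query from the ROOT prompt might look like the following; TProof::AddFeedback() and TDrawFeedback are the standard facilities referred to above, but the feedback histogram name here is an assumption:

root [0] gROOT->Proof("master.domain")
root [1] gProof->AddFeedback("PROOF_PacketsHist")   // name assumed
root [2] TDrawFeedback fb(gProof)                   // draws as data arrives
root [3] d->Process("EventTree_Proc.C")             // d: the benchmark TDSet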

11 Events recorded in Trace
- Each event carries a timestamp and the identity of the recording slave or master
- Begin and end of query
- Begin and end of file
- Packet details and processing time
- File open statistics (slaves)
- File read statistics (slaves)
- New events are easy to add
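
After a query the trace can be inspected like any other ROOT tree. A sketch, assuming the trace is written to a file; the file, tree, and branch names here are placeholders, the benchmark package defines the real ones:

root [0] TFile *f = TFile::Open("proof_trace.root")   // file name assumed
root [1] TTree *t = (TTree*) f->Get("TraceTree")      // tree name assumed
root [2] t->Print()              // list the recorded event fields
root [3] t->Draw("fProcTime")    // e.g. per-packet processing time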

12 Outline
- PROOF Overview
- Benchmark Package
- Benchmark results
- Other developments
- Future plans

13 Benchmark Results
- CDF cluster at Fermilab: 160 nodes, initial tests
- Pharm, the Phobos private cluster, 24 nodes:
  - 6 dual 730 MHz P3
  - 6 dual 930 MHz P3
  - 12 dual 1.8 GHz P4
- Dataset: one file per slave, 60000 events and ~100 MB per file
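
For scale: 100 MB holding 60000 events works out to roughly 1.7 kB per event, and with one file per slave the total volume read grows linearly with the number of slaves.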

14 Results on Pharm

15 Results on Pharm, continued

16 Local and remote File open
[Plots: file open times, local vs. remote]

17 Slave I/O Performance

18 Benchmark Results
- Phobos-RCF, the central facility at BNL, 370 nodes total:
  - 75 dual 3.05 GHz P4, IDE disks
  - 99 dual 2.4 GHz P4, IDE disks
  - 18 dual 1.4 GHz P3, IDE disks
- Dataset: one file per slave, 60000 events and ~100 MB per file

19 PHOBOS RCF LAN Layout

20 Results on Phobos-RCF

21 Looking at the problem

22 Processing time distributions

23 Processing time, detailed

24 Request packet from Master

25 Benchmark Conclusions
- The benchmark and measurement facility has proven to be a very useful tool
- Don't use NFS-based home directories
- LAN topology is important
- LAN speed is important
- More testing is required to pinpoint the sporadic long latencies

26 Outline
- PROOF Overview
- Benchmark Package
- Benchmark results
- Other developments
- Future plans

27 Other developments
- Packetizer fixes and a new development version
- Parallel PROOF startup
- TDrawFeedback
- TParameter utility class (see the sketch below)
- TCondor improvements
- Authentication improvements
- Introduction of Long64_t
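
The TParameter utility class wraps a named scalar so it can travel through the PROOF input list from the client to the slaves. A minimal sketch; the parameter name and the member variable are illustrative:

  #include "TParameter.h"

  // On the client, before submitting the query:
  gProof->AddInput(new TParameter<Long64_t>("MaxEvents", 50000));

  // In the selector's SlaveBegin(), on each slave:
  TParameter<Long64_t> *p =
     (TParameter<Long64_t>*) fInput->FindObject("MaxEvents");
  if (p) fMaxEvents = p->GetVal();   // fMaxEvents: illustrative member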

28 Outline
- PROOF Overview
- Benchmark Package
- Benchmark results
- Other developments
- Future plans

29 Future plans
- Understand and solve the LAN latency problem
- In prototype stage:
  - TProof::Draw()
  - Multi-level master configuration
- Documentation:
  - HowTo
  - Benchmarking
- PEAC, the PROOF Grid scheduler

30 The End
Questions?

31 Parallel Script Execution
[Diagram: a local PC running ROOT connects to a remote PROOF cluster; the master (configured by proof.conf: slave node1, slave node2, slave node3, slave node4) drives slave servers node1-node4, each serving its local *.root files; stdout and result objects return to the client]

$ root
root [0] tree->Process("ana.C")     // local, sequential
root [1] gROOT->Proof("remote")     // connect to the PROOF master
root [2] dset->Process("ana.C")     // same analysis, now in parallel

32 Simplified message flow
[Sequence diagram: Client, Master, Slave(s)]
- Client -> Master: SendFile
- Client -> Master: GetEntries
- Client -> Master: Process(dset,sel,inp,num,first)
- Master -> Slave(s): Process(dset,sel,inp,num,first)
- Slave(s) -> Master: GetPacket (repeated for each packet)
- Master -> Client: ReturnResults(out,log)

33 TSelector control flow
[Diagram: TProof driving the TSelector on client and slaves]
- Begin() on the client
- Input objects are sent to the slaves
- SlaveBegin() on each slave
- Process() on each slave, once per event
- SlaveTerminate() on each slave
- Output objects are returned and merged
- Terminate() on the client
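
The same control flow as a code skeleton; a minimal sketch of the standard TSelector interface, with the class name chosen for illustration:

  #include "TSelector.h"

  class AnaSel : public TSelector {
  public:
     void   Begin(TTree *)      { /* client: before the query starts */ }
     void   SlaveBegin(TTree *) { /* slave: create histograms and add
                                     them to fOutput for merging */ }
     Bool_t Process(Long64_t entry) {
        /* slave: called once per event of every packet */
        return kTRUE;
     }
     void   SlaveTerminate()    { /* slave: after its last packet */ }
     void   Terminate()         { /* client: merged fOutput available */ }
  };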

34 PEAC System Overview

35 Active Files during Query

36 Pharm Slave I/O

38 Active Files during Query

