1 The D0 NIKHEF Farm Kors Bos Ton Damen Willem van Leeuwen Fermilab, May 23 2001.

1 The D0 NIKHEF Farm
Kors Bos, Ton Damen, Willem van Leeuwen
Fermilab, May 23 2001

2 Layout of this talk
D0 Monte Carlo needs
The NIKHEF D0 farm
The data we produce
The SAM data base
A Grid intermezzo
The network
The next steps

3 D0 Monte Carlo needs
D0 trigger rate is 100 Hz, 10^7 seconds/yr → 10^9 events/yr
We want 10% of that to be simulated → 10^8 events/yr
To simulate 1 QCD event takes ~3 minutes (size ~2 Mbyte)
–On an 800 MHz PIII
So 1 cpu can produce ~10^5 events/yr (~200 Gbyte)
–Assuming a 60% overall efficiency
So our 100 cpu farm can produce ~10^7 events/yr (~20 Tbyte)
–And this is only 10% of the goal we set ourselves
–Not counting Nijmegen D0 farm yet
So we need another 900 cpu’s
–UTA (50), Lyon (200), Prague (10), BU (64),
–Nijmegen (50), Lancaster (200), Rio (25),
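
The arithmetic on this slide can be checked with a short sketch. It assumes a full calendar year of running at the quoted 60% efficiency; the function names are illustrative and not part of any D0 software. With these inputs one cpu gives about 1.05 x 10^5 events/yr, and the 10^8 goal needs roughly 950 cpus, which the slide rounds to 1000.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # calendar year, about 3.15e7 s


def events_per_cpu_per_year(minutes_per_event=3.0, efficiency=0.60):
    """Events one cpu can simulate in a calendar year at the given efficiency."""
    return SECONDS_PER_YEAR * efficiency / (minutes_per_event * 60)


def farm_output(n_cpus, mbyte_per_event=2.0):
    """Return (events per year, Tbyte per year) for a farm of n_cpus."""
    events = n_cpus * events_per_cpu_per_year()
    tbyte = events * mbyte_per_event / 1e6  # decimal units: 1 Tbyte = 1e6 Mbyte
    return events, tbyte


# Goal from the slide: 100 Hz trigger rate x 1e7 s/yr of data taking x 10%.
goal_events = 100 * 1e7 * 0.10

per_cpu = events_per_cpu_per_year()
farm_events, farm_tbyte = farm_output(100)
cpus_needed = goal_events / per_cpu

print(f"per cpu:  {per_cpu:.3g} events/yr")
print(f"100 cpus: {farm_events:.3g} events/yr, {farm_tbyte:.1f} Tbyte/yr")
print(f"cpus needed for the goal: {cpus_needed:.0f}")
```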

4 How it looks

5 The NIKHEF D0 Farm
[Diagram. Labels: Farm Server; File Server; 1.5 TB disk cache; Sam station (twice); Farm nodes; switch; NIKHEF network; Surfnet; SARA network; Tape robot @SARA; Meta data @Fermilab. Link speeds shown: 100 Mbit/s, 1 Gbit/s, 155 Mbit/s.]

6 50 Farm nodes (100 cpu’s)
Dell Precision Workstation 220
Dual Pentium III processor, 800 MHz / 256 kB cache each
512 MB PC800 ECC RDRAM
40 GB (7200 rpm) ATA-66 disk drive
No screen
No keyboard
No mouse
Wake-on-LAN functionality

7 The File Server
Elonex EIDE Server
Dual Pentium III 700 MHz
512 MB SDRAM
20 GByte EIDE disk
1.2 Tbyte : 75 GB EIDE disks
2 x Gigabit Netgear GA620 network card

The Farm Server
Dell Precision 620 workstation
Dual Pentium III Xeon 1 GHz
512 MB RDRAM
72.8 GByte SCSI disk
Will also serve as D0 software server for the NIKHEF/D0 people

8 Software on the farm
Boot via the network
Standard Redhat Linux 6.2
Ups/upd on the server
D0 software on the server
FBSNG on the server, daemon on the nodes
SAM on the file server
Used to test new machines …

9 What we run on the farm
Particle generator: Pythia or Isajet
Geant detector simulation: d0gstar
Digitization, adding min.bias: psim
Check the data: mc_analyze
Reconstruction: preco
Analysis: reco_analyze

10 Example: Min.bias
Did a run with 1000 events on all cpu’s
–Took ~2 min./event
–So ~1.5 days for the whole run
–Output file size ~575 MByte
We left those files on the nodes
–This is the reason for having enough local disk space
Intend to repeat that “sometimes”
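
A quick check of the run time and local disk use quoted here. This is a sketch under two assumptions that the slide does not state: each cpu processes its 1000 events back to back, and each of the two cpus in a node leaves one ~575 MByte file on that node's 40 GB disk. The function names are illustrative.

```python
def run_days(n_events, minutes_per_event):
    """Wall-clock days for one cpu to process n_events back to back."""
    return n_events * minutes_per_event / (60 * 24)


def node_disk_fraction(file_mbyte, files_per_node, disk_gbyte):
    """Fraction of a node's local disk taken by the output files (decimal units)."""
    return file_mbyte * files_per_node / (disk_gbyte * 1000)


days = run_days(1000, 2.0)                 # roughly 1.4 days, quoted as ~1.5
fraction = node_disk_fraction(575, 2, 40)  # assumed: one file per cpu, two cpus per node

print(f"{days:.2f} days per run")
print(f"{fraction:.3f} of the local disk per node")
```

On these assumptions a single min.bias run uses only a few percent of a node's disk, so many such runs could stay on the nodes.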

11 Output data
-rw-r--r-- 1 a03 computer 298 Nov 5 19:25 RunJob_farm_qcdJob308161443.params
-rw-r--r-- 1 a03 computer 1583995325 Nov 5 10:35 d0g_mcp03_pmc03.00.01_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000
-rw-r--r-- 1 a03 computer 791 Nov 5 19:25 d0gstar_qcdJob308161443.params
-rw-r--r-- 1 a03 computer 809 Nov 5 19:25 d0sim_qcdJob308161443.params
-rw-r--r-- 1 a03 computer 47505408 Nov 3 16:15 gen_mcp03_pmc03.00.01_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000
-rw-r--r-- 1 a03 computer 1003 Nov 5 19:25 import_d0g_qcdJob308161443.py
-rw-r--r-- 1 a03 computer 912 Nov 5 19:25 import_gen_qcdJob308161443.py
-rw-r--r-- 1 a03 computer 1054 Nov 5 19:26 import_sim_qcdJob308161443.py
-rw-r--r-- 1 a03 computer 752 Nov 5 19:25 isajet_qcdJob308161443.params
-rw-r--r-- 1 a03 computer 636 Nov 5 19:25 samglobal_qcdJob308161443.params
-rw-r--r-- 1 a03 computer 777098777 Nov 5 19:24 sim_mcp03_psim01.02.00_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-poisson-2.5_p1.1_308161443_2000
-rw-r--r-- 1 a03 computer 2132 Nov 5 19:26 summary.conf

12 Output data translated
0.047 Gbyte gen_*
1.5 Gbyte d0g_*
0.7 Gbyte sim_*
import_gen_*.py
import_d0g_*.py
import_sim_*.py
isajet_*.params
RunJob_farm_*.params
d0gstar_*.params
d0sim_*.params
samglobal_*.params
summary.conf
12 files for generator+d0gstar+psim
But of course only 3 big ones
Total ~2 Gbyte
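
The per-stage sizes can be recomputed from the directory listing on the previous slide. The sketch below is only an illustration: it parses the three large entries as `ls -l` lines and groups them by file-name prefix. It assumes the space inside "qcd-incl- PtGt2.0" is a line-wrap artifact and rejoins the name. In decimal units the three files add up to about 2.4 Gbyte, which the slide rounds to ~2 Gbyte.

```python
from collections import defaultdict

# The three large files from the listing. The name split "qcd-incl- PtGt2.0"
# is rejoined here on the assumption that the space is a line-wrap artifact.
LISTING = """\
-rw-r--r-- 1 a03 computer 1583995325 Nov 5 10:35 d0g_mcp03_pmc03.00.01_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000
-rw-r--r-- 1 a03 computer 47505408 Nov 3 16:15 gen_mcp03_pmc03.00.01_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000
-rw-r--r-- 1 a03 computer 777098777 Nov 5 19:24 sim_mcp03_psim01.02.00_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-poisson-2.5_p1.1_308161443_2000
"""


def sizes_by_stage(listing):
    """Sum file sizes in bytes, keyed by the file-name prefix (gen, d0g, sim)."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        # ls -l columns: mode, links, owner, group, size, month, day, time, name
        fields = line.split(None, 8)
        if len(fields) != 9:
            continue  # skip anything that is not a complete ls -l entry
        size, name = int(fields[4]), fields[8]
        totals[name.split("_", 1)[0]] += size
    return dict(totals)


sizes = sizes_by_stage(LISTING)
total_bytes = sum(sizes.values())

for stage in ("gen", "d0g", "sim"):
    print(f"{stage}: {sizes[stage] / 1e9:.3f} Gbyte")
print(f"total: {total_bytes / 1e9:.3f} Gbyte")
```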

13 Data management
[Diagram. Labels: sam; NIKHEF D0 FARM; Fermilab d0mino; SARA TERAS; parameters]
Import_gen.py: generator data
Import_d0g.py: geant data (hits)
Import_sim.py: sim data (digis)
Import_reco.py: reconstructed data

14 Automation
Mc_runjob (modified)
–Prepares MC jobs (gen+sim+reco+anal)
  e.g. 300 events per job/cpu
  Repeat e.g. 500 times
–Submits them into the batch system (FBS)
  They run on the nodes
–Copies the output to the fileserver after completion
  A separate batch job on the fileserver
–Submits them into SAM
  Sam does the file transfers to Fermilab and SARA
Runs for a week …
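
The example numbers above can be turned into a rough plan of one production cycle. This is a sketch only. It does not call mc_runjob, FBS or SAM; the `Job` class and both functions are hypothetical. The three section names (mcc, rcp, sam) are the ones shown on the job-flow slide, and the timing assumes the ~3 minutes per event and 60% efficiency quoted earlier for 100 cpus.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    """One batch job; hypothetical stand-in for what mc_runjob prepares."""
    index: int
    n_events: int
    sections: tuple = ("mcc", "rcp", "sam")  # run MC, copy to file server, store in SAM


def plan(n_jobs=500, events_per_job=300):
    """Build the list of jobs for one production cycle."""
    return [Job(i, events_per_job) for i in range(n_jobs)]


def wall_clock_days(jobs, n_cpus=100, minutes_per_event=3.0, efficiency=0.60):
    """Estimate the elapsed days if jobs run in rounds of n_cpus at a time."""
    rounds = math.ceil(len(jobs) / n_cpus)
    minutes_per_job = max(job.n_events for job in jobs) * minutes_per_event
    return rounds * minutes_per_job / efficiency / (60 * 24)


jobs = plan()
total_events = sum(job.n_events for job in jobs)
days = wall_clock_days(jobs)

print(f"{len(jobs)} jobs, {total_events} events, about {days:.1f} days")
```

With these inputs the cycle comes out at about five days, the same order as the "runs for a week" on the slide. Reconstruction and analysis time is not included in the 3 minutes per event.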

15 [Diagram: job flow. Labels: farm server; file server (1.2 TB); node (40 GB), 50 +; SAM DB; datastore; FNAL; SARA; fbs(mcc), fbs(rcp), fbs(sam); mcc request, mcc input, mcc output. Legend: control, data, metadata. fbs job: 1 mcc, 2 rcp, 3 sam.]

16 This is a grid!
[Diagram. Labels: sam; Fermilab d0mino; SARA TERAS; NIKHEF D0 FARM; in2p3 D0 FARM; KUN D0 FARM]

17 The Grid
Not just D0, but for the LHC expts.
Not just SAM, but for any database
Not just farms, but any cpu resource
Not just SARA, but any mass storage
Not just FBS, but any batch system
Not just HEP, but any science, EO, …

18 European Datagrid Project
3-year project for 10 M€
Manpower to develop grid tools
Cern, in2p3, infn, pparc, esa, fom
Nikhef + sara + knmi
–Farm management
–Mass storage management
–Network management
–Testbed
–HEP & EO applications

19 LHC - Regional Centres
[Diagram. Labels: CERN – Tier 0; Tier 1: FNAL, NIKHEF/SARA, IN2P3, RAL, INFN, possibly KEK, BNL; Tier2: Vrije Univ., Amsterdam, Brussel, Leuven, Utrecht, Nijmegen; SURFnet; Department; Desktop; Atlas, LHCb, Alice]

20 DataGrid: Test bed sites
[Map. Label: Nikhef]

21 The NL-Datagrid Project

22 NL-Datagrid Goals
National test bed for middleware development
–WP4, WP5, WP6, WP7, WP8, WP9
To become an LHC Tier-1 center
–ATLAS, LHCb, Alice
To use it for the existing program
–D0, Antares
To use it for other sciences
–EO, Astronomy, Biology
To use it for tests with other (transatlantic) grids
–D0
–PPDG, GriPhyN

23 NL-Datagrid Testbed Sites
[Map. Labels: Nijmegen Univ. (Atlas); Univ. Utrecht (Alice); Vrije Univ. (LHCb); Univ. Amsterdam (Atlas); CERN; RAL; FNAL; ESA]

24 Dutch Grid topology
[Diagram. Labels: NIKHEF; SARA; Surfnet; KNMI; Utrecht Univ.; Free Univ.; Nijmegen Univ.; FNAL; ESA D-PAF Munchen; CERN Geneva. Experiment tags shown at the sites: D0, Atlas, LHCb, Alice.]

25 End of the Grid intermezzo
Back to the NIKHEF D0 farm and Fermilab: the network

26 Network bandwidth
NIKHEF to SURFnet: 1 Gbit
SURFnet, Amsterdam to Chicago: 622 Mbit
ESnet, Chicago to Fermilab: 155 Mbit ATM
But ftp gives us ~4 Mbit/sec
bbftp gives us ~25 Mbit/sec
bbftp processes in parallel ~45 Mbit/sec
For 2002
NIKHEF to SURFnet: 2.5 Gbit
SURFnet, Amsterdam to Chicago: 622 Mbit
SURFnet, Amsterdam to Chicago: 2.5 Gbit optical
Chicago to Fermilab: ? but more..

27 ftp++
ftp gives you 4 Mb/s to Fermilab
bbftp: increased buffer, # streams
gsiftp: with security layer, increased buffer, ..
grid_ftp: increased buffer, # streams, # sockets, fail-over protection, security
bbftp → ~20 Mb/s
grid_ftp → ~25 Mb/s
Multiple ftp in parallel → factor 2 seen
Should get to > 100 Mbit/sec
Or ~1 Gbyte/minute
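
To put these rates in terms of the data, the sketch below converts a link rate into Gbyte per minute and into the time needed to ship one ~2 Gbyte job output. It uses decimal units (1 Gbyte = 8000 Mbit); the function names are illustrative. At 100 Mbit/sec the conversion gives 0.75 Gbyte per minute, which the slide rounds to ~1 Gbyte/minute.

```python
MBIT_PER_GBYTE = 8000  # decimal units: 1 Gbyte = 1000 Mbyte = 8000 Mbit


def gbyte_per_minute(mbit_per_s):
    """Sustained throughput in Gbyte per minute for a given rate in Mbit/s."""
    return mbit_per_s * 60 / MBIT_PER_GBYTE


def transfer_minutes(gbyte, mbit_per_s):
    """Minutes needed to move gbyte of data at a sustained rate in Mbit/s."""
    return gbyte * MBIT_PER_GBYTE / mbit_per_s / 60


# Rates quoted on the slide, in Mbit/s.
rates = {"ftp": 4, "bbftp": 20, "grid_ftp": 25, "target": 100}

for name, rate in rates.items():
    minutes = transfer_minutes(2.0, rate)  # one ~2 Gbyte job output
    print(f"{name:8s} {rate:4d} Mbit/s: {minutes:5.1f} min per job output, "
          f"{gbyte_per_minute(rate):.2f} Gbyte/min")
```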

28 SURFnet5 access capacity
[Chart: access capacity (axis from 10 Mbit/s to 100 Gbit/s) versus year (1999 to 2002) for SURFnet4 and SURFnet5. Values marked: 155 Mbit/s, 1.0 Gbit/s, 2.5 Gbit/s, 10 Gbit/s, 20 Gbit/s.]

29 TA access capacity
[Diagram. Labels: NL SURFnet; UK SuperJANET4; Fr Renater; It GARR-B; GEANT; Geneva; NewYork; Abilene; ESNET; MREN; STAR-TAP; STAR-LIGHT. Link speeds shown: 622 Mb, 2.5 Gb.]

30 Network load last week
Needed for 100 MC CPU’s: ~10 Mbit/s (200 GB/day)
Available to Chicago: 622 Mbit/s
Available to FNAL: 155 Mbit/s
Needed next year (double cap.): ~25 Mbit/s
Available to Chicago: 2.5 Gbit/s: factor 100 more !!
Available to FNAL: ??
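
A daily data volume can be converted to an average link rate as sketched below, again in decimal units and with illustrative function names. On this conversion 200 GB/day averages to about 18.5 Mbit/s, while the ~20 Tbyte/yr production estimate from the Monte Carlo needs slide averages to about 5 Mbit/s. The ~10 Mbit/s quoted here lies between the two, so it is probably best read as an order-of-magnitude figure.

```python
SECONDS_PER_DAY = 24 * 3600


def average_mbit_per_s(gbyte_per_day):
    """Average rate in Mbit/s needed to move gbyte_per_day (1 Gbyte = 8000 Mbit)."""
    return gbyte_per_day * 8000 / SECONDS_PER_DAY


def link_utilisation(needed_mbit_per_s, link_mbit_per_s):
    """Fraction of a link's capacity taken by the needed rate."""
    return needed_mbit_per_s / link_mbit_per_s


from_daily_volume = average_mbit_per_s(200)          # 200 GB/day as quoted
from_yearly_output = average_mbit_per_s(20000 / 365)  # ~20 Tbyte/yr spread over a year

print(f"200 GB/day   -> {from_daily_volume:.1f} Mbit/s")
print(f"20 Tbyte/yr  -> {from_yearly_output:.1f} Mbit/s")
print(f"10 Mbit/s on the 155 Mbit/s FNAL link: {link_utilisation(10, 155):.3f}")
```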

31 New nodes for D0
In a 2u 19” mounting
Dual 1 GHz PIII
1 Gbyte RAM
40 Gbyte disk
100 Mbit ethernet
Cost ~k$2
Dell machines were ~k$4 (tax incl) → FACTOR 2 cheaper!!
Assembly time 1/hour
1 switch k$2.5 (24 ports)
1 rack k$2 (46u high)
Requested for 2001: k$60
–22 dual cpu’s
–1 switch
–1 19” rack


33 The End
Kors Bos
Fermilab, May 23 2001

