Status of LHCb-INFN Computing

1 Status of LHCb-INFN Computing
CSN1, Catania, September 18, 2002 Domenico Galli, Bologna

2 LHCb Computing Constraints
Urgent need to produce and analyze a large number of MC data sets in a short time, for the LHCb-light detector design, the trigger design and the TDRs.
The hardware and software configuration must be optimized to minimize dead time and system administration effort.

3 LHCb Farm Architecture (I)
Article in press in Computer Physics Communications: “A Beowulf-class computing cluster for the Monte Carlo production of the LHCb experiment”.
Disk-less computing nodes, with the operating systems centralized on a file server (Operating System Server).
Very flexible configuration: nodes can be added to or removed from the system without any local installation. Useful for computing resources shared among different experiments.
Extremely stable system: no side effects at all in more than 1 year of operation.
System administration duties minimized.

4 LHCb Farm Architecture (II)
Security:
Private IP addresses and a Virtual LAN are used, giving a high level of isolation from the Internet.
External accesses (AFS servers, bookkeeping database, CASTOR library at CERN) go through Network Address Translation on a Gateway node.
Potential “Single Points of Failure” of the system are equipped with redundant disk configurations: RAID-5 (2 NAS) and RAID-1 (Gateway and Operating System Server).

5 LHCb Farm Architecture (III)
[Diagram: farm block layout. Hardware elements shown: NAS (1 TB, RAID 5); Gateway; Master Server; Control Node; Processing Node 1 ... Processing Node n (disk-less nodes); Fast Ethernet Switch; Power Distributor with an Ethernet link for power control; uplink; public VLAN and private VLAN. Software and services shown: Red Hat 7.2; CERN Red Hat 6.1; DNS; NAT (IP masquerading); PBS Master; PBS Slave; MC control server; farm monitoring; home directories; PXE remote boot, DHCP, NIS; OS file-systems; mirrored disks (RAID 1).]

6 Rack (1U dual-processor MB)
[Photo of the farm. Labels: Fast Ethernet switch; rack (1U dual-processor motherboards); NAS, 1 TB; Ethernet-controlled power distributor (32 channels).]

7 Data Storage
Files containing reconstructed events (OODST-ROOT format) are transferred to CERN using bbftp and automatically stored on the CASTOR tape library.
Data transfer from CNAF to CERN reaches a maximum throughput of 70 Mb/s (on a 100 Mb/s link), to be compared with ~15 Mb/s using ftp.
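To put the two transfer rates in perspective, the following minimal sketch estimates how long a transfer would take at each rate. The 100 GB data-set size is a hypothetical figure chosen only for illustration, and "Mb/s" is assumed to mean megabits per second (10^6 bit/s).

```python
# Rough transfer-time estimate for the two rates quoted on the slide.
# Assumptions (not from the slides): 1 Mb/s = 1e6 bit/s, 1 GB = 1e9 bytes,
# and a hypothetical 100 GB data set.

def transfer_hours(size_gb: float, rate_mbit_s: float) -> float:
    """Return the hours needed to move size_gb gigabytes at rate_mbit_s."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (rate_mbit_s * 1e6)
    return seconds / 3600.0

BBFTP_RATE = 70.0  # Mb/s, maximum throughput quoted for bbftp
FTP_RATE = 15.0    # Mb/s, approximate throughput quoted for ftp
SIZE_GB = 100.0    # hypothetical data-set size

bbftp_h = transfer_hours(SIZE_GB, BBFTP_RATE)
ftp_h = transfer_hours(SIZE_GB, FTP_RATE)
speedup = ftp_h / bbftp_h  # same as the ratio of the two rates

if __name__ == "__main__":
    print(f"bbftp: {bbftp_h:.1f} h, ftp: {ftp_h:.1f} h, ratio: {speedup:.1f}x")
```

Whatever the data-set size, the ratio between the two transfer times is simply 70/15, i.e. bbftp is between four and five times faster than ftp on this link.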

8 2002 Monte Carlo Production
Target: production of large event statistics for the design of the LHCb-light detector and of the trigger system (trigger TDR).
Software: the simulation (FORTRAN) and reconstruction (C++) code to be used in the production was supplied in July.
LHCb Data Challenge ongoing (August-September).
Participating computing centers: CERN, INFN-CNAF, Liverpool, IN2P3-Lyon, NIKHEF, RAL, Bristol, Cambridge, Oxford, ScotGrid (Glasgow & Edinburgh).

9 Status of Summer LHCb-Italy Monte Carlo Production (Data Challenge)
Events produced in Bologna (August 1 to September 12): 1,053,500
Bd0 -> pi+ pi-: 79,000
Bd0 -> D*-(D0_bar(K+ pi-) pi-) pi+: 19,000
Bd0 -> K+ pi-: 55,500
Bs0 -> K- pi+: 8,000
Bs0 -> K+ K-
Bs0 -> J/psi(mu+ mu-) eta(gamma gamma)
Bd0 -> phi(K+ K-) Ks0(pi+ pi-)
Bs0 -> mu+ mu-
Bd0 -> D+(K- pi+ pi+) D-(K+ pi- pi-)
Bs0 -> Ds-(K+ K- pi-) K+
Bs0 -> J/psi(mu+ mu-) phi(K+ K-)
Bs0 -> J/psi(e+ e-) phi(K+ K-)
Minimum bias: 47,500
c c_bar -> inclusive (at least one c hadron in 400 mrad): 275,500
b b_bar -> inclusive (at least one b hadron in 400 mrad): 505,000

10 Distribution of Produced Events Among Production Centers (August 1 to September 12)
The other centres mentioned above started late with respect to the Data Challenge start date.

11 Usage of the CNAF Tier-1 Computing Resources
Computing, control and service nodes: 130 PIII CPUs (clock speeds from 866 MHz to 1.4 GHz).
Disk storage servers: 1 TB NAS (14 x 80 GB IDE disks + hot spare in RAID 5); 1 TB NAS (7 x 170 GB SCSI disks + hot spare in RAID 5).
All the equipment is working at a very high duty cycle.
[Plot: CPU load.]

12 Plan for Analysis Activities
The analysis of the data produced during the Data Challenge is foreseen for the autumn.
The development environment of the analysis code (DaVinci C++ code) has been completely ported to Bologna and has been in use on a mini-farm for 2 months.
The analysis mini-farm needs to be extended to a greater number of nodes to meet the needs of the Italian LHCb collaboration.
Data produced in Bologna are kept on Bologna disks; data produced in the other centers need to be transferred to Bologna on user demand with an automatic procedure.
Analysis jobs (on ~100 CPUs) need an I/O throughput (~100 MB/s) greater than that supplied by a NAS (~10 MB/s).

13 High Performance I/O System (I)
An I/O parallelization system based on a parallel file system, PVFS (Parallel Virtual File System), was successfully tested.
Data files are striped across the local disks of several I/O servers (ION).
Scalable system (throughput ~ 100 Mbit/s x n_ION).
[Diagram: client nodes CN 1, CN 2, ..., CN m; I/O nodes ION 1, ION 2, ..., ION n; management node MGR; all connected through the network.]
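The striping idea can be illustrated with a minimal round-robin sketch. This is not PVFS code: the 64 KiB stripe size and the function names are assumptions made only for illustration.

```python
# Round-robin file striping: a simplified model of how a parallel file system
# spreads one file over several I/O nodes. Not PVFS code; STRIPE_SIZE and the
# helper names are hypothetical.

STRIPE_SIZE = 64 * 1024  # bytes per stripe unit (assumed value)

def locate(offset: int, n_ion: int, stripe_size: int = STRIPE_SIZE):
    """Map a byte offset in the file to (I/O node index, offset on that node)."""
    stripe = offset // stripe_size      # global stripe number
    ion = stripe % n_ion                # node holding this stripe
    local_stripe = stripe // n_ion      # stripes stored earlier on that node
    return ion, local_stripe * stripe_size + offset % stripe_size

def bytes_per_ion(file_size: int, n_ion: int, stripe_size: int = STRIPE_SIZE):
    """Return how many bytes of a file of file_size bytes land on each node."""
    counts = [0] * n_ion
    full, rest = divmod(file_size, stripe_size)
    for stripe in range(full):
        counts[stripe % n_ion] += stripe_size
    if rest:
        counts[full % n_ion] += rest    # trailing partial stripe
    return counts

if __name__ == "__main__":
    # A 120 MB file spread over 10 I/O nodes.
    print(bytes_per_ion(120 * 10**6, 10))
    print(locate(200_000, 10))
```

Because consecutive stripes sit on different nodes, a sequential read is served by all I/O nodes in turn, which is the reason the aggregate throughput is expected to grow with n_ION.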

14 High Performance I/O System (II)
With 10 ION we reached an aggregate I/O throughput of 110 MB/s (30 client nodes reading data).
To be compared with: 20-40 MB/s (local disk); 10 MB/s (100Base-T NAS); 50 MB/s (1000Base-T NAS).
With a single file hierarchy.
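As a rough cross-check, the scaling estimate quoted for the parallel file system (~100 Mbit/s per I/O node) can be compared with the measured aggregate figure. The sketch below assumes 1 MB/s = 8 Mbit/s and ignores protocol overhead; the variable names are illustrative.

```python
# Compare the ideal aggregate throughput of n I/O nodes with the measurement.
# Assumption: 1 MB/s = 8 Mbit/s, no protocol overhead.

PER_ION_MBIT_S = 100.0  # per-node estimate quoted in the talk
N_ION = 10              # number of I/O nodes in the test
MEASURED_MB_S = 110.0   # measured aggregate throughput
NAS_100BT_MB_S = 10.0   # quoted throughput of a 100Base-T NAS

def ideal_aggregate_mb_s(n_ion: int, per_ion_mbit_s: float = PER_ION_MBIT_S) -> float:
    """Ideal aggregate throughput in MB/s if every node delivers its full rate."""
    return n_ion * per_ion_mbit_s / 8.0

ideal = ideal_aggregate_mb_s(N_ION)
fraction_of_ideal = MEASURED_MB_S / ideal
gain_over_nas = MEASURED_MB_S / NAS_100BT_MB_S

if __name__ == "__main__":
    print(f"ideal: {ideal:.0f} MB/s, measured/ideal: {fraction_of_ideal:.0%}, "
          f"gain over 100Base-T NAS: {gain_over_nas:.0f}x")
```

Under these assumptions the measured 110 MB/s is close to the ideal 125 MB/s for 10 nodes, and about an order of magnitude above the 100Base-T NAS.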

15 Test of a PVFS-Based Analysis Facility (I)
Test performed using the OO DaVinci algorithm for B -> pi+ pi- selection.
Analyzed 44.5k signal events and 484k b b_bar inclusive events in 25 minutes (to be compared with 2 days on a single PC).
Performed entirely on the Bologna farm, parallelizing the analysis algorithm over 106 CPUs (80 x 1.4 GHz PIII CPUs + 26 x 1 GHz PIII CPUs).
DaVinci processes read OODST from PVFS.
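The quoted times imply the following speed-up and event rate. This is only arithmetic on the rounded figures from the slide; the "2 days" figure is approximate and the speed of the reference PC is not stated, so the per-CPU ratio should not be read as a precise parallel efficiency.

```python
# Speed-up and event rate implied by the figures quoted on the slide.

SINGLE_PC_MINUTES = 2 * 24 * 60   # "2 days on a single PC" (rounded figure)
FARM_MINUTES = 25                 # wall-clock time on the farm
N_CPUS = 106                      # CPUs used in the test
SIGNAL_EVENTS = 44_500
BB_INCLUSIVE_EVENTS = 484_000

speedup = SINGLE_PC_MINUTES / FARM_MINUTES
speedup_per_cpu = speedup / N_CPUS
total_events = SIGNAL_EVENTS + BB_INCLUSIVE_EVENTS
events_per_second = total_events / (FARM_MINUTES * 60)

if __name__ == "__main__":
    print(f"speed-up: {speedup:.1f}x ({speedup_per_cpu:.2f} per CPU)")
    print(f"rate: {events_per_second:.0f} events/s over {total_events} events")
```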

16 Test of a PVFS-Based Analysis Facility (II)
[Diagram: OODST data on I/O nodes ION 1, ION 2, ..., ION 10 (PVFS); compute nodes CN 1, CN 2, ..., CN 106; management node MGR; login node; n-tuple (Nt-ple) output.]

17 Test of a PVFS-Based Analysis Facility (III)
106 DaVinci processes reading from PVFS. 968 files (500 OODST events each) x 120 MB. 116 GB read and processed in 1500 s.
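The sustained read rate implied by these numbers can be worked out directly. The sketch assumes decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
# Data volume and sustained throughput implied by the figures on the slide.
# Assumption: decimal units, 1 GB = 1000 MB.

N_FILES = 968
FILE_SIZE_MB = 120
EVENTS_PER_FILE = 500
ELAPSED_S = 1500
N_PROCESSES = 106

total_mb = N_FILES * FILE_SIZE_MB          # total data volume in MB
total_gb = total_mb / 1000.0
aggregate_mb_s = total_mb / ELAPSED_S      # sustained aggregate read rate
per_process_mb_s = aggregate_mb_s / N_PROCESSES
total_events = N_FILES * EVENTS_PER_FILE

if __name__ == "__main__":
    print(f"{total_gb:.0f} GB, {total_events} events")
    print(f"aggregate: {aggregate_mb_s:.1f} MB/s, "
          f"per process: {per_process_mb_s:.2f} MB/s")
```

The result, roughly 77 MB/s sustained while the data are also being processed, is well above the ~10 MB/s quoted for a 100Base-T NAS.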

18 B+–: Pion Momentum Resolution
p / p for identified pions coming from B0 |p / p| vs p for identified pions coming from B0 p / p FWMH  0.01 p / p p [GeV/c] Status of LHCb-INFN Computing, 18 Domenico Galli

19 B0 Mass Plots
[Plots: pi+ pi- invariant mass (MeV/c2). Panels: all pi+ pi- pairs with no cuts; all pi+ pi- pairs with all cuts; all pi+ pi- pairs with all cuts (magnified). Annotations: 3425 events; 105 events; FWHM ≈ 66 MeV.]
Cuts: Pt > 800 MeV/c; d/σ_d > 1.6; l_B0 > 1 mm.

20 b b_bar Inclusive Background Mass Plot
Total number of events: 484k.
Only events with a single interaction are taken into account at the moment: ~240k.
213 events in the mass region after all cuts; 32/213 are ghosts.
[Plot: all pi+ pi- pairs with all cuts; mass axis in GeV/c2.]

21 Signal Efficiency and Mass Plots for Tighter Cuts
Final efficiency (tighter cuts, zero b b_bar inclusive background in 240k events) = 871/22271 = 4%.
Rejection against b b_bar inclusive background > 1 - 1/… = …%.
871 signal events in the mass region.
16 BG events from the signal sample in the mass region (all ghosts).
Cuts: Pt > 2.8 GeV/c; d/σ_d > 2.5; l_B0 > 0.6 mm.
[Plots: mass distributions, axes in GeV/c2.]
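The efficiency figure can be checked with a few lines of arithmetic. The rejection bound in the sketch is an assumption: the denominator is missing in the transcript, and the code simply supposes that it is the ~240k single-interaction background events, none of which survive the tighter cuts.

```python
# Selection efficiency from the slide, plus a rejection lower bound computed
# under an assumption (the denominator is not legible in the transcript).

SIGNAL_SELECTED = 871        # signal events in the mass region after cuts
SIGNAL_TOTAL = 22_271        # denominator quoted on the slide
BACKGROUND_EVENTS = 240_000  # ASSUMED denominator for the rejection bound

def efficiency(selected: int, total: int) -> float:
    """Fraction of events that pass the selection."""
    return selected / total

def rejection_lower_bound(n_background: int) -> float:
    """With zero surviving events out of n, fewer than 1 in n survives,
    so the rejection is greater than 1 - 1/n."""
    return 1.0 - 1.0 / n_background

eff = efficiency(SIGNAL_SELECTED, SIGNAL_TOTAL)
rejection = rejection_lower_bound(BACKGROUND_EVENTS)

if __name__ == "__main__":
    print(f"efficiency: {eff:.1%}")
    print(f"rejection > {rejection:.5%} (assuming {BACKGROUND_EVENTS} events)")
```

The computed efficiency is about 3.9%, consistent with the 4% quoted on the slide.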

22 Conclusions
The MC production farm has been running stably (with increasing resources) for more than 1 year.
INFN Tier-1 is the second most active LHCb MC production centre (after CERN).
The collaboration with the CNAF staff is excellent.
We are not yet using GRID tools in production, but we plan to move to them as soon as the detector design is stable.
An analysis mini-farm for interactive work has been running for more than 1 month, and we plan to extend the number of nodes depending on the availability of resources.
The architecture of a massive analysis system has already been tested using a parallel file system and 106 CPUs.
We need at least to keep the present computing power at CNAF to supply the analysis facility to the Italian LHCb collaboration; more resources, to keep production running in parallel with massive analysis activities, would be welcome.

