
1  Offline Discussion
M. Moulson, 22 October 2004
Agenda: Datarec status, Reprocessing plans, MC status, MC development plans, Linux, Operational issues, Priorities, AFS/disk space

2  Datarec DBV-20
DC geometry updated: global shift Δy = 550 μm, Δz = 1080 μm
  Implemented in datarec for Run > 28000
  Thickness of DC wall not changed (75 μm)
Modifications to DC timing calibrations: independence from EmC timing calibrations
Modifications to event classification (EvCl):
  New KSTAG algorithm (KS tagged by vertex in DC)
  Bunch spacing by run number in T0_FIND step 1 for ksl: 2.715 ns for 2004 data (also for MC, some 2000 runs)
Boost values:
  Runs not reconstructed without BMOM v.3 in HepDB
  px values from BMOM(3) now used in all EvCl routines for Run > 31690

3  Datarec operations
Runs 28479 (29 Apr) to 32380 (21 Oct, 00:00):
  413 pb⁻¹ to disk with tag OK
  394 pb⁻¹ with tag = 100 (no problems)
  388 pb⁻¹ with full calibrations
  371 pb⁻¹ reconstructed (96%)
  247 pb⁻¹ DSTs (except K⁺K⁻)
fsun03-fsun10 decommissioned 11 Oct
  Necessary for installation of new tape library
  datarec submission moved from fsun03 to fibm35
  DST submission moved from fsun04 to fibm36
150 keV offset in √s discovered!

4  150 keV offset in √s
Discovered while investigating ~100 keV discrepancies between physmon and datarec
+150 keV adjustment to fit value of √s not implemented:
  in physmon
  in datarec, when final BVLAB √s values are written to HepDB
Plan of action:
1. New Bhabha histogram for physmon fit, taken from data
2. Sync datarec fit with physmon
3. Fix BVLAB fit before final 2004 values are computed
4. Update 2001-2002 values in DB records: histogram_history and HepDB BMOM
   2001-2002 values currently from BVLAB scan, need to add 150 keV
   Update of HepDB technically difficult, need a solution

5  Reprocessing plans
Issues of compatibility with MC:
  DC geometry and T0_FIND modifications selected by run number
  DC timing modifications do not impact the MC chain
  Additions to event classification would require new MCDSTs only
In principle possible to use run-number ranges to fix px values for backwards compatibility
Use batch queues? Main advantage: increased stability
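
A minimal sketch of the run-number-keyed compatibility handling mentioned above, assuming hypothetical function and constant names (not the actual KLOE routines); the run boundaries 28000 and 31690 and the 2.715 ns bunch spacing are the values quoted on the datarec slides:

```python
# Minimal sketch (hypothetical names, not actual KLOE code): select run-dependent
# settings so that reprocessing and MC stay consistent with the datarec changes.

RUN_DC_GEOMETRY_UPDATE = 28000   # DC geometry shift applied for Run > 28000 (slide 2)
RUN_BMOM3_PX = 31690             # p_x from BMOM(3) used in EvCl for Run > 31690 (slide 2)
BUNCH_SPACING_NS_2004 = 2.715    # T0_FIND step 1 bunch spacing for 2004 data (slide 2)

def datarec_settings(run: int) -> dict:
    """Return the run-dependent settings a reprocessing job would need."""
    return {
        "dc_geometry_shifted": run > RUN_DC_GEOMETRY_UPDATE,
        "px_from_bmom3": run > RUN_BMOM3_PX,
        # In reality the bunch spacing is also chosen by run number; only the
        # 2004 value is quoted on the slide, so it is used unconditionally here.
        "bunch_spacing_ns": BUNCH_SPACING_NS_2004,
    }

for run in (27950, 30000, 32000):
    print(run, datarec_settings(run))
```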

6  Further datarec modifications
Modification of inner DC wall thickness (75 μm)
  Implement by run number
Cut DC hits with drift times > 2.5 μs
  Suggested by P. de Simone in May to reduce the fraction of split tracks
Others?

7  MC production status

Program                   | Events (10⁶) | Scale factor | LSF time (B80 days) | Size (TB)
e⁺e⁻ → e⁺e⁻(γ)            | 36           | 6            | 120                 | 0.8
e⁺e⁻ → π⁺π⁻γ (ISR only)   | 36           | 6            | 120                 | 0.8
φ → radiative             | 114          | 5            | 480                 | 1.7
ee → eeee, ee → eeππ      | 38           | 0.15         | 220                 | 0.6
φ → all                   | 252          | 0.2          | 1100                | 6.9
φ → all (21 pb⁻¹ scan)    | 29           | 1            | 130                 | 0.7
φ → KS KL                 | 411          | 1            | 2100                | 11.0
φ → K⁺K⁻                  | 611          | 1            | 2620                | 18.0
Total                     | 1527         | -            | 6890                | 40.5
φ → KS KL rare            | 62           | 20*          | 320 (est.)          | 1.7 (est.)

8  Generation of rare KS KL events
Rare channels generated: KS → (rare channels, incl. 3π⁰); KL → (rare channels, incl. a direct-emission (DE) mode)
Peak cross section: 7.5 nb
  Approx 2× the sum of BRs for rare KL channels
In each event, either KS or KL decays to a rare mode (random selection)
Scale factor of 20 applies to KL; for KS, scale factor is ~100
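
A minimal sketch of the event-mixing logic described on this slide: in each generated event exactly one of the two kaons is forced to a rare mode, chosen at random, with the stated enhancement factors (20 for KL, ~100 for KS) recorded so the sample can be weighted back to physical rates. The 50/50 choice and the field names are assumptions for illustration, not the actual generator configuration.

```python
import random

# Sketch (hypothetical configuration): force either the K_S or the K_L to decay
# to a rare mode in each phi -> K_S K_L event, and record the weight needed to
# scale the event back to its physical rate.

KL_SCALE = 20    # rare K_L branching ratios enhanced by a factor 20 (slide value)
KS_SCALE = 100   # rare K_S branching ratios enhanced by ~100 (slide value)

def generate_rare_event(rng: random.Random) -> dict:
    """Pick which kaon decays rarely; the other follows ordinary decays."""
    if rng.random() < 0.5:                 # "random selection" per the slide
        rare, scale = "KS", KS_SCALE
    else:
        rare, scale = "KL", KL_SCALE
    # An event generated with an enhancement S represents 1/S of a physical event.
    return {"rare_kaon": rare, "weight": 1.0 / scale}

rng = random.Random(42)
print([generate_rare_event(rng) for _ in range(5)])
```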

9  MC development plans
Beam pipe geometry for 2004 data (Bloise)
LSB insertion code (Moulson)
Fix … generator (Nguyen, Bini)
Improve MC-data consistency in tracking resolution (Spadaro, others)
  MC has better core resolution and smaller tails than data in the Emiss − pmiss distribution of the ππ background for the KS → πeν analysis
  Improving the agreement would greatly help for precision studies involving signal fits, spectra, etc.
  Need to look systematically at other topologies/variables
  Need more people involved

10  Linux software for KLOE analysis
P. Valente had completed an earlier port based on free software
  VAST F90-to-C preprocessor
  Clunky to build and maintain
M. Matsyuk has completed a KLOE port based on the Intel Fortran compiler for Linux
  Individual, non-commercial license is free
  libkcp code compiles with zero difficulty
Reconsider issues related to maintenance of KLOE software for Linux

11  Linux usage in KLOE analysis
Most users currently process YBOS DSTs into Ntuples on farm machines and transfer the Ntuples to PCs
  AFS does not handle random-access data well, i.e., writing CWNs as analysis output
  Multiple jobs on a single farm node stress the AFS cache
  Farm CPU (somewhat) limited
  AFS disk space perennially at a premium
KLOE software needs are minimal for most analysis jobs
  YBOS to Ntuple: no DC reconstruction, etc.
Analysis jobs on user PCs accessing DSTs via KID and writing Ntuples locally should be quite fast
Continuing interest on the part of remote users

12  KLOE software on Linux: issues
1. Linux machines at LNF for hosting/compilation
   3 of 4 Linux machines in the Computer Center are down, including klinux (mounts /kloe/soft; used by P. Valente for the VAST build)
2. KLOE code distribution
   User PCs do not mount /kloe/soft
   Move /kloe/soft to network-accessible storage?
   Use CVS for distribution? Elegant solution, but users must periodically update…
3. Individual users must install the Intel compiler
4. KID: has been built for Linux in the past
5. Priority/manpower

13  Operational issues
Offline expert training
  1-2 day training course for all experts
  General update
PC backup system
  Commercial tape backup system available to users to back up individual PCs

14  Priorities and deadlines
In order of priority, for discussion:
1. Complete MC production: KS KL rare
2. Reprocessing
3. MC diagnostic work
4. Other MC development work for 2004
5. Linux
Deadlines?

15  Disk resources
Current recalled areas:
  Production      0.7 TB
  User recalls    2.1 TB
  DST cache      12.9 TB (10.2 TB added in April)
2001-2002 totals:
  Total DSTs      7.4 TB
  Total MCDSTs    7.0 TB
2004 DST volume scales with ∫L
3.2 TB added to AFS cell
  Not yet assigned to analysis groups
2.0 TB available but not yet installed
  Reserved for testing new network-accessible storage solutions
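
For scale, a rough illustration of how the 2004 DST volume grows with integrated luminosity; this is my own arithmetic, assuming the 16 GB/pb⁻¹ DST figure and the +2000 pb⁻¹ projection quoted on the later tape-storage slides apply here:

```python
# Rough illustration (assumption): DST volume scales with integrated luminosity,
# using the 16 GB/pb^-1 DST figure and the +2000 pb^-1 projection from slide 26.
dst_gb_per_pb = 16        # GB per pb^-1 of DSTs (2004 estimate, slide 26)
lumi_pb = 2000            # pb^-1, largest projection shown on slide 26
print(dst_gb_per_pb * lumi_pb / 1000, "TB of DSTs")   # -> 32.0 TB
```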

16  Limitations of AFS
Initial problems with random-access files blocking AFS on farm machines resolved
Nevertheless, AFS has some intrinsic limitations:
  Volume sizes at most 100 GB: already pushed past the limit, as the max spec is 8 GB!
  Cache must be much larger than the AFS-directed data volume for all jobs on a farm machine
    Problem characteristic of random-access files (CWNs)
Current cache sizes: 3.5 GB on each farm machine
  More than sufficient for a single job
  Possible problems with 4 big jobs/machine
Enlarging cache sizes requires purchase of more local disk for the farm machines

17  Network storage: future solutions
Possible alternatives to AFS:
1. NFS v4
   Kerberos authentication: use klog as with AFS
   Size of data transfers smaller; expect fewer problems with random-access files
2. Storage Area Network (SAN) filesystem
   Currently under consideration as a Grid solution
   Works only with Fibre Channel (FC) interfaces
   FC-SCSI/IP interface implemented in hardware/software
   Availability expected in 2005
Migration away from AFS probable within ~6 months
2 TB allocated to tests of new network storage solutions
Current AFS system will remain as interim solution

18  Current AFS allocations

Volume  | Space (GB) | Working group
cpwrk   | 195        | Neutral K
kaon    | 170        | Neutral K
kwrk    | 200        | Charged K
phidec  | 400        | Radiative
ecl     | 149        |
mc      | 90         |
recwrk  | 30         |
trg     | 100        |
trk     | 90         |

Per working group: Neutral K 365 GB, Charged K 200 GB, Radiative 400 GB

19  A fair proposal?
Each of the 3 physics WGs gets 1400 GB total
  Total disk space (incl. already installed) divided equally
  Physics WGs similar in size and diversity of analyses
  WGs can make intelligent use of the space, e.g., some degree of Ntuple sharing is already present
  Substantial increases for everyone anyway
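
As a cross-check, my own arithmetic under the assumption that the 1400 GB figure comes from splitting the currently allocated WG space (slide 18) plus the newly added AFS disk (slide 15) three ways:

```python
# Rough cross-check of the 1400 GB/WG figure using numbers from earlier slides.
current_wg_gb = 365 + 200 + 400     # Neutral K + Charged K + Radiative (slide 18)
added_gb = 3200                     # 3.2 TB recently added to the AFS cell (slide 15)
print((current_wg_gb + added_gb) / 3)   # -> ~1388 GB per WG, i.e. ~1400 GB
```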

20  Additional information

21  Offline CPU/disk resources for 2003
Available hardware:
  23 IBM B80 servers: 92 CPUs
  10 Sun E450 servers: 18 B80 CPU-equivalents
  6.5 TB NFS-mounted recall disk cache
  Easy to reallocate between production and analysis
Allocation of resources in 2003:
  64 to 76 CPUs on IBM B80 servers for production
  800 GB of disk cache for I/O staging
  Remainder of resources open to users for analysis

22  Analysis environment for 2003
Production of histograms/Ntuples on the analysis farm:
  4 to 7 IBM B80 servers + 2 Sun E450 servers
  DSTs latent on 5.7 TB recall disk cache
  Output to 2.3 TB AFS cell accessed by user PCs
Analysis example: 440M KS KL events, 1.4 TB of DSTs
  6 days elapsed for 6 simultaneous batch processes
  Output on the order of 10-100 GB
Final-stage analysis on user PC/Linux systems

23  CPU power requirements for 2004
[Chart: B80 CPUs needed to follow acquisition (recon, DST, MC) vs. average luminosity (10³⁰ cm⁻²s⁻¹) and input rate (kHz); the current 76-CPU offline farm is indicated for reference.]

24  CPU/disk upgrades for 2004
Additional servers for the offline farm:
  10 IBM p630 servers: 10 × 4 POWER4+ 1.45 GHz
  Adds more than 80 B80 CPU equivalents to the offline farm
Additional 20 TB of disk space
  To be added to DST cache and AFS cell
More resources already allocated to users
  8 IBM B80 servers now available for analysis
  Can maintain this allocation during 2004 data taking
Ordered; expected to be on-line by January

25  Installed tape storage capacity
IBM 3494 tape library:
  12 Magstar 3590 drives, 14 MB/s read/write
  60 GB/cartridge (upgraded from 40 GB this year)
  5200 cartridges (5400 slots)
  Dual active accessors
  Managed by Tivoli Storage Manager
Maximum capacity: 312 TB (5200 cartridges)
Currently in use: 185 TB

26  Tape storage requirements for 2004

Stored volume by type (GB/pb⁻¹):
Type  | 2002 | 2004 est. (incl. streaming mods)
raw   | 118  | 57
recon | 98   | 49
DST   | 16   | 16
MC    | 43   | 43

[Chart: tape library usage (TB) by type (raw, recon, DST, MC, free) today and projected after +780 pb⁻¹, +1210 pb⁻¹, and +2000 pb⁻¹ of new data.]
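
A rough worked projection of the chart above; this is my own arithmetic, assuming the 2004 per-pb⁻¹ estimates apply to all new data and starting from the 185 TB currently in use quoted on the previous slide:

```python
# Rough projection of tape library usage from the slide numbers: start from the
# 185 TB currently in use (slide 25) and add new data at the 2004 estimate of
# 57 + 49 + 16 + 43 GB/pb^-1 (raw + recon + DST + MC).
used_tb = 185
gb_per_pb = 57 + 49 + 16 + 43          # ~165 GB per pb^-1
for delta_pb in (780, 1210, 2000):
    total = used_tb + delta_pb * gb_per_pb / 1000
    print(f"+{delta_pb} pb^-1: ~{total:.0f} TB (library maximum is 312 TB)")
# Already at +780 pb^-1 the projection (~314 TB) exceeds the current library,
# consistent with "current space sufficient for a few months of new data".
```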

27  Tape storage for 2004
Additional IBM 3494 tape library:
  6 Magstar 3592 drives: 300 GB/cartridge, 40 MB/s
  Initially 1000 cartridges (300 TB)
  Slots for 3600 cartridges (1080 TB)
  Remotely accessed via FC/SAN interface
  Definitive solution for KLOE storage needs
Bando di gara (call for tenders) submitted to the Gazzetta Ufficiale
  Reasonably expect 6 months to delivery
  Current space sufficient for a few months of new data

28  Machine background filter for 2004
Background filter (FILFO) last tuned on 1999-2000 data
  5% inefficiency for φ events, varies with background level
  Mainly traceable to a cut that eliminates degraded Bhabhas
Removal of this cut:
  Reduces inefficiency to 1%
  Increases stream volume 5-10%
  Increases CPU time 10-15%
New downscale policy for the bias-study sample:
  A fraction of events is not subject to the veto and is written to the streams
  Need to produce a bias-study sample for 2001-2002 data
  To be implemented as reprocessing of a data subset with the new downscale policy
  Will allow additional studies of FILFO efficiency and cuts
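
A minimal sketch of a downscale policy of the kind described above: a fixed fraction of events bypasses the FILFO veto and is written out flagged for bias studies. The 1-in-N prescale value and the field names are hypothetical.

```python
# Sketch of a downscale (prescale) policy for FILFO bias studies: every Nth event
# bypasses the background veto and is written to the streams with a bias-study flag.
PRESCALE = 100   # hypothetical: 1 event in 100 kept unfiltered

def stream_decision(event_number: int, passes_filfo: bool) -> tuple[bool, bool]:
    """Return (write_to_stream, bias_study_flag)."""
    if event_number % PRESCALE == 0:
        return True, True          # bypass the veto, keep for bias studies
    return passes_filfo, False     # normal filtering path

print(stream_decision(199, passes_filfo=False))   # (False, False): rejected by FILFO
print(stream_decision(200, passes_filfo=False))   # (True, True): kept for bias study
```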

29  Other offline modifications for 2004
Modifications to physics streaming:
  Bhabha stream: keep only a subset of radiative events
    Reduces Bhabha stream volume by 4×
    Reduces overall stream volume by >40%
  KS KL stream: clean up the choice of tags to retain
    Reduces KS KL stream volume by 35%
  K⁺K⁻ stream: new tag using dE/dx
    Fully incorporate dE/dx code into reconstruction
    Eliminate older tags; will reduce stream volume
Random trigger as source of MC background for 2004:
  20 Hz of random triggers synched with the beam crossing allows background simulation for L up to 2×10³² cm⁻²s⁻¹
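
The ">40%" overall reduction can be checked against the per-stream volumes quoted on the later tape-estimate slide; this is my own cross-check using those 2001-2002 numbers:

```python
# Cross-check of the ">40%" overall reduction using the 2001-2002 stream volumes
# from slide 32: Bhabha 56.0 GB/pb^-1 out of 98 GB/pb^-1 total.
bhabha, total = 56.0, 98.0
saved = bhabha - bhabha / 4          # Bhabha stream reduced by a factor of 4
print(f"{saved / total:.0%} of the overall stream volume")   # ~43%
```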

30  KLOE computing resources
Tape library: IBM 3494, 5400 slots at 60 GB (324 TB), 2 robots, managed by TSM; 12 Magstar E1A drives, 14 MB/s each
Managed disk space: 0.8 TB SSA (offline staging); 6.5 TB (2.2 TB SSA + 3.5 TB FC) latent disk cache
Offline farm: 19 IBM B80 (4 × POWER3 375 MHz) + 8 Sun E450 (4 × UltraSPARC-II 400 MHz)
AFS cell: 2 IBM H70 (4 × RS64-III 340 MHz), 1.7 TB SSA + 0.5 TB FC disk
Online farm: 7 IBM H50 (4 × PPC604e 332 MHz), 1.4 TB SSA disk
Analysis farm: 4 IBM B80 (4 × POWER3 375 MHz) + 2 Sun E450 (4 × UltraSPARC-II 400 MHz)
File servers: 2 IBM H80 (6 × RS64-III 500 MHz)
DB2 server: IBM F50 (4 × PPC604e 166 MHz)
Network: CISCO Catalyst 6000; NFS and AFS served over 100 Mbps / 1 Gbps links

31  2004 CPU estimate: details
Extrapolated from 2002 data with some MC input

2002: ⟨L⟩ = 36 μb⁻¹/s, ⟨T3⟩ = 1560 Hz
  345 Hz φ + Bhabha, 680 Hz unvetoed cosmic rays, 535 Hz background
2004: ⟨L⟩ = 100 μb⁻¹/s (assumed), ⟨T3⟩ = 2175 Hz
  960 Hz φ + Bhabha, 680 Hz unvetoed cosmic rays, 535 Hz background (assumed constant)

From MC: σφ = 3.1 μb (assumed)
  φ + Bhabha trigger: σ = 9.6 μb
  φ + Bhabha FILFO: σ = 8.9 μb
  CPU(φ + Bhabha) = 61 ms avg.

CPU time calculation: 4.25 ms to process any event + 13.6 ms for 60% of background events + 61 ms for 93% of φ + Bhabha events
  2002: 19.6 ms/evt overall (OK)
  2004: 31.3 ms/evt overall (±10%)
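
The per-event averages quoted above follow from the rates given on this slide; the sketch below simply redoes that arithmetic (the 60% and 93% factors and all rates are the slide's own numbers, not new inputs):

```python
# Reproduce the average per-event CPU times quoted on this slide from its inputs:
# 4.25 ms for any event, plus 13.6 ms for 60% of background events,
# plus 61 ms for 93% of phi + Bhabha events.
def avg_cpu_ms(total_hz: float, phibha_hz: float, bkg_hz: float) -> float:
    base = 4.25
    bkg = 13.6 * 0.60 * (bkg_hz / total_hz)
    phibha = 61.0 * 0.93 * (phibha_hz / total_hz)
    return base + bkg + phibha

print(avg_cpu_ms(1560, 345, 535))   # 2002: ~19.6 ms/event
print(avg_cpu_ms(2175, 960, 535))   # 2004: ~31.3 ms/event
```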

32  2004 tape space estimate: details
Raw data:
  2001: 274 GB/pb⁻¹; 2002: 118 GB/pb⁻¹ (highly dependent on luminosity)
  2004: estimate a priori
    Assume 2175 Hz at 2.6 KB/evt (raw event size assumed the same for all events; has varied very little with background over KLOE history)
    Assume ⟨L⟩ = 100 μb⁻¹/s, so 1 pb⁻¹ = 10⁴ s:
      25.0 GB for 9.6M physics events
      31.7 GB for 12.2M background events (1215 Hz of background for 10⁴ s)
      56.7 GB/pb⁻¹ total
Reconstructed data, including effects of streaming changes:

Stream     | 2001-2002 (GB/pb⁻¹) | 2004 (GB/pb⁻¹)
K⁺K⁻       | 11.6                | 11.6
KS KL      | 19.7                | 12.8
π⁺π⁻π⁰     | 3.3                 | 3.3
radiative  | 6.4                 | 6.4
Bhabha     | 56.0                | 14.0
other      | 0.8                 | 0.8
Total      | 98                  | 49

MC: assumes 1.7M evt/pb⁻¹ produced, φ → all (1:5) and φ → KS KL (1:1)
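
The 56.7 GB/pb⁻¹ raw figure follows directly from the assumptions listed above; the sketch below just redoes that arithmetic with the slide's own rates and event size:

```python
# Redo the raw-volume arithmetic from this slide's assumptions: 2175 Hz total
# trigger rate at 2.6 KB/event, <L> = 100 ub^-1/s so 1 pb^-1 = 10^4 s,
# with 960 Hz of physics (phi + Bhabha) and the rest background.
seconds_per_pb = 1.0e4
kb_per_event = 2.6
physics_hz, background_hz = 960, 2175 - 960                           # bkg = 1215 Hz
physics_gb = physics_hz * seconds_per_pb * kb_per_event / 1e6         # ~25.0 GB
background_gb = background_hz * seconds_per_pb * kb_per_event / 1e6   # ~31.6 GB
print(physics_gb, background_gb, physics_gb + background_gb)          # ~56.7 GB/pb^-1
```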

