Grid Computing Status Report
Jeff Templon, PDP Group, NIKHEF
NIKHEF Scientific Advisory Committee, 20 May 2005
HEP Computing Model
- Tier-0: measurement center (CERN)
  - Dedicated computers (L2/L3 trigger farms)
  - Archival of raw data
- Tier-1: data centers
  - Archival of a second copy of the raw data
  - Large-scale computing farms (e.g. reprocessing)
  - Geographically distributed
  - Strong support
- Tier-2: user facilities for data analysis / Monte Carlo production
Worldwide HEP Computing Needs
Amsterdam: Tier-1 for LHC
- Three experiments: LHCb / ATLAS / ALICE
- Overall scale determined by estimating available funding in NL
- Contribution to the experiments scaled by NIKHEF presence (3:2:1)
- Resulting NIKHEF share of total Tier-1 needs:
  - LHCb: 23%
  - ATLAS: 11.5%
  - ALICE: 5.75%
Amsterdam Tier-1 Numbers
Status: GOOD!
- Basic collaboration with SARA in place; attitude adjustment needed (response time)
- Appropriate funding line in the NCF long-term draft plan; just enough, with concerns about 'me-too' proposals (grids are popular)
- Community-building (VL-E project): pull the 'me-too' people into the same infrastructure
Overall Status of LHC Computing
- LCG is a successful service
  - 14,000 CPUs, well-ordered operations, an active community
  - Monte Carlo productions working well (next slide)
- Data Management is a problem
  - Software never converged in EDG
  - May not be converging in EGEE (same team)
  - Risk losing the HEP community on DM
  - Makes community-forming (generic middleware) difficult: "I'll just build my own, this one stinks"
Results of "Data Challenge '04"
- Monte Carlo tasks distributed to computers across the world
- Up to 3,000 simultaneous "jobs" per experiment
- 2.2 million CPU-hours (250 CPU-years) used in one month
- Total data volume > 25 TB
- For LHCb: NIKHEF delivered ~6% of the global total
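The quoted figures can be cross-checked with a little arithmetic. A minimal sketch (the month length of 30 days and the assumption of fully busy CPUs are mine, so this is only an order-of-magnitude check):

```python
# Sanity check of the Data Challenge '04 numbers quoted on this slide.
# Quoted: 2.2 million CPU-hours in one month, described as ~250 years,
# with "up to 3000 simultaneous jobs" per experiment.

HOURS_PER_MONTH = 30 * 24   # ~720 wall-clock hours (assumed 30-day month)
HOURS_PER_YEAR = 365 * 24   # 8760

quoted_cpu_hours = 2.2e6    # "2.2 million CPU-hours"

# Check the "250 years" figure:
cpu_years = quoted_cpu_hours / HOURS_PER_YEAR
print(f"{cpu_years:.0f} CPU-years")  # ~251, matching the quoted 250

# How many continuously busy CPUs over one month does that imply?
implied_cpus = quoted_cpu_hours / HOURS_PER_MONTH
print(f"~{implied_cpus:.0f} CPUs busy full-time")  # ~3056, the same order
# as the quoted "up to 3000 simultaneous jobs"
```

So the three headline numbers (jobs, CPU-hours, CPU-years) are mutually consistent.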
Transport of primary data to the Tier-1s
LCG Service Challenge II: "The Dutch Contribution"
Local Status
- Positioned well in LCG & EGEE
  - Present on the 'blessed' Tier-1 list
  - One of the 'best run' sites
  - One of the first sites (#3 in EDG; compare #4 in the WWW)
  - Member of:
    - Middleware Design Team (US collaboration here too)
    - Project Technical Forum
    - LCG Grid Applications Group (too bad: almost defunct)
    - Middleware Security Group
    - etc.
- D. Groep chairs the world-recognized EUGridPMA
- K. Bos chairs the LHC Grid Deployment Board
Local Status #2
- NIKHEF "grid site"
  - Roughly 300 CPUs / 10 terabytes of storage
  - Several distinct components:
    - LCG / VL-E production
    - LCG pre-production
    - EGEE testing
    - VL-E certification
- Manpower: 8 staff; interviews this week for three more (project funding)
PDP Group Activities
- Middleware (3 FTE)
  - Mostly "security" -- best bang for the buck, plus a local world expert
- Operations (3 FTE)
  - How does one operate a terascale / kilo-computer site?
  - Knowledge transfer to SARA (they have the support mandate)
  - Regular contributions to operational middleware
- Applications (3 FTE)
  - Strong ties to local HEP (ATLAS "Rome" production, LHCb Physics Performance Report, D0 "SAMGrid")
  - Community forming: LOFAR & KNMI; looking for others
Industrial Interest (GANG)
- Participants include NIKHEF, IBM, LogicaCMG, Philips, HPC, UvA, SARA, …
- 16 industrial participants (24 total)