T. Bowcock, University of Liverpool: LHCb-GRID, Sept 11 2000. Successes, issues, improving the system, comments.

1 T. Bowcock Liverpool Sept 00

2 University of Liverpool
Sept 11 2000, LHCb-GRID, T. Bowcock
Successes
Issues
Improving the system
Comments

3 Architecture
[Diagram labels: Master, External Ethernet, MAP Slaves, Hub (Switch - 00), Hub (Switch - 00), 100BaseT]

4 Successes
Many events generated for
  –LHCb
  –H1
  –DELPHI
  –ATLAS
  –(CDF)
Software
  –Fault tolerant on transfers
  –Can distribute MCs, update system etc.
Prototype a big system

5 Issues
Ease of use
  –Expert system
  –Robustness
      OS (RH5.1 and 6.2)
      Storage
Up-time since Jan ’00 only ~90%
  –Development + recovery
  –Typically run with 95% of CPUs in MAP

6 Issues
COMPASS nodes
  –Do not appear as a single volume
      10 disks: /scsi1 /scsi2 …..
User unfriendly? Yup.
Do we care?
  –Who is our user?

7 Bigger Issues
Hardware failures
  –NIC (20%): low-cost but annoying
      Replacement fast.
  –Disks
  –Power supplies
      Problem with sleeve-bearing fans, leading to a higher failure rate.
      Higher-quality fans will be installed (Sept ’00).
      Expected down-time of about 1 week.
  –Hubs replaced by switches

8 Improvements needed
Multi-Redundant Masters
  –Plan 6-fold redundancy for security and simplicity of operation (Oct ’00)
MAP-FCS
  –Need to bullet-proof it and make it ‘idiot proof’
Interface to User
  –Grid or remote login needs development
Storage & Transfer
  –How do disks appear to the outside world?
Data-Analysis Capability

9 Software Improvements
Sept ’00
  –Complete upgrade to RH6.2
  –Complete upgrade of MAP-FCS
Oct-Dec ’00
  –Reduce system vulnerability
      Large improvement from above
  –GLOBUS interface?

10 Expanding MAP’s role
MAP conceived of as an MC engine
  –Throwaway MC (or ntuples)
Keeping MC involves moving data to COMPASS
  –Done
How do we (re-)analyse larger chunks of data?
  –Assuming we want to do this….

11 Challenge
Example (a)
  –LHCb
      Want to produce 10^6 events. Reprocess them once or twice and then analyse them for a while.
      Optimistically, 10^6 events is about 1 TByte of space.
  –Solution
      Increase disk store on each MAP node, and store data there. Analysis/reprocessing possible.
  –But implies we now need resource management
      Disks can fill up. Who gets to play?!
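The estimate on this slide implies an average of roughly 1 MByte per event. A minimal sketch of that arithmetic follows. It assumes decimal units (1 TByte = 10^12 bytes), and the `copies_kept` parameter is a hypothetical knob, not something from the slides: it shows how the space grows if reprocessed versions are kept alongside the original rather than overwriting it.

```python
# Back-of-envelope storage estimate for the LHCb example.
# Assumption: decimal units (1 TByte = 10**12 bytes).
MB = 10**6
TB = 10**12


def event_size_bytes(sample_bytes: float, n_events: int) -> float:
    """Average size of one event, given the size of the whole sample."""
    return sample_bytes / n_events


def storage_needed_bytes(n_events: int, event_bytes: float,
                         copies_kept: int = 1) -> float:
    """Disk space needed if `copies_kept` versions of each event stay on disk."""
    return n_events * event_bytes * copies_kept


if __name__ == "__main__":
    n_events = 10**6
    size = event_size_bytes(1 * TB, n_events)
    print(f"per event: {size / MB:.1f} MByte")
    # copies_kept is hypothetical: 1 = overwrite on reprocessing,
    # 3 = keep the original plus two reprocessed versions.
    for copies in (1, 2, 3):
        total = storage_needed_bytes(n_events, size, copies)
        print(f"versions kept: {copies} -> {total / TB:.1f} TByte")
```

If every reprocessing pass is retained, the sample takes a correspondingly larger share of the available disk, which is one reason resource management becomes necessary.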

12 Challenge
Example (b)
  –CDF
      Want to import data from FNAL. Want to analyse at Liverpool.
  –Solution (none yet!)
      Importing data tricky. Rely on transfer from tape to stage (e.g. COMPASS).
  –At 5 MBytes/s (Fast Ethernet), 1 TByte would take 200000 s (about 2 days) to get onto the nodes!
      Using 6 COMPASS nodes: about 8 hrs
      Installing Gbit: expensive, reduces this to about 1 hr
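The transfer times quoted here can be checked with a short calculation. This is a sketch under stated assumptions: decimal units (1 TByte = 10^12 bytes), perfect scaling across parallel streams, and, for the Gbit case, an effective rate of 50 MBytes/s per node (ten times the 5 MBytes/s achieved on Fast Ethernet). That last figure is a guess chosen to illustrate the scaling, not a number from the slides.

```python
# Transfer-time estimate for moving 1 TByte onto the farm.
# Assumptions: decimal units, ideal scaling over parallel streams.
MB = 10**6
TB = 10**12


def transfer_time_s(n_bytes: float, rate_bytes_per_s: float,
                    parallel_streams: int = 1) -> float:
    """Seconds needed to move n_bytes over identical parallel streams."""
    return n_bytes / (rate_bytes_per_s * parallel_streams)


if __name__ == "__main__":
    single = transfer_time_s(1 * TB, 5 * MB)
    print(f"1 stream at 5 MB/s:   {single:.0f} s = {single / 86400:.1f} days")

    six = transfer_time_s(1 * TB, 5 * MB, parallel_streams=6)
    print(f"6 streams at 5 MB/s:  {six / 3600:.1f} h")

    # Hypothetical Gbit rate: 10x the Fast Ethernet figure, per node.
    gbit = transfer_time_s(1 * TB, 50 * MB, parallel_streams=6)
    print(f"6 streams at 50 MB/s: {gbit / 3600:.1f} h")
```

With these inputs the single-stream case gives 200000 s, a little over two days. Six streams give roughly nine hours, in line with the slide's rounded "about 8 hrs", and the assumed Gbit rate brings it to just under one hour.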

13 Architecture Modification
Currently (MC-mode): 6-fold redundant master can control 300-10000 PCs
Split into subfarms to increase the I/O bandwidth

14 Subfarm Solution
Possible
  –But substantial development of system
MC is the biggest problem
  –But we still need to analyse the data
  –Where is the balance?

15 … so
Suggest following steps
  –Complete installation of the 6 MAP masters (COMPASS nodes), 0.5 TBytes each
      Output from jobs can be directed there
  –Increase disk capacity on existing nodes
      Purchase 300 20-GByte disks (about 50 kCHF), hopefully by Oct 1
      (Total disk capacity of 6 TBytes on MAP, 3 TBytes on COMPASS nodes)
  –Allow users to create persistent stores on MAP
  –Business (bazaar style) as usual…
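The capacity totals on this slide follow directly from the unit counts. A minimal sketch of the arithmetic, assuming decimal units (1 TByte = 1000 GBytes) and reading "50KChF" as 50,000 Swiss francs; the cost per GByte is derived here and is not quoted on the slide.

```python
# Disk capacity and cost arithmetic for the proposed purchase.
# Assumption: decimal units (1 TByte = 1000 GBytes).


def total_capacity_tb(n_units: int, gb_per_unit: float) -> float:
    """Combined capacity in TBytes of n_units disks or nodes."""
    return n_units * gb_per_unit / 1000


def cost_per_gb(total_cost: float, n_units: int, gb_per_unit: float) -> float:
    """Average cost per GByte for a bulk purchase."""
    return total_cost / (n_units * gb_per_unit)


if __name__ == "__main__":
    map_tb = total_capacity_tb(300, 20)      # 300 disks of 20 GBytes
    compass_tb = total_capacity_tb(6, 500)   # 6 masters of 0.5 TBytes
    print(f"MAP nodes:     {map_tb:.0f} TBytes")
    print(f"COMPASS nodes: {compass_tb:.0f} TBytes")
    # Derived figure, assuming 50 kCHF for all 300 disks.
    print(f"cost: {cost_per_gb(50_000, 300, 20):.2f} CHF per GByte")
```

The two totals reproduce the 6 TBytes and 3 TBytes stated on the slide.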

16 Further Improvements
Make MAP accessible!
  –Globus
  –Care required
      More users
      More store
      More management….
  –Package the software for distribution
      But does anybody want it?????
Hardware Upgrades
  –More nodes

17 Comments
Can any one system provide all the facilities and capabilities?
  –CPU, storage, data access, I/O?
How do institutes/regional centres really fit in?
  –Balance of politics and effectiveness
Lessons for 2004…

