SouthGrid August 2010 7
JET
Stable operation (SL5 WNs)
Could handle more opportunistic LHC work
1772 HS06, 1.5TB
Birmingham
Just purchased 40TB of storage
–Total storage rises to 10TB + 6*20TB + 2*40TB = 210TB in a week or two
Two new 64-bit servers
–(SL5) Site BDII + monitoring VMs
–(SL5) DPM head node
Everything (except mon) is SL5
Both clusters have dual lcg-CE/CreamCE front ends
Sluggish response/instabilities with GPFS on the shared cluster
–Installed a 4TB NFS-mounted file server for experiment software/middleware/user areas
Taken on someone else's proprietary (non SL5) smart phone. He couldn't get signal in there either.
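The Birmingham storage total can be checked with a short sketch. The variable names are illustrative, and reading "6*20" and "2*40" as six 20TB units and two 40TB units is an assumption; the slide gives only the numbers.

```python
# Figures as quoted on the Birmingham slide, all in TB.
# Names are hypothetical; the slide does not label the individual terms.
base_tb = 10
mid_units_tb = 6 * 20    # assumed: six units of 20TB
large_units_tb = 2 * 40  # assumed: two units of 40TB

total_tb = base_tb + mid_units_tb + large_units_tb
print(f"Total storage: {total_tb}TB")
```

The sum agrees with the 210TB quoted on the slide.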
Bristol LCG StoRM SE with GPFS, 102TB, 90% full of CMS data
StoRM developers are finishing testing 1.5.4 on SL5 64-bit and plan to provide 1.5.4 for both slc4 ia32 and sl5 x86_64 to Early Adopters this month (August).
Bristol is waiting for a stable, well-tested StoRM v1.5 SL5 64-bit release.
In the meantime Bristol's StoRM v1.3 (32-bit on SL4) is working very well!
On the 1Gbps network, getting good bandwidth utilisation
Servers (StoRM & gridftp) very responsive despite load:
Prior WN: Intel XEON 2.0GHz; Dec 2009 new WN: AMD 2.4GHz
Each AMD WN = 2 x 1TB drive, part of 1 disk = WN space
Dr Metson experimenting with HDFS using the rest of 1 disk + the 2nd disk, working with INFN on the possibility of StoRM on top of HDFS
Also experimenting with using Hadoop to process CMS data
In Other News...
Swingeing IT staff cuts being planned at U Bristol (and downgrades for those few remaining)
Started planning that SouthGrid will take over Bristol LCG Site Admin from April 2011
Consolidate & reduce PP servers so Astro admin can inherit
PP Staff will best-effort support Bristol AFS server (IS won't)
HDFS with StoRM
Bristol
Plan to try to run the CEs and other control nodes on virtual machines, using an identical setup to Oxford, to enable remote management.
The StoRM SE on GPFS will be run by Bob Cregan on site.
Cambridge
32 CPU cores installed April 2010, bought from GridPP3 tranche 2.
Server to host several virtual machines (BDII, Mon, etc.) just delivered.
Network upgraded last November to provide gigabit ethernet to all GRID systems.
Storage is still 140TB; CPU will be increased due to the purchase in the first point.
ATLAS production is the main VO running on this site.
Investigating current under-utilisation; possible accounting issues?
RALPP
We believe we are now through all the messing about with air conditioning, with our machine room now running on the refurbished/upgraded AC plant. Happy days, all except for the leaks shortly after they turned it on!
We've been running well below nominal capacity for most of this year, but are pretty much back now.
Joining with the Tier 1 for the tender process.
Testing Argus and glexec
RGMA and site BDII now moved to SL5 VMs
Working on setting up a test instance of dCache, working with the Tier 1, using Tier 2 hardware.
Oxford
Last 6 months: cluster running with very high utilisation.
Completed the tender for new kit and placed orders in July.
Unfortunately the orders had to be cancelled due to manufacturing delays on the particular motherboard we ordered and a pricing problem.
Now re-evaluating all suppliers with updated quotes.
New Argus server installed. (Report by Kashif)
–Installing Argus was easy, and configuring it was also OK once I understood the basic concept of policies. It still took me a considerable time because of a bug in Argus, which is partly due to the old style of host certificate issued by the UK CA. The same issue was responsible for the gridpp voms server problem. I have reported this to the UK CA.
–Argus uses glexec on the WN; this is being tested with the glexec installed on t2wn41.
–Details on the gridpp wiki: http://www.gridpp.ac.uk/wiki/Oxford
Oxford has become an early adopter for CREAM and ARGUS.
Oxford: Grid Cluster setup, CREAM CE & pilot setup
[Diagram] t2ce02 (CREAM, gLite 3.2, SL5); t2wn41 (glexec enabled); t2argus02; t2ce06 (CREAM, gLite 3.2, SL5); t2wn40-87
gridppnagios
Oxford runs the UKI Regional Nagios monitoring site. The Operations dashboard takes information from this.
https://gridppnagios.physics.ox.ac.uk/nagios/
https://twiki.cern.ch/twiki/bin/view/LCG/GridServiceMonitoringInfo
Oxford Dashboard
Thanks to Glasgow for the idea / code
Oxford's ATLAS dashboard
Conclusions
SouthGrid sites utilisation generally improving
Many had recent hardware upgrades using GridPP3 second tranche, others putting out tenders, some delays following issues with vendor at Oxford
RALPPD back to full strength following AC upgrade
Monitoring for production running improving
Concerns over reduced manpower at sites as we move into GridPP 4
Future Meetings
Look forward to GridPP 26 in Sheffield next April
If you look in the right places the views are as good as here in the Lakes.