RAL Tier 1/A Status. Martin Bly, RAL CSF. HEPiX-HEPNT, NIKHEF, May 2003.

CPU Farm – Existing Hardware
- 108 dual processors (450, 600 and 1GHz)
  - Up to 1GB RAM
  - Desktop towers on warehouse shelves
- 156 dual processor 1400MHz PIII
  - 133MHz FSB, 1GB RAM each
  - 1U rackmount, remote power switching
  - RedHat 7.2

New Hardware – Spring 2003
- Dual processor 1U rackmount units:
  - 2 x 2.66GHz P4, 533MHz FSB
  - Hyper-Threading
  - 2048MB memory
  - 2 x 1Gb/s NICs (onboard)
  - RedHat 7.3
  - 3 racks, remote power switching
- Next delivery expected Summer 2003

Operating Systems
- RedHat 6.2 service will close at the end of May
- RedHat 7.2 service has been in production for BaBar for 6 months
- New RedHat 7.3 service now available for LHC and other experiments
- Testing/benchmarking on the new Xeon systems
- The increasing demand for security updates is becoming problematic

Disk Farm – Existing Hardware
- 2002: 26 servers, each with 2 external RAID arrays, 1.7TB disk per server, RAID 5
  - Excellent performance, well balanced system
  - Problems with a bad batch of Maxtor drives: many failures and a high error rate; all 620 drives now replaced by Maxtor
  - Still outstanding problems with the Accusys controller failing to eject bad drives from the RAID set

Disk Farm – Spring 2003
Recent upgrade to the disk farm:
- 11 dual P4 Xeon servers (2.4GHz, 1024MB RAM, PCI-X), each with 2 Infortrend IFT-6300 arrays via Ultra160 SCSI
- 12 Maxtor 200GB DiamondMax Plus 9 drives per array, RAID 5
Not yet in production – a few snags:
- The originally tendered Maxtor MaXLine Plus II drive was found not to exist!
- The Infortrend array has a 2TB limit per RAID set – pushing for a firmware update
- 11 + 1 spare is better than 2 x 6 – 5GB over 11 systems
See Nick White for more.
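A back-of-envelope check (my own sketch, not from the slides) shows why the 2TB-per-RAID-set limit forces a choice of layout with 12 x 200GB drives; the assumed binary-TB limit value is a guess.

```python
# RAID 5 capacity sketch: one drive's worth of space goes to parity.
def raid5_usable_gb(n_drives: int, drive_gb: int) -> int:
    """Usable capacity of a RAID 5 set: (n - 1) * drive size."""
    return (n_drives - 1) * drive_gb

DRIVE_GB = 200        # Maxtor 200GB DiamondMax Plus 9
SET_LIMIT_GB = 2048   # Infortrend 2TB-per-set limit (assumed 2048GB)

# A single 12-drive RAID 5 set would exceed the controller limit:
full_set = raid5_usable_gb(12, DRIVE_GB)           # 2200 GB - too big
assert full_set > SET_LIMIT_GB

# The two layouts under discussion both fit within the limit:
two_by_six = 2 * raid5_usable_gb(6, DRIVE_GB)      # 2000 GB, two sets to manage
eleven_plus_spare = raid5_usable_gb(11, DRIVE_GB)  # 2000 GB, one set + hot spare
print(full_set, two_by_six, eleven_plus_spare)
```

Same usable capacity either way, but the 11 + 1 layout leaves a hot spare and only one RAID set per array to administer.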

New Projects
- Basic fabric performance monitoring (Ganglia)
- CPU resource accounting (based on PBS accounting records/MySQL)
- New CA in production
- New batch scheduler (MAUI)
- Deploy new helpdesk (May)

Ganglia
Urgently needed live performance and utilisation monitoring:
- RAL Ganglia Monitoring
- Scalable solution based on multicast
- Very rapidly deployable; reasonable support on all Tier1A hardware
See:
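Ganglia's gmond daemon publishes cluster state as XML (by default on TCP port 8649), which makes ad-hoc queries easy. A minimal sketch of reading one metric per host; the hostnames and XML snippet below are made up for illustration.

```python
# Sketch: pull one named metric per host out of gmond's XML cluster report.
import xml.etree.ElementTree as ET

def host_metric(xml_text: str, metric: str) -> dict:
    """Map each reporting host to the value of one named metric."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        for m in host.iter("METRIC"):
            if m.get("NAME") == metric:
                result[host.get("NAME")] = m.get("VAL")
    return result

# Cut-down example of the gmond XML format (hostnames invented):
sample = """<GANGLIA_XML VERSION="2.5.4" SOURCE="gmond">
  <CLUSTER NAME="Tier1A" LOCALTIME="1053340800">
    <HOST NAME="csfnode01" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="1.73" TYPE="float"/>
    </HOST>
    <HOST NAME="csfnode02" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="0.12" TYPE="float"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

print(host_metric(sample, "load_one"))
```

In production the XML would be read from a socket to the gmond/gmetad collector rather than a literal string.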

PBS Accounting Software
Need to keep track of system CPU and disk usage.
Home-grown PBS accounting package (Derek Ross):
- Upload PBS and disk stats into MySQL
- Process with a Perl DBI script
- Serve via Apache
Contact Derek for more.
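The site's package used Perl DBI; purely as an illustration of the first step, here is a Python sketch that sums per-user CPU time from PBS accounting records (the job IDs, usernames and server name are invented). PBS writes semicolon-delimited records of the form `date;record-type;jobid;key=value ...`, where the `E` (job end) records carry `resources_used` figures.

```python
# Sketch: aggregate CPU seconds per user from PBS accounting log lines,
# ready to be INSERTed into a MySQL table.
from collections import defaultdict

def cput_seconds(hms: str) -> int:
    """Convert an HH:MM:SS cput string to seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

def cpu_by_user(lines):
    """Parse PBS 'E' (job end) records and total cput per user."""
    totals = defaultdict(int)
    for line in lines:
        date, rtype, jobid, attrs = line.strip().split(";", 3)
        if rtype != "E":   # only completed-job records carry usage
            continue
        fields = dict(kv.split("=", 1) for kv in attrs.split())
        totals[fields["user"]] += cput_seconds(fields["resources_used.cput"])
    return dict(totals)

sample = [
    "05/20/2003 12:00:01;E;1001.csfmaster;user=babar01 resources_used.cput=01:00:00",
    "05/20/2003 12:05:09;S;1002.csfmaster;user=lhcb01",
    "05/20/2003 13:30:00;E;1002.csfmaster;user=lhcb01 resources_used.cput=00:30:00",
    "05/20/2003 14:00:00;E;1003.csfmaster;user=babar01 resources_used.cput=00:15:30",
]
print(cpu_by_user(sample))
```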

MAUI / PBS
The Maui scheduler has been in production for the last 4 months.
It allows extremely flexible scheduling with many features. But...
- Not all of it works: we have done much work with the developers on fixes.
- Major problem: MAUI schedules on wall clock time, not CPU time. Had to bodge it!
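The slide doesn't say what the bodge was; one plausible shape for it (purely illustrative, with made-up efficiency and padding figures) is to derive a wall-clock limit Maui can schedule on from the CPU-time limit a job actually requests.

```python
# Illustrative workaround: translate a job's CPU-time request into a
# wall-clock cap that a wall-clock-only scheduler can work with.
def wallclock_limit(cput_limit_s: int, cpu_efficiency: float = 0.8,
                    safety_factor: float = 1.25) -> int:
    """Estimate a wall-clock cap: CPU limit / typical efficiency, padded."""
    assert 0 < cpu_efficiency <= 1
    return int(cput_limit_s / cpu_efficiency * safety_factor)

# A job asking for 10 CPU-hours gets a padded wall-clock cap:
print(wallclock_limit(10 * 3600))
```

The cost of any such translation is that I/O-bound jobs get killed late and are charged generously, which is presumably why it counts as a bodge rather than a fix.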

New Helpdesk Software
The old helpdesk is unfriendly. With additional staff, we urgently need to deploy a new solution.
- Expect the new system to be based on free software, probably Request Tracker.
- Hope the deployed system will also meet the needs of the Testbed, and it may also satisfy Tier 2 sites.
- Expect deployment by the end of May.

Outstanding Issues / Worries
We have to run many distinct services:
- Fermi Linux
- RH 6.2/7.2/7.3...
- EDG testbeds, LCG...
Farm management is getting very complex; we need better tools and automation.
Security is becoming a big concern again.