
1 Cambridge Site Report
HEP SYSMAN, RAL, 10-11th June 2010
Santanu Das
Cavendish Laboratory, Cambridge
santanu@hep.phy.cam.ac.uk

2 Manpower:

For the group:
- John Hill, the main sysadmin
- Steve Wotton, deputy sysadmin
- Kavitha Nirmaladevi (half-time)
- Around 1.2 FTE of effort in total (parts of John's and Steve's time)
For the Grid system:
- Just myself

3 Group System:

Hardware/OS:
- Around 70 desktops (roughly 2:1 Linux:Windows)
- Mainly SLC 5.5 and Windows XP (still some SLC4 desktops)
- Servers are SLC4 at present, serving ~27 TB of storage
Present work:
- 18 TB of clustered storage to replace ~7 TB of the old storage mentioned above
- Migrating from a Windows 2000 to a Windows 2008 domain
- Buying ~35 TB of storage for LHC n-tuples (and equivalent)

4 Group System: Network

- Gigabit backbone with a 1 Gbps connection onto the University network
- 10 Gbps (unconfirmed) University connection onto JANET
Future plans:
- Nothing big
- Buy ~35 TB of storage for LHC n-tuples (and equivalent)
- Traffic levels are rising, though, and we may be forced to consider an upgrade

5 Grid Status [hardware]:

Head nodes:
- Dell PE-1050s (quad-core, 8 GB) for the CE, SE and UI
Worker nodes:
- 33 x Dell PE1950 (2 x dual-core Xeon 5150 @ 2.66 GHz; shared with CamGrid)
- 4 x Viglen (2 x quad-core Xeon E5420 @ 2.50 GHz)
- 4 x SunFire (2 x quad-core Xeon L5520 @ 2.27 GHz)
- 4 x Dell R410 (2 x quad-core Xeon E5540 @ 2.53 GHz)
Storage:
- 108 TB online (~100 TB reserved for ATLAS)
- ~30 TB being used by a local project; will be added to the Grid soon
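For reference, the capacity implied by the worker-node list works out as follows. A minimal sketch of the arithmetic, assuming one job slot per core (the slide itself does not state the slot count):

    # Worker-node inventory from the slide: (label, machines, sockets, cores per socket)
    nodes = [
        ("Dell PE1950 / 5150", 33, 2, 2),
        ("Viglen / E5420",      4, 2, 4),
        ("SunFire / L5520",     4, 2, 4),
        ("Dell R410 / E5540",   4, 2, 4),
    ]

    total = 0
    for label, machines, sockets, cores in nodes:
        n = machines * sockets * cores
        total += n
        print(f"{label:20s} {n:4d} cores")
    print(f"{'total':20s} {total:4d} cores")  # 132 + 32 + 32 + 32 = 228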

6 Grid Status [middleware]:

- gLite 3.2 for the WNs
- gLite 3.1 for the MON box and UI
- gLite 3.2 for the CE, SE and site-BDII
- DPM v1.7.2 on the head node
- DPM v1.7.3 on the DPM disk servers
- XFS file system for the storage
- Condor (v7.2.4) is used as the batch system
- Supported VOs: mainly ATLAS, LHCb and Camont
- Additional VO support: ALICE, Biomed, CALICE, CMS, dteam, euindia, gridpp and, of course, ops
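What the site-BDII actually advertises can be cross-checked against the versions listed above. A minimal sketch that shells out to ldapsearch; the BDII hostname is a placeholder, while port 2170 and base o=grid are the usual gLite conventions:

    import subprocess

    # Hypothetical site-BDII host (assumption): substitute the real machine.
    bdii = "site-bdii.example.org"

    # -x: simple bind, -LLL: bare LDIF output; ask only for the CE entries.
    result = subprocess.run(
        ["ldapsearch", "-x", "-LLL",
         "-H", f"ldap://{bdii}:2170",
         "-b", "o=grid",
         "(objectClass=GlueCE)",
         "GlueCEUniqueID", "GlueCEStateStatus"],
        capture_output=True, text=True,
    )
    print(result.stdout)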

7 Grid System: Network

- Gigabit backbone with a separate 1 Gbps connection onto the University network
- All the WNs are on the gigabit network
- 10 Gbps (unconfirmed) University connection onto JANET
Future plans:
- Nothing big
- Buy ~35 TB of storage for LHC n-tuples (and equivalent)
- But traffic levels are rising, and we may be forced to consider an upgrade
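Since rising traffic is the trigger for any upgrade, it is worth sampling actual link utilisation now and then. A minimal sketch that reads the kernel's interface counters; the interface name eth0 and the 5-second window are assumptions:

    import time

    def byte_counters(iface="eth0"):
        # /proc/net/dev: after "iface:" come 8 receive fields then 8 transmit
        # fields; rx_bytes is the first and tx_bytes the ninth.
        with open("/proc/net/dev") as f:
            for line in f:
                if line.strip().startswith(iface + ":"):
                    fields = line.split(":", 1)[1].split()
                    return int(fields[0]), int(fields[8])
        raise ValueError(f"interface {iface} not found")

    rx1, tx1 = byte_counters()
    time.sleep(5)
    rx2, tx2 = byte_counters()
    print(f"rx {(rx2 - rx1) * 8 / 5 / 1e6:7.1f} Mbit/s")
    print(f"tx {(tx2 - tx1) * 8 / 5 / 1e6:7.1f} Mbit/s")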

8 Grid System [issues]:

- The middleware is too buggy for Condor
- No proper/practical support, yet
- Almost all of the previously written scripts are no longer maintained
- Most of the "info-provider" scripts have been rewritten/modified locally (a sketch of the idea follows below)
- Every new release breaks the Condor-gLite integration
- Cannot use YAIM on the CE
- Spending too much time fixing gLite scripts rather than trying new things
- GridPP4 money, of course
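To illustrate the kind of locally rewritten info-provider mentioned above: a minimal sketch of a dynamic plugin that asks Condor for job counts and emits GLUE 1.3 state attributes as LDIF. The condor_q summary-line format and the CE DN are assumptions; the slide gives no detail of the real local scripts:

    import re
    import subprocess

    def condor_totals():
        # condor_q ends with a summary like: "42 jobs; 10 idle, 30 running, 2 held"
        out = subprocess.run(["condor_q"], capture_output=True, text=True).stdout
        m = re.search(r"(\d+) jobs; (\d+) idle, (\d+) running, (\d+) held", out)
        if m is None:
            raise RuntimeError("could not parse condor_q summary line")
        return tuple(int(x) for x in m.groups())  # total, idle, running, held

    total, idle, running, _held = condor_totals()

    # Placeholder CE endpoint (assumption); the real DN would come from the
    # static LDIF that this dynamic output gets merged into.
    print("dn: GlueCEUniqueID=ce.example.org:2119/jobmanager-condor-atlas,"
          "mds-vo-name=resource,o=grid")
    print(f"GlueCEStateRunningJobs: {running}")
    print(f"GlueCEStateWaitingJobs: {idle}")
    print(f"GlueCEStateTotalJobs: {total}")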

9 Grid System [plans]:

- Upgrade Condor
- More job slots and disk space (see the sketch below for a quick way to count what is currently advertised)
- Condor on the CREAM CE
- Install SCAS and glExec
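Progress on "more job slots" can be tracked by counting what the Condor collector currently advertises. A minimal sketch; condor_status -format is standard Condor, and grouping per machine is just for readability:

    import subprocess
    from collections import Counter

    # One output line per slot, each carrying its machine name.
    out = subprocess.run(
        ["condor_status", "-format", "%s\n", "Machine"],
        capture_output=True, text=True,
    ).stdout

    slots = Counter(out.split())
    for machine, n in sorted(slots.items()):
        print(f"{machine:30s} {n:3d} slots")
    print(f"total: {sum(slots.values())} slots")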

10 Questions?

