Cluster Configuration Update Including LSF Status Thorsten Kleinwort for CERN IT/PDP-IS HEPiX I/2001 LAL Orsay Tuesday, December 08, 2015.

1 Cluster Configuration Update Including LSF Status Thorsten Kleinwort for CERN IT/PDP-IS HEPiX I/2001 LAL Orsay Tuesday, December 08, 2015

2 8 December 2015 Thorsten Kleinwort IT/PDP/IS
Cluster Configuration Update and LSF Status
Function, Software, Hardware, Management, Cluster Configuration

4 Function
CERN IT/PDP-IS responsible for:
  Central Unix-based batch & interactive platforms: LXPLUS, LXBATCH, RSPLUS, DXPLUS, HPPLUS
  Installation, maintenance & support
  Dedicated clusters for several experiments (batch & interactive):
    Different setups, different HW, user mgmt…
    Individual configurations

5 Function

6 Function
LEP Experiments:
  ‘Old’ experiments, all kinds of legacy platforms: leave until 2003, freezing earlier not practical
Non-LEP Experiments:
  Transition to Linux/Solaris ASAP
Merge experiment clusters into LXBATCH/LXPLUS:
  Reduce diversity
  More efficient use of shared resources

8 Software
In the past: all Unix flavours
Now: mainly Linux (RedHat)
Solaris as 2nd platform:
  Check software for platform dependencies
  Enhanced debugging/development tools on Solaris
AFS for software/homedir/scratch
  Started recently to investigate OpenAFS
RFIO for data access: we want to avoid NFS

9 Software: Installation
Kickstart & Jumpstart (Linux & Solaris): for basic system installation
SUE: for post-installation & configuration
ASIS: for software installation in /usr/local: now whole ASIS (~3GB) is local
LSF

10 Software: Batch
LSF with Multicluster option:
  Interactive nodes: submission hosts (cluster)
  Batch nodes: execution hosts (cluster)
  Some interactive nodes have night/weekend queues
On public cluster (LXBATCH):
  Dedicated resources for experiments
Some clusters are “cross linked”, e.g. submission from a dedicated cluster to LXBATCH
Open question of scalability

11 Software: LSF Multicluster
Submit Cluster: LXPLUS, CMS_CLUSTER
Execution Cluster: LXBATCH, CMS_BATCH
Queue: 1nd, cms_1nd, cms_queue
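The submit/execution split on this slide can be illustrated with a small lookup model. This is only a sketch, not LSF configuration syntax: the cluster and queue names come from the slide, but which queue forwards to which execution cluster is an assumption (the transcript does not preserve the arrows of the diagram), and `Route` and `resolve` are hypothetical helper names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    """One forwarding rule: jobs in a submit-side queue run on an execution cluster."""
    submit_cluster: str
    queue: str
    exec_cluster: str


# ASSUMPTION: the pairing of queues and clusters below is illustrative only.
ROUTES = [
    Route("LXPLUS", "1nd", "LXBATCH"),
    Route("CMS_CLUSTER", "cms_queue", "CMS_BATCH"),
    # "Cross linked" case: a dedicated cluster submitting to the public one.
    Route("CMS_CLUSTER", "cms_1nd", "LXBATCH"),
]


def resolve(submit_cluster: str, queue: str) -> Optional[Tuple[str, str]]:
    """Return (execution cluster, queue) for a submission, or None if not routed."""
    for route in ROUTES:
        if route.submit_cluster == submit_cluster and route.queue == queue:
            return (route.exec_cluster, route.queue)
    return None


if __name__ == "__main__":
    print(resolve("LXPLUS", "1nd"))
    print(resolve("CMS_CLUSTER", "cms_1nd"))
    print(resolve("LXPLUS", "cms_queue"))
```

A table-driven model like this makes the scalability question from the previous slide concrete: every cross link adds one more rule that has to be kept consistent on both the submit and the execution side.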

12 Software: Batch
Shared batch facility requirements:
  If a dedicated resource is unused, it should be available to others
  On the other hand, dedicated nodes must be allocated ASAP when needed
  Queues/resources should be controlled by UNIX groups rather than users, to handle the huge number of frequently changing users
“Wish list” for LSF in preparation, to send to Platform Computing
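The three requirements above can be sketched as a toy placement policy. This is a minimal illustration and not how LSF implements it: `Node`, `pick_node`, and `nodes_to_reclaim` are hypothetical names, and the group names in the usage example are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    owner_group: Optional[str]            # UNIX group the node is dedicated to; None = public
    running_group: Optional[str] = None   # group of the job currently running; None = idle


def pick_node(nodes: List[Node], job_group: str) -> Optional[Node]:
    """Pick an idle node for a job, keyed on the submitter's UNIX group.

    Preference order: the group's own dedicated nodes, then public nodes,
    then idle nodes dedicated to other groups (borrowed capacity).
    """
    idle = [n for n in nodes if n.running_group is None]

    def rank(node: Node) -> int:
        if node.owner_group == job_group:
            return 0
        if node.owner_group is None:
            return 1
        return 2

    return min(idle, key=rank, default=None)


def nodes_to_reclaim(nodes: List[Node], owner_group: str) -> List[Node]:
    """Dedicated nodes currently lent out that the owning group may claim back."""
    return [
        n for n in nodes
        if n.owner_group == owner_group
        and n.running_group not in (None, owner_group)
    ]


if __name__ == "__main__":
    farm = [Node("n1", "cms"), Node("n2", None), Node("n3", "atlas")]
    first = pick_node(farm, "lhcb")      # public node preferred
    first.running_group = "lhcb"
    second = pick_node(farm, "lhcb")     # now borrows a dedicated node
    second.running_group = "lhcb"
    print(first.name, second.name, [n.name for n in nodes_to_reclaim(farm, "cms")])
```

Keying the policy on groups rather than user names means that adding or removing a user only changes group membership, not the scheduling configuration.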

14 Hardware
All kinds of legacy HW in clusters: IBM, SGI, DEC, HP…
Now concentrating on Intel PCs running Linux (on both client & server side)
Sun (Solaris) as 2nd HW platform:
  Building development cluster SUNDEV
RISC decommissioning in progress

15 Hardware: RISC Decommissioning

16 Hardware: Intel PC
Still using boxes:
  Financial rules & difficult TCO definition for rack-mounted solutions
  But plans to go to rack-mounted solutions in the future
Intel PCs: differences in each offer (1 or 2 disks; 2, 4, 8, 12, 20, 30 GB)
Experiments buying equipment: broadens diversity

17 Hardware

18 Hardware
On the server/service side:
  Going from RISC/SCSI to Intel/EIDE:
    Mirrored 1.5 TB disk servers (20 x 75 GB EIDE disks)
    Testing RAID 5
  All tape services are now on PCs
  AFS servers are now on Suns:
    Experimenting with AFS scratch on Linux
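The disk-server figures can be checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 1000 GB), reads the 1.5 TB as the raw capacity of the 20 disks, and assumes a single RAID 5 group with one disk's worth of parity; the slide does not state any of this, and the function names are hypothetical.

```python
def raw_gb(disks: int, size_gb: int) -> int:
    """Total capacity of all disks, before any redundancy."""
    return disks * size_gb


def mirrored_usable_gb(disks: int, size_gb: int) -> int:
    """Usable capacity when disks are mirrored in pairs (half the raw capacity)."""
    return (disks // 2) * size_gb


def raid5_usable_gb(disks: int, size_gb: int, groups: int = 1) -> int:
    """Usable capacity with RAID 5: one disk's worth of parity per group."""
    return (disks - groups) * size_gb


if __name__ == "__main__":
    print(raw_gb(20, 75))              # 1500 GB, i.e. 1.5 TB raw
    print(mirrored_usable_gb(20, 75))  # 750 GB usable when mirrored
    print(raid5_usable_gb(20, 75))     # 1425 GB usable as one RAID 5 group
```

Under these assumptions, moving from mirroring to RAID 5 would nearly double the usable space on the same hardware, which is one plausible reason for testing it.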

20 Management
Currently:
  Merging clusters into LXPLUS/LXBATCH
  Aligning individual setups into global ones
  Continue RISC decommissioning:
    Restrict usage to LEP Experiments
    Transferring users to public facilities
  Face rapidly growing number of clients
  Automate & optimise

21 Management
Starting Testbed (Intel/Linux dual PCs):
  In 2000: ~100 machines
  In 2001: ~200 machines
In addition:
  LHC test facility
  Testbed for the DataGrid Project
It will grow over the next two years to reach a significant fraction of the LHC scale by 2003

22 Testbed Schedule

23 Management
Collaboration with DataGrid:
  WP4 (Computing Fabric):
    Installation Task
    Configuration Task
    Monitoring Task
  We contribute to WP4 and want to benefit from it
Talk by Philippe Defert on DataGrid

24 Management
New internal projects started:
  User account management:
    “How to manage /etc/passwd, /etc/groups,…”
    Investigate central service (LDAP)
  Accounting:
    How to control access & usage of shared facilities by different groups
  Security:
    Increase the host-based security by checking the integrity of the system
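One common way to check system integrity on a host is to record checksums of key files and compare them later. The sketch below illustrates that general idea only; the slide does not say which tool or method the project uses, and `file_digest`, `snapshot`, and `changed` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def file_digest(path: str) -> str:
    """SHA-256 of a file's contents, read in chunks to keep memory use small."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(paths: Iterable[str]) -> Dict[str, str]:
    """Map each existing regular file to its digest; missing paths are skipped."""
    return {str(p): file_digest(str(p)) for p in paths if Path(p).is_file()}


def changed(baseline: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Paths that were added, removed, or modified since the baseline."""
    all_paths = baseline.keys() | current.keys()
    return sorted(p for p in all_paths if baseline.get(p) != current.get(p))


if __name__ == "__main__":
    watched = ["/etc/passwd", "/etc/group"]   # example paths only
    baseline = snapshot(watched)
    # ... later, e.g. from a periodic job ...
    print(changed(baseline, snapshot(watched)))
```

For the check to be meaningful, the baseline has to be stored somewhere the monitored host cannot modify, for example on a central server.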

25 Outlook
Reducing diversity of HW/SW
Continue merging of clusters
Facing growing number of PCs
Starting internal projects
Benefit from DataGrid WP4
Going for LHC: prepare now to be ready when it starts

