“Managing a farm without user jobs would be easier”: Clusters and Users at CERN. Tim Smith, CERN/IT
2002/10/25, HEPiX fall 2002. Contents: The road to shared clusters; Batch cluster (configuration, user challenges, addressing the challenges); Interactive cluster (load balancing); Conclusions
The Demise of Free Choice
Cluster Aggregation
Organisational Compromises. Clusters per group: sized for the average (users); sized for user peaks (users, financiers: wasted resources); invest effort in recuperating cycles for other groups; configuration differences / specialities. Bulk production clusters: production fluctuations dwarf those in user analysis; complex cross-submission links.
Production Farm: Planning
Shared Clusters [diagram: lxplus001 (DNS load balancing), lxbatch001 (LSF), disk001 and tape001 (rfio); labels: "Batch Servers 70 Interactive Servers 120 Disk Servers"]
Simple, Uniform Shared Cluster?
Partitioning. Still have identified resources; uniform configuration. Sharing: repartitioning or soak-up queues. If the owner experiment reclaims its resources, the soak-up jobs must be suspended, which leaves stranded jobs. Partitions: ALICE, ATLAS, CMS, LHCb, ALEPH, DELPHI, L3, OPAL, COMPASS, Ntof, OPERA, SLAP, PARC, PARC Int, CVS, BUILD, DELPHI Int, CSF, Public.
LSF Fair-Share. Trade in a partition for a share. Multilevel: ATLAS 10%, CMS 12%, …; cmsprod 45%, HiggsWG 15%, …; usera 10%, userb 80%, userc 10%. Extra shares for productions. Effort: from juggling resources to accounting, demonstrating fairness, protecting, policing.
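The multilevel shares above can be read as fractions that multiply down the tree. The following is a minimal sketch of that arithmetic only, not of LSF's actual scheduling, which also weighs recent usage. The tree layout is an assumption: the slide does not say which experiment or group each lower level belongs to, so `cmsprod` and `HiggsWG` are placed under CMS and the three users under `HiggsWG` purely for illustration.

```python
# Hypothetical share tree using the slide's example percentages.
# Each entry maps a name to (fraction of the parent, children).
# The parent/child placement is an assumption, not taken from the slide.
SHARES = {
    "ATLAS": (0.10, {}),
    "CMS": (0.12, {
        "cmsprod": (0.45, {}),
        "HiggsWG": (0.15, {
            "usera": (0.10, {}),
            "userb": (0.80, {}),
            "userc": (0.10, {}),
        }),
    }),
}


def effective_share(tree, path):
    """Multiply the fractions along path, e.g. ("CMS", "HiggsWG", "userb")."""
    share = 1.0
    level = tree
    for name in path:
        fraction, children = level[name]
        share *= fraction
        level = children
    return share
```

Under these assumptions, `effective_share(SHARES, ("CMS", "HiggsWG", "userb"))` is 0.12 × 0.15 × 0.80, i.e. about 1.44% of the whole cluster.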
Facts and Figures. Accounting: LSF job records; process with C program; load into Oracle DB; prepare plots/tables with Crystal Reports package; LSFAnalyser? Monitoring: poll the user access tools; SiteAssure?
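The accounting chain above (job records, processing, per-period tables) can be illustrated with a small aggregation. This is a sketch in Python rather than the C program and Oracle database the slide mentions, and the record layout `(group, finish_epoch_seconds, cpu_seconds)` is a hypothetical simplification, not the real LSF job record format.

```python
from collections import defaultdict
from datetime import datetime, timezone


def cpu_time_per_week(records):
    """Sum CPU seconds per (ISO year, ISO week, group).

    records is an iterable of (group, finish_epoch_seconds, cpu_seconds)
    tuples; this layout is an assumption made for the example.
    """
    totals = defaultdict(float)
    for group, finished, cpu_seconds in records:
        # Bucket each job by the ISO week in which it finished (UTC).
        iso = datetime.fromtimestamp(finished, tz=timezone.utc).isocalendar()
        totals[(iso[0], iso[1], group)] += cpu_seconds
    return dict(totals)
```

The resulting dictionary is the kind of table from which a CPU-time-per-week plot could be drawn.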
CPU Time / Week: merged user analysis and production farms
Performance of Batch Job Slot Analysis [plot: time axis Thu, Fri, Sat; 10 min / tick]
Challenging Batch (I). Probing boundaries: flooding; concurrent starts; uncontrolled status polling. Hitting limits: disk space (/tmp, /pool, /var); memory, swap full; guarantees for other user jobs? System issues: queue drainers.
Challenging Batch (II). Un-fair-share: logging onto batch machines; batch jobs which resubmit themselves; forking sessions back to remote hosts. Wasting resources: spawning processes which outlive the jobs; sleeping processes; copying large AFS trees; establishing connections to dead machines.
Counter Measures. File system quotas; virtual memory limits; concurrent job limits per user/group; restricted access through PAM; instant response queues; master node setup (dedicated, 1 GB memory, failover cluster).
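As an illustration of the concurrent job limits per user/group listed above, here is a minimal admission check. It is a sketch, not how LSF enforces its limits; the limit values and the `(user, group)` representation of running jobs are hypothetical, since the slide gives no numbers.

```python
from collections import Counter

# Hypothetical limits; the slide does not state actual values.
USER_LIMIT = 2
GROUP_LIMIT = 3


def admit(running, user, group, user_limit=USER_LIMIT, group_limit=GROUP_LIMIT):
    """Return True if one more job for (user, group) may start.

    running is a list of (user, group) pairs, one per running job.
    """
    per_user = Counter(u for u, _ in running)
    per_group = Counter(g for _, g in running)
    # A job starts only if both the user and the group are below their caps.
    return per_user[user] < user_limit and per_group[group] < group_limit
```

Checking both counters means a single user cannot exhaust the group's slots, and a large group cannot crowd out the rest of the farm.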
Shared Clusters: LSF MultiCluster [diagram: lxplus001 (DNS load balancing), lxbatch001 (LSF), disk001 and tape001 (rfio); labels: "Batch Servers 70 Interactive Servers 120 Disk Servers"]
Shared Clusters: Single Cluster [diagram: lxplus001 (DNS load balancing), lxbatch001 (LSF), disk001 and tape001 (rfio); labels: "Batch Servers 70 Interactive Servers 120 Disk Servers"]
Interactive Cluster. DNS load balancing (ISS). Weighted load indexes: load, memory swap rate, disk IO rate, # processes, # sessions, # window manager sessions. Exclusion thresholds: file systems full, nologins. DNS publishes 2 nodes every 30 seconds, chosen at random from the 5 with the lowest load.
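The selection scheme described above (weighted indexes, exclusion thresholds, two nodes drawn at random from the five least loaded) can be sketched as follows. This is not the ISS implementation: the weights, metric names, and node dictionary layout are all hypothetical, since the slide lists the indexes but not how they are combined.

```python
import random

# Hypothetical weights; the slide names the indexes but not their weights.
WEIGHTS = {
    "load": 1.0,
    "swap_rate": 0.5,
    "disk_io_rate": 0.5,
    "processes": 0.01,
    "sessions": 0.1,
    "wm_sessions": 0.2,
}


def score(metrics):
    """Weighted sum of the load indexes; lower means less loaded."""
    return sum(weight * metrics.get(name, 0.0) for name, weight in WEIGHTS.items())


def eligible(node):
    """Apply the exclusion thresholds: full file systems or nologin set."""
    return not node.get("fs_full", False) and not node.get("nologin", False)


def choose_published(nodes, pool_size=5, publish=2, rng=random):
    """Pick `publish` node names at random from the `pool_size` least loaded."""
    candidates = sorted(
        (node for node in nodes if eligible(node)),
        key=lambda node: score(node["metrics"]),
    )
    pool = candidates[:pool_size]
    return [node["name"] for node in rng.sample(pool, min(publish, len(pool)))]
```

Calling `choose_published` once per 30-second interval would mirror the publishing cadence on the slide. Drawing at random from a small pool, rather than always publishing the single least loaded node, avoids sending every new login to the same machine between updates.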
Daily Users: 35 users / node
Challenging Interactive. Sidestep load balancing; parallel sessions across farm; running daemons; brutal logouts: open connections, defunct processes, CPU-sapping orphaned processes. Monitoring + beniced + monthly reboots.
Interactive Reboots
Conclusions. Shared clusters present more user opportunities, both good and bad! They don’t represent a panacea for sysadmins!