Willkommen Welcome Bienvenue How we work with users in a small environment Patrik Burkhalter
How we work with users in a small environment Patrik Burkhalter System administrator HPC cluster at Empa At Empa since 2012 Linux system admin before Empa (mainly web, db and app servers)
Situation at Empa Agenda Situation at Empa Cluster support User support Enforcement
Situation at Empa At the moment, we have 2 clusters at Empa: Ipazia, the cluster we have had since 2006, and Hypatia, the new cluster we built this year. The computing nodes of the old cluster will be detached from Ipazia and connected to Hypatia step by step.
Situation at Empa Ipazia, the Empa HPC cluster: 102 nodes (Dell); built with the help of Partec and CSCS; Parastation cluster middleware from Partec; Torque resource manager; Maui scheduler; Infiniband DDR interconnect; Lustre file systems
Situation at Empa Ipazia hardware Front end node: PowerEdge, Intel(R) Xeon(R) CPU 2.33GHz (4 cores), 4GB RAM, 1TB shared /home. Computing nodes: Nodes …: deactivated, old 4-core pizza boxes; Nodes …: PowerEdge M605, 2 * Quad-Core AMD Opteron(tm) Processor, … GB RAM; Nodes 47…102: PowerEdge M610, 2 * Intel(R) Xeon(R) CPU 2.53GHz, 24GB RAM
Situation at Empa Hypatia, the new Empa cluster: built from scratch by Empa; 32 nodes in 2 Dell M1000e chassis; Torque resource manager; Maui scheduler; Infiniband FDR interconnect; Lustre file systems; in-house know-how (we have support for the SAN units); well documented; in production. Nodes from Ipazia will be migrated to Hypatia soon.
Situation at Empa Hypatia hardware Front end node: PowerEdge R620, 2 * Intel(R) Xeon(R) CPU E GHz (16*2 cores, hyper-threading), 32GB RAM. Computing nodes: PowerEdge M620, 2 * Intel(R) Xeon(R) CPU E GHz (16 cores), 64GB RAM
Situation at Empa pbstop on Ipazia
Situation at Empa pbstop on Hypatia (new cluster)
Situation at Empa Lustre storage available to both clusters: 25TB for backed-up data (/project); 35TB speed-optimized space (/scratch), fast due to the large number of disks
Situation at Empa We changed our support model this year from external support to in-house support. Why did we do this? We felt confident that it is possible; we can save money on the service contracts; we can now fix (almost) everything ourselves; we can provide better user support because we have a deeper understanding. How did we minimize the risk of breaking the cluster? We built a new cluster and left the running cluster alone; a lot of users are already using the new cluster; we can migrate the nodes to the new cluster once its stability is proven.
Situation at Empa Ipazia: pizza nodes removed; 2 new chassis; 1 new front end; 1 new SSD storage
Situation at Empa Support team Daniele Passerone (5% FTE) Carlo Pignedoli (5% FTE) Patrik Burkhalter (50% FTE)
Cluster Support Agenda Situation at Empa Cluster support User support Enforcement
Cluster Support Support we provide: introduction to basic Linux usage (connecting to the system using an SSH client, basic Linux commands, file system hierarchy); introduction of new users to the cluster; planning of future jobs; reservation of nodes for users; installation, compilation and testing of new software (GNU and Intel compilers, MPI (openmpi/mvapich2), OpenFOAM, Abaqus, and any software requested by users); system updates (hardware, OS, software updates); acquiring and installing new hardware (new nodes, GPU node); replacing failed hardware
Cluster Support Documentation of the cluster architecture
Cluster Support Documentation of the cluster usage
Cluster Support Lustre file system maintenance and extension At the moment, we are migrating our Lustre file systems workspc and storage to project and scratch while the file systems stay online. project is a completely new file system on new hardware, with SSDs for the metadata target (MDT).
Cluster Support Lustre file system maintenance and extension scratch is a new file system built out of the file systems workspc and storage: We deactivate one OST per file system on the old file systems; we use `lfs find` to find the files that have stripes on the deactivated OSTs; we copy the files to a new location on the same file system; finally, we move them back to their original location.
Cluster Support The OST gets disabled temporarily on the ionode. This makes sure the OST stays readable. lctl dl | grep ' osc ' lctl --device <devno> deactivate
Cluster Support Migration for files with an access time > 14 days Copies quickly but is kind of dirty (word splitting breaks on file names containing spaces)
TMPDIR="/mnt/storage/tmp"
# -atime +14: only touch files not accessed for more than 14 days
for i in $(lfs find --obd storage-OST… -atime +14 /mnt/storage); do
    DIR=$(dirname "$i")
    FILE=$(basename "$i")
    TMPPATH="$TMPDIR/$FILE"
    SRCPATH="$DIR/$FILE"
    echo -en "$SRCPATH: "
    # copy off the deactivated OST, then move back in place;
    # abort on the first failed copy or move
    cp -p "$SRCPATH" "$TMPPATH" || exit 1
    mv "$TMPPATH" "$SRCPATH" || exit 1
    echo done
done
Cluster Support Migration for newer files Checks whether a file was changed during the migration process, but does not check whether a file is open on another node. Therefore we only touch users who have no jobs and no running processes on the front end node. lfs find --obd storage-OST0003 /mnt/storage/pbu | lfs_migrate -y
Cluster Support After the migration, the OST gets deactivated permanently: lctl conf_param storage-OST0003.osc.active=0
Cluster Support Situation after the migration
Cluster Support Problems we experienced during the migration: large numbers of small files are slow to migrate; users tend to "hoard" data
Cluster Support We also provide several shell environments for the users to ease cluster usage. We are using the Modules environment. A module can be loaded with the command `module load <app>/<version>`. The module sets the user environment variables as defined in the modulefile. We provide a module for each self-compiled app and library. This is particularly handy for users who like to compile their own software. We started to use this approach this year.
Cluster Support Modules on Ipazia
Cluster Support Modules on Hypatia New modules get installed on user request
Cluster Support Example output of a module A simple module for ffmpeg We are trying to get rid of LD_LIBRARY_PATH and use RPATH instead. This makes sure that a compiled binary uses the proper libraries independently of the user environment. The module concept was new to our users but was well accepted.
User Support Agenda Situation at Empa Cluster support User support Enforcement
User Support Users from Empa and Eawag: ~120 users, 40 active users in the last 30 days last | awk '{print $1}' | sort | uniq | wc -l
User Support A typical vendor-to-customer relationship does not work at Empa: we cannot provide a Service Level Agreement (SLA); we can only provide support on a best-effort basis; no support during the night or on weekends; unplanned downtime can happen
User Support Typical IT user support does not work either: we cannot offer out-of-the-box solutions; we don't like to "just solve the problem now"; we often don't know the solution right away
User Support Treating the user as a partner works best for us. The user gets treated as an equal. "If you think your users are idiots, only idiots will use it." Linus Torvalds
User as a Partner The user has strong scientific know-how and sometimes just uses the software. The engineer has strong know-how about clusters, but this means: a request by a scientist has to be reduced to the point at which the engineer is able to understand it; the problem gets fixed by the engineer; the solution gets communicated to the scientist in detail, until the scientist understands the particular situation; it gets tested by the user. It is important that each side understands the issue, otherwise potential optimizations of the system get lost.
User as a Partner If a user is experienced, tasks get delegated to the user. This could be: compilation of apps and libraries; testing of a new package; problem analysis. The solution always gets deployed by root to make sure all standards are fulfilled: if it is in the repository of our Linux distribution, it gets installed using the package manager; if it is too old or not available, it gets compiled and installed in /share/apps or /share/libs. Modules are provided to set the user environment (module load <app>/<version>). Our software gets compiled on a computing node and installed on the shared file system.
User as a Partner Example: Abaqus, a Finite Element Method (FEM) software used by the mechanical systems engineering department of Empa. The users have a strong background in mechanical engineering and use Abaqus on Windows to engineer parts. We made a wrapper to simplify job submission.
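The wrapper itself is not shown on the slides; as a sketch, such a submission wrapper for Torque could generate a PBS job script from a few arguments. The queue name, resource defaults, and the file name abaqus_job.pbs are hypothetical choices, and the qsub call is left commented out so the sketch runs standalone:

```shell
#!/bin/sh
# Sketch of an Abaqus submission wrapper for Torque.
# Usage: abaqus-submit <jobname> [nodes] [ppn]
# Queue name, defaults and paths are hypothetical examples.
JOBNAME=${1:-demo}     # Abaqus input deck name without .inp
NODES=${2:-1}
PPN=${3:-16}

cat > abaqus_job.pbs <<EOF
#!/bin/sh
#PBS -N $JOBNAME
#PBS -q batch
#PBS -l nodes=$NODES:ppn=$PPN
#PBS -j oe
cd \$PBS_O_WORKDIR
module load abaqus
abaqus job=$JOBNAME cpus=$((NODES * PPN)) interactive
EOF

# qsub abaqus_job.pbs   # submission disabled in this sketch
cat abaqus_job.pbs
```

A wrapper like this hides the Torque directives from users who know Abaqus from Windows, so they only have to name their input deck and the size of the job.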
Enforcement Agenda Situation at Empa Cluster support User support Enforcement
At the moment, we only enforce the following: obviously, the root password is not given to the users; disk quotas are in place (size and inodes); Maui scheduling configuration. Optimization is planned for Hypatia, the new cluster.
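On Lustre, size and inode quotas of the kind mentioned above are managed with `lfs setquota` and inspected with `lfs quota`. The user name and the limits below are hypothetical, and the commands are only echoed so the sketch runs without a Lustre client (drop the echo on a real system):

```shell
# Sketch: Lustre size and inode quotas for one user.
# User name, limits and mount point are illustrative only.
USER_NAME=jdoe
FS=/project

{
  # size quota: 200 GB soft / 250 GB hard (newer lfs accepts the G suffix)
  echo lfs setquota -u "$USER_NAME" -b 200G -B 250G "$FS"
  # inode quota: 1,000,000 soft / 1,200,000 hard
  echo lfs setquota -u "$USER_NAME" -i 1000000 -I 1200000 "$FS"
  # report the user's current usage against the limits
  echo lfs quota -u "$USER_NAME" "$FS"
} > quota_cmds.txt

cat quota_cmds.txt
```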
Enforcement The login screen provides some information to make the users aware of the cluster situation
Thanks for listening Any questions, thoughts?