
1 Universität Karlsruhe (TH) Rechenzentrum
How to use the System
SSCK Workshop – Introduction to HP XC6000 Cluster
Karlsruhe, March 9 – 11, 2005
Hartmut Häfner, SSCK, Universität Karlsruhe (TH)
haefner@rz.uni-karlsruhe.de

2 SSCK Workshop, Karlsruhe, March 9, 2005 page 2 Universität Karlsruhe (TH) Rechenzentrum Interactive Login

3 Available Services (1/2)
[Figure labels: HWW-Firewall, XC1, ssh (scp), passive ftp]
» No print manager
» No exported file system

4 Available Services (2/2)
» Login to HP XC6000 Cluster
  ssh <username>@hwwxc1.hww.de
» or within University Karlsruhe
  ssh <username>@xc1.rz.uni-karlsruhe.de
» SSH2 from RZ-administered workstations
  ssh2 -p 22 <username>@hwwxc1.hww.de

5 File Systems (1/2)
[Figure: file system layout; labels: 10 TB, Quadrics QsNet II (single rail), FC Network, $TMP, $HOME, $WORK]

6 File Systems (2/2)

environment variable   global/local   permanent/temporary   quotas              backup
$HOME                  global         permanent             no, but monitored   yes
$WORK                  global         one week              no
$TMP                   local          temporary             no

» global – all nodes access the parallel file system HP SFS, based on Lustre
» local – each node has its own file system
» permanent – files are stored permanently
» temporary – files are removed at end of job or session

7 Moving Files (HP XC Workstations)
» Either by the command scp or by passive ftp
  scp <username>@ws.institute.uni-karlsruhe.de:mydata $HOME
  ftp ws.institute.uni-karlsruhe.de

8 Module Concept
» module is a user interface to the Modules package.
» Typically, modulefiles instruct the module command to set or alter environment variables like PATH, MANPATH, ….
» Syntax is: module [switches] [sub-command] [modulefile…|path…|directory…]
» Important switches are:
  – --force, -f: Force active dependency resolution. Modules named in a prereq command inside a modulefile are then loaded automatically.
  – --verbose, -v: Enable verbose messages during module command execution.
Further switches control the amount of output of the module command.

9 Modules (1/2)
» module help [modulefile...]
  Print the usage of each subcommand. If an argument is given, print the module-specific help information for the modulefile.
» module add|load modulefile [modulefile...]
  Load modulefile into the shell environment.
» module unload|rm modulefile [modulefile...]
  Remove modulefile from the shell environment.
» module switch|swap modulefile1 modulefile2
  Switch loaded modulefile1 with modulefile2.
» module display|show modulefile [modulefile...]
  Display information about the modulefile.
» module list
  List loaded modules.
» module avail [path...]
  List all available modulefiles in the current MODULEPATH.
» module purge
  Unload all loaded modulefiles.
Further commands add directories to MODULEPATH and add or remove modulefiles to or from the shell-dependent startup files.

10 Modules (2/2)

Modulefile            Description
dot                   adds the current directory to your environment variable PATH
intel-compilers/7.1   loads Intel Fortran and C/C++ compiler in version 7.1
intel-compilers       loads Intel Fortran and C/C++ compiler in version 8.1
nag-compilers         loads NAG Fortran90/95 compiler in version 4.2
hp-mpi                loads HP MPI in version 2.0
intel-debuggers       loads Intel debugger in version 8.0
ddt                   loads graphical Streamline debugger in version 1.8
mkl                   loads Intel MKL library in version 7.2
mlib/7.1              loads HP MLIB library in version 7.1
mlib                  loads HP MLIB library in version 8.0
naglib                loads NAG Fortran library in version 8.0

11 Modulefiles – containing modifications to the environment
» A modulefile is a file containing Tcl code plus extensions for the Modules package.
» A modulefile contains the changes to a user's environment needed to access an application.
» Modulefiles can also be used to implement site policies regarding the access to and use of applications.
» Modulefiles also hide the notion of different types of shells. From the modulefile writer's perspective, this means one set of information takes care of every type of shell.
» Change the default module environment by inserting module add <modulefile> in the setup file .bash_profile.
» Add your own modulefiles by extending the $MODULEPATH environment variable.
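The last two points can be sketched as a fragment for .bash_profile. This is only an illustration under assumptions: the directory $HOME/modulefiles is hypothetical, and intel-compilers/7.1 is simply one of the modulefiles listed on the Modules (2/2) slide. The guard around module add is there so that the fragment does no harm in a shell where the Modules package is not initialized.

```shell
# ~/.bash_profile fragment (sketch, not an official site configuration)

# Load a non-default compiler at login, but only if the module command exists.
if command -v module >/dev/null 2>&1; then
    module add intel-compilers/7.1
fi

# Make private modulefiles visible by extending MODULEPATH.
# $HOME/modulefiles is a hypothetical directory chosen for this example.
MODULEPATH="$HOME/modulefiles${MODULEPATH:+:$MODULEPATH}"
export MODULEPATH
```

Prepending the private directory means it is searched before the system-wide directories; appending it instead would be an equally valid choice.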

12 Compilers (1/4)
» Fortran: 2 Intel compilers (ifort in V8.1 and efc in V7.1), NAG compiler (f95), GNU compiler (g77, Fortran77 only)
» C/C++: 2 Intel compilers (icc in V8.1 and ecc in V7.1), GNU compiler (gcc)
» General options: -c, -I, -g, -O{0,1,2,3}, -L, -l, -o
» NAG Fortran compiler: best choice to check the Fortran90/95 conformity of your program
» Important specific options of the NAG Fortran compiler
  – -Ounsafe performs possibly unsafe optimizations
  – -dusty allows the compilation of legacy software (errors are downgraded to warnings)
  – -ieee=full|nonstd|stop enables|disables all IEEE and deallocation facilities
  – -C compiles code with all possible runtime checks
  – -mtrace traces memory allocation and deallocation
  – -gline compiles code to generate a traceback in case of runtime errors
  – -gc enables automatic garbage collection of the executable
  – -thread_safe compiles code for safe execution in a multi-threaded environment
  – -static prevents linking with shared libraries

13 Compilers (2/4)
» Intel Fortran suffix names

Command   File name suffix          Source format      Language level
ifort     .f .ftn .for .i           -fixed -72         -nostand
ifort     .F .FTN .FOR .fpp .FPP    -fixed -72 -fpp    -nostand
ifort     .f90 .i90                 -free              -nostand
ifort     .F90                      -free -fpp         -nostand

» NAG Fortran suffix names

Command   File name suffix   Source format   Language level
f95       .f .ftn .for       -fixed          Fortran95 Standard; -strict95 for strict Fortran95 code; -dusty for legacy code.
f95       .F                 -fixed -fpp
f95       .f90 .f95          -free
f95       .F90 .F95          -free -fpp

14 Compilers (3/4)
» Change the compiler with a simple module command (by default the Intel compiler in version 8.1 is used):
  module add|load intel-compilers/7.1
» Using different compilers
  – don't use explicit compiler names
  – use the $FC environment variable for the Fortran compiler name
  – use the $CC environment variable for the C/C++ compiler name

15 Compilers (4/4)
» Compiling Fortran90/95 source code with Intel compiler
  ifort -c -O3 my_prog.f90
» Compiling Fortran90/95 source code with an arbitrary Fortran compiler
  $FC -c -O3 my_prog.f90
» Compiling C source code with Intel compiler
  icc -c -O3 my_prog.c
» Compiling C++ source code with Intel compiler
  $CC -c -O3 my_prog.C
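The advice to use $FC instead of an explicit compiler name can be sketched as a small helper that builds the compile command. Several things here are assumptions made for illustration: the fallback to ifort when $FC is unset, the helper name compile_cmd, and the source files sub1.f90 and sub2.f90. The helper only prints the command, so the sketch can be tried on a machine without the compilers installed.

```shell
# Fall back to ifort when no compiler module has set $FC
# (an assumption for this sketch; on the cluster the module sets it).
FC="${FC:-ifort}"
FFLAGS="-c -O3"

# Print the compile command for one source file instead of running it.
compile_cmd() {
    echo "$FC $FFLAGS $1"
}

# Hypothetical source files; only my_prog.f90 appears on the slide.
for src in my_prog.f90 sub1.f90 sub2.f90; do
    compile_cmd "$src"
done
```

Because the compiler name is read from $FC each time the helper is called, switching compilers with a module command changes every compile line without editing the script.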

16 Linking
» Special compiler scripts to (compile and) link MPI programs (the scripts don't work together with the GNU compilers)
  mpicc – (compile and) link C programs
  mpicc.mpich – (compile and) link C programs in MPICH compatibility mode
  mpiCC – (compile and) link C++ programs
  mpiCC.mpich – (compile and) link C++ programs in MPICH compatibility mode
  mpif77 or mpif90 – (compile and) link Fortran programs
  If MPICH compatibility mode is required, call mpif77.mpich or mpif90.mpich
» Example for Fortran90/95 object code with Intel compiler
  mpif90 -o my_prog my_prog.o sub1.o sub2.o

17 Benchmarks
Measurements of Itanium2 (1.5 GHz) on HP XC6000 Cluster (each cell: Min / Max)

Vector length   Addition      Mult.         Division    Linked Triad   Vector Triad   Dot Product
1               51 / 52       51 / 52       26          99 / 100       106 / 107      106 / 107
10              177 / 182     177 / 178     131         284 / 366      351 / 352      647 / 650
100             734 / 742     734 / 739     254 / 256   1031 / 1452    1061 / 1237    1316 / 1325
10^3            1444 / 1458   1118 / 1454   280 / 288   2906 / 2928    2032 / 2044    1470 / 1482
10^4            1201 / 1486   1119 / 1420   281 / 285   2396 / 2891    1514 / 1750    1490 / 1496
10^5            1030 / 1046   1028 / 1042   286 / 288   2135 / 2147    1610 / 1629    1409 / 1413
10^6            156 / 161     150 / 160     145 / 154   299 / 318      252 / 262      766 / 777
10^7            163 / 168     164 / 166     157 / 159   329 / 333      271 / 274      766 / 776
Peak            3000          3000          -----       6000           6000           6000
Max. L2-c.      2000          2000          -----       4000           3000           6000
Max. L3-c.      2000          2000          -----       4000           3000           6000
Max. mem        267           267           -----       534            400            800
η_a,L2          0.49          0.48          -----       0.49           0.34           0.25
η_a,L3          0.35          0.35          -----       0.36           0.27           0.24
η_a,mem         0.06          0.06          -----       0.06           0.05           0.13

What is remarkable?
» The dot product runs very slowly!
» The scatter of the performance rates is very high when the data are stored in the L2 cache (up to 40 percent!).
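The η rows appear consistent with dividing a measured maximum rate by the peak rate of the same column. That relation is inferred from the printed numbers and is not stated on the slide, so the following is only a sketch under that assumption; the function name efficiency is made up for this example.

```shell
# efficiency RATE PEAK
# Prints RATE / PEAK with two decimals (assumed definition of the eta rows).
efficiency() {
    LC_ALL=C awk -v rate="$1" -v peak="$2" 'BEGIN { printf "%.2f\n", rate / peak }'
}

efficiency 2044 6000   # vector triad, best rate while data fit into the L2 cache
efficiency 1413 6000   # dot product, vector length 10^5
efficiency 777 6000    # dot product, vector length 10^6
```

Under this reading, the three calls reproduce the table entries 0.34, 0.24 and 0.13 for the vector triad and the dot product.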

18 Benchmarks – Ping Pong within a node

Neighbor send/receive speed test
---------------------------------
--- Multiple simple Ping/Pong ---
---------------------------------
Clock overhead is 0.1736E-07 secs per snd/rcv.

    bytes       ms       MB/s
        0    0.001      0.000
        4    0.001      4.590
        8    0.001      7.875
       16    0.001     15.526
       32    0.001     34.528
       64    0.001     73.807
      128    0.001    127.790
      256    0.001    209.114
      512    0.001    436.936
     1024    0.002    674.397
     2048    0.007    308.211
     4096    0.007    550.674
     8192    0.010    834.013
    16384    0.014   1181.921
    32768    0.022   1507.639
    65536    0.036   1835.203
   131072    0.071   1854.967
   262144    0.126   2074.492
   524288    0.254   2060.727
  1048576    0.502   2089.745

Neighbor send/receive speed test
---------------------------------
--- Multiple double Ping/Pong ---
---------------------------------
Clock overhead is 0.2670E-08 secs per snd/rcv.

    bytes       ms       MB/s
        0    0.003      0.000
        4    0.004      1.131
        8    0.003      2.381
       16    0.003      4.744
       32    0.004      8.936
       64    0.003     19.438
      128    0.003     37.425
      256    0.004     65.514
      512    0.004    134.188
     1024    0.004    253.168
     2048    0.006    343.425
     4096    0.008    541.139
     8192    0.011    729.931
    16384    0.018    914.383
    32768    0.033   1002.130
    65536    0.064   1021.981
   131072    0.124   1055.018
   262144    0.233   1127.460
   524288    0.486   1078.485
  1048576    0.911   1151.049

19 Benchmarks – Ping Pong between nodes

Neighbor send/receive speed test
---------------------------------
--- Multiple simple Ping/Pong ---
---------------------------------
Clock overhead is 0.1736E-07 secs per snd/rcv.

    bytes       ms       MB/s
        0    0.003      0.000
        4    0.003      1.441
        8    0.003      2.905
       16    0.003      5.828
       32    0.003     11.605
       64    0.004     16.514
      128    0.004     30.021
      256    0.006     45.949
      512    0.006     87.778
     1024    0.006    161.227
     2048    0.008    271.353
     4096    0.010    408.196
     8192    0.015    546.295
    16384    0.025    659.058
    32768    0.045    735.468
    65536    0.084    781.339
   131072    0.164    797.490
   262144    0.320    818.153
   524288    0.660    794.346
  1048576    1.266    828.447

Neighbor send/receive speed test
---------------------------------
--- Multiple double Ping/Pong ---
---------------------------------
Clock overhead is 0.2666E-08 secs per snd/rcv.

    bytes       ms       MB/s
        0    0.009      0.000
        4    0.009      0.443
        8    0.009      0.899
       16    0.009      1.739
       32    0.009      3.497
       64    0.010      6.495
      128    0.010     12.508
      256    0.012     22.125
      512    0.012     43.344
     1024    0.013     80.759
     2048    0.016    129.800
     4096    0.021    197.767
     8192    0.031    267.897
    16384    0.050    329.511
    32768    0.089    367.656
    65536    0.172    381.144
   131072    0.334    392.161
   262144    0.673    389.346
   524288    1.313    399.440
  1048576    2.816    372.366
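The MB/s column in the ping-pong listings appears to be the message size divided by the time, with 1 MB taken as 10^6 bytes. This is inferred from the printed values rather than stated by the benchmark, so the helper below (with the made-up name bandwidth_mbs) is a sketch under that assumption.

```shell
# bandwidth_mbs BYTES MILLISECONDS
# Prints the transfer rate in MB/s, assuming 1 MB = 10^6 bytes.
bandwidth_mbs() {
    LC_ALL=C awk -v b="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", b / (ms * 1000) }'
}

bandwidth_mbs 1048576 0.502   # simple ping-pong within a node
bandwidth_mbs 1048576 1.266   # simple ping-pong between nodes
```

The results differ slightly from the printed MB/s values because the ms column is rounded to three decimals. For 1 MiB messages the simple ping-pong within a node is roughly 2.5 times as fast as between nodes.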

20 Benchmarks – Overlap for short messages between nodes

Neighbor send/receive overlap test
----------------------------------
------ Short messages ---------
----------------------------------

The used message length during computation is... 10
the used vectorlength during computation is... 10
all times in seconds, >>ol_fac in percent!!!

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1        103548      12267030    1.03    1.03   1.90  0.16    15.6
      3        103548      12267030    1.02    3.08   3.94  0.17    16.6

The used message length during computation is... 100
the used vectorlength during computation is... 100
all times in seconds, >>ol_fac in percent!!!

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1         69722       5725738    1.03    1.03   1.72  0.34    32.9
      3         69722       5725738    1.03    3.09   3.78  0.35    33.8

The used message length during computation is... 1000
the used vectorlength during computation is... 1000
all times in seconds, >>ol_fac in percent!!!

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1         28496        979641    1.03    1.03   1.42  0.64    62.1
      3         28496        979641    1.03    3.09   3.49  0.63    61.4

21 Benchmarks – Overlap for long messages between nodes

Neighbor send/receive overlap test
----------------------------------
------ Long messages ----------
----------------------------------

The used message length during computation is... 10000
the used vectorlength during computation is... 10000
all times in seconds, >>ol_fac in percent!!!

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1          4670         68699    1.03    1.03   1.13  0.92    89.7
      3          4670         68699    1.03    3.00   3.23  0.80    78.0

The used message length during computation is... 100000
the used vectorlength during computation is... 100000
all times in seconds, >>ol_fac in percent!!!

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1           503          6101    1.06    1.05   1.13  0.98    92.2
      3           503          6101    1.06    3.13   3.19  1.00    94.0

The used message length during computation is... 1000000
the used vectorlength during computation is... 1000000
all times in seconds, >>ol_fac in percent!!!

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
      1            49           101    1.05    1.06   1.30  0.82    78.0
      3            49           101    1.05    3.18   3.35  0.88    83.7
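The columns of the overlap listings appear consistent with T_ol = T_comm + T_comp - T_all and ol_fac = 100 * T_ol / T_comm, that is, the share of the communication time that is hidden behind computation. Both relations are inferred from the printed numbers and are not documented in the slides, so the following is a sketch under that assumption; the function name overlap is made up for this example.

```shell
# overlap T_COMM T_COMP T_ALL
# Prints the overlapped time and the overlap factor in percent,
# using the assumed relations described above.
overlap() {
    LC_ALL=C awk -v tcomm="$1" -v tcomp="$2" -v tall="$3" 'BEGIN {
        tol = tcomm + tcomp - tall
        printf "%.2f %.1f\n", tol, 100 * tol / tcomm
    }'
}

overlap 1.03 1.03 1.42   # message length 1000, Bal_fac 1
```

For other rows the result can deviate from the printed T_ol and ol_fac in the last digit, because the benchmark presumably computed them from unrounded times.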

22 Debugging with DDT
» Commands
  module add ddt
  ddt hello

23 HP MPI – Execution of Parallel Programs
» The syntax to start a parallel application interactively is
  mpirun [mpirun_options] <program>
  or
  mpirun [mpirun_options] -f <appfile>

mpirun Options          Brief Explanation
-n # or -np #           MPI job is run on # processors (option is ignored in batch mode)
-m block or -m cycle    MPI processes will be mapped blockwise or cyclically to the processors
-T                      prints user and system time for each MPI rank
-1sided                 enables one-sided communication
-i                      enables runtime instrumentation profiling for all processes
-stdio=                 specifies standard IO options (refer to HP MPI User Guide)
-mpich                  runs the application in MPICH compatibility mode
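The difference between blockwise and cyclic mapping can be illustrated with a little integer arithmetic. This is a conceptual sketch only, not HP MPI's actual placement algorithm: it assumes that "block" fills all CPUs of one node before moving to the next and that "cycle" hands out ranks to the nodes round-robin. The function name rank_to_node and the example of 2 nodes with 2 CPUs each are assumptions for illustration.

```shell
# rank_to_node MODE RANK NNODES CPUS_PER_NODE
# Prints the index of the node that would host the given MPI rank
# under the simplified mapping rules described above.
rank_to_node() {
    mode=$1; rank=$2; nnodes=$3; cpus=$4
    case "$mode" in
        block) echo $(( rank / cpus )) ;;     # fill one node, then the next
        cycle) echo $(( rank % nnodes )) ;;   # round-robin over the nodes
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# 4 ranks on 2 nodes with 2 CPUs each (hypothetical configuration)
for r in 0 1 2 3; do
    echo "rank $r: block -> node $(rank_to_node block "$r" 2 2), cycle -> node $(rank_to_node cycle "$r" 2 2)"
done
```

In this model, neighboring ranks share a node under block mapping, while cyclic mapping places them on different nodes. Which one is preferable depends on whether the application communicates mostly between neighboring ranks.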

24 HP MPI – Environment Variables
» Many environment variables

HP MPI Env. Variables   Brief Explanation
MPI_FLAGS               modifies the general behaviour of MPI
                        – l reports memory leaks caused by not freeing memory
                        – f forces MPI errors to be fatal, ignoring the programmer's choice of error handlers
                        – z enables zero-buffering mode (MPI_SEND and MPI_RSEND are converted to MPI_SSEND)
MPI_INSTR               enables counter instrumentation for profiling HP MPI applications
MPIRUN_OPTIONS          sets mpirun options
...                     ...

25 Numerical Libraries
» HP XC Mathematical LIBrary (MLIB)
» Intel Mathematical Kernel Library (MKL)
» NAG Libraries (non-commercial users)
» LINear SOLver package (LINSOL)

26 Well Established Open Source Libraries
» BLAS
  – BLAS{1,2,3} included in HP XC MLIB and Intel MKL
» LAPACK
  – included in HP XC MLIB and Intel MKL; contains many functions for the solution of linear systems and eigenvalue problems for dense and banded matrices
» ScaLAPACK
  – included in HP XC MLIB; contains the above-mentioned functions for parallel computers
» Metis
  – included in HP XC MLIB; contains a special implementation of the graph partitioning and matrix reordering library

27 HP XC MLIB (1/2)
» Functions from several areas: linear equations, least squares, eigenvalue problems, singular value decomposition, vector and matrix computations, convolutions and Fourier transforms
» Four components: VECLIB, LAPACK, ScaLAPACK and SuperLU_DIST
» VECLIB includes all BLAS{1,2,3} and sparse BLAS subroutines, sparse linear equation solvers, sparse eigenvalue and eigenvector solvers, FFTs, correlation and convolution subprograms, random number generators and METIS V4.0.1
» Load before use
  module add hp-mlib/7.1 (for Intel compiler V7.1)
  module add hp-mlib (for Intel compiler V8.1)

28 HP XC MLIB (2/2)
» Appropriate options at link time:
  – VECLIB
    $FC -L$MLIBPATH -lveclib -openmp -o myprog myprog.f90
  – LAPACK
    $FC -L$MLIBPATH -llapack -openmp -o myprog myprog.f90
  – ScaLAPACK
    mpif90 -L$MLIBPATH -lscalapack -openmp -o myprog myprog.f90
  – SuperLU_DIST
    mpif90 -L$MLIBPATH -lsuperlu_dist -openmp -o myprog myprog.f90
» More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-mlib

29 Intel MKL (1/2)
» Many components:
  – BLAS,
  – Sparse BLAS,
  – LAPACK,
  – direct sparse solver PARDISO,
  – Vector Mathematical Library (VML) for core mathematical functions on vector arguments,
  – Vector Statistical Library (VSL) for generating vectors of pseudorandom numbers,
  – general Discrete Fourier Transform functions (DFT) and
  – a subset of FFTs
» Load before use
  module add mkl

30 Intel MKL (2/2)
» Appropriate options at link time:
  – BLAS, FFT, VML, VSL etc.
    $FC -L$MKLPATH -lmkl_ipf -lguide -lpthread -o myprog myprog.f90
  – LAPACK
    $FC -L$MKLPATH -lmkl_lapack -lmkl_ipf -lguide -lpthread -o myprog myprog.f90
  – PARDISO
    mpif90 -L$MKLPATH -lmkl_solver -lmkl_ipf -lguide -lpthread -o myprog myprog.f90
» More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-mkl

31 NAG Libraries
» NAG Fortran, NAG Fortran90 and NAG C libraries only for non-commercial customers
» Load before use
  module add naglib/7.1
  module add mkl/7.1
  for Intel compiler V7.1 and
  module add naglib
  module add mkl
  for Intel compiler V8.1
» Appropriate options at compile and link time:
  – NAG Fortran Library
    $FC myprog.f90 -I$NAGLIBPATH/interface_blocks -L$NAGLIBPATH \
        -lnag-mkl -L$MKLPATH -lmkl_lapack -lmkl_ipf -lguide -lpthread
  – NAG Fortran90 Library
    $FC myprog.f90 -I$NAGLIBPATH/nag_mod_dir -L$NAGLIBPATH \
        -lnagfl90-noblas -L$MKLPATH -lmkl_lapack -lmkl_ipf -lguide -lpthread
  – NAG C Library
    $CC myprog.c -I$NAGLIBPATH/include -L$NAGLIBPATH/nagc
» More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-nag

32 LINSOL
» LINSOL is a program package to solve large sparse linear systems
  – many iterative solvers
  – several polyalgorithms
  – (I)LU direct solvers as preconditioners
  – optimized for workstations (cache reuse), vector computers and parallel computers (MPI)
  – supporting 7 different storage patterns for sparse matrices (automatic optimization to the architecture of the computer)
» Load before use
  module add linsol
» Appropriate options at compile and link time:
  – running an MPI job
    mpif90 -L$LINSOLPATH -llinsol -lMPI myprog.o
  – running a serial job
    $FC -L$LINSOLPATH -llinsol -lnocomm myprog.o
» More details: http://www.rz.uni-karlsruhe.de/produkte/linsol

