Introduction to Scientific Computing on BU’s Linux Cluster Doug Sondak Linux Clusters and Tiled Display Walls Boston University July 30 – August 1, 2002.


1 Introduction to Scientific Computing on BU’s Linux Cluster Doug Sondak Linux Clusters and Tiled Display Walls Boston University July 30 – August 1, 2002

2 Outline
hardware
parallelization
compilers
batch system
profilers

3 Hardware

4 BU’s Cluster
52 2-processor nodes
specifications
–2 Pentium III processors per node
–1 GHz
–1 GB memory per node
–32 KB L1 cache per CPU
–256 KB L2 cache per CPU

5 BU’s Cluster (2)
Myrinet 2000 interconnects
–sustained 1.96 Gb/s
Linux

6 Some Timings
CFD code, MPI, 4 procs.
Machine                      Sec.
Origin2000                   495
SP                           329
Cluster, 2 procs. per box    174
Cluster, 1 proc. per box     153
Regatta                       78
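To compare the machines at a glance, the timings above can be normalized to the faster cluster configuration (1 proc. per box, 153 s). This is a minimal sketch; the `rel` helper and the labels are hypothetical and not part of the original slides.

```shell
# Reference time in seconds: cluster, 1 proc. per box (from the table above)
REF=153

# rel TIME: print TIME relative to the reference, to two decimals
rel() {
  awk -v t="$1" -v ref="$REF" 'BEGIN { printf "%.2f\n", t / ref }'
}

# Label:seconds pairs taken from the timing table
for entry in "Origin2000:495" "SP:329" "Cluster-2-per-box:174" "Regatta:78"; do
  name=${entry%%:*}
  secs=${entry##*:}
  echo "$name $(rel "$secs")"
done
```

Values above 1 are slower than the reference. In this test, running 2 procs. per box costs roughly 14% compared with 1 proc. per box.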

7 Parallelization

8 Parallelization
MPI is the recommended method
–PVM may also be used
some MPI tutorials
–Boston University http://scv.bu.edu/Tutorials/MPI/
–NCSA http://pacont.ncsa.uiuc.edu:8900/public/MPI/

9 Parallelization (2)
OpenMP is available for SMP within a node
mixed MPI/OpenMP not presently available
–we’re working on it!

10 Compilers

11 Compilers
Portland Group
–pgf77
–pgf90
–pgcc
–pgCC

12 Compilers (2)
gnu
–g77
–gcc
–g++

13 Compilers (3)
Intel
–Fortran ifc
–C/C++ icc

14 Compilers (2)
Polyhedron F77 Benchmarks http://www.polyhedron.com/
              PG     gnu   Intel
AC          8.66   12.38    6.13
ADI         8.48    9.27    6.83
AIR        16.41   15.65   13.45
CHESS      11.67   10.06   10.16
DODUC      21.35   36.23   18.18
LP8         4.31    7.88    4.16
MDB         3.62    3.81    2.94
MOLENR     11.66   12.72    7.61
PI         24.58   41.95    7.08
PNPOLY      3.81    5.24    4.86
RO         10.75   10.31    3.92
TFFT       18.84   20.24   20.18

15 Compilers (3)
Portland Group
–pgf77 generally faster than g77
Intel
–ifc generally faster than pgf77

16 Compilers (4)
Linux C/C++ compilers
–gcc/g++ seems to be the standard, usually described as a good compiler

17 Portland Group
-O2
–highest level of optimization
-fast
–same as -O2 -Munroll -Mnoframe
-Minline
–function inlining

18 Portland Group (2)
-Mbyteswapio
–swaps between big endian and little endian
–useful for using files created on our SP, Regatta, or Origin2000
-Ktrap=fp
–trap floating point invalid operation, divide by zero, or overflow
–slows code down, only use for debugging

19 Portland Group (3)
-Mbounds
–array bounds checking
–slows code down, only use for debugging
-mp
–process OpenMP directives
-Mconcur
–automatic SMP parallelization
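One way to keep the debugging and production flags from the Portland Group slides apart is to hold them in two shell variables. This is only a sketch: the source file name `mycode.f90` and the variable names are hypothetical, and the compile lines are printed rather than executed so the snippet runs on a machine without the Portland Group compilers.

```shell
# Debug build: array bounds checking and floating-point traps (slow, debugging only)
PG_DEBUG_FLAGS="-Mbounds -Ktrap=fp"
# Production build: -fast (same as -O2 -Munroll -Mnoframe) plus function inlining
PG_OPT_FLAGS="-fast -Minline"

# Hypothetical source file name
SRC=mycode.f90

# Print the compile lines instead of running them
echo "pgf90 $PG_DEBUG_FLAGS -o mycode_debug $SRC"
echo "pgf90 $PG_OPT_FLAGS -o mycode $SRC"
```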

20 Intel
Need to set some environment variables
–contained in /usr/local/IT/intel6.0/compiler60/ia32/bin/iccvars.csh
–source this file, copy it into your .cshrc file, or source it in .cshrc
–there’s an identical file called ifcvars.csh to avoid (create?) confusion

21 Intel (2)
-O3
–highest level of optimization
-ipo
–interprocedural optimization
-unroll
–loop unrolling

22 Intel (3)
-openmp -fpp
–process OpenMP directives
-parallel
–automatic SMP parallelization
-CB
–array bounds checking

23 Intel (3)
-CU
–check for use of uninitialized variables
Endian conversion by way of environment variables
    setenv F_UFMTENDIAN big
–all reads will be converted from big to little endian, all writes from little to big endian

24 Intel (4)
Can specify units for endian conversion
    setenv F_UFMTENDIAN big:10,20
Can mix endian conversions
    setenv F_UFMTENDIAN "little;big:10,20"
–all units are little endian except for 10 and 20, which will be converted
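The slides use csh syntax (`setenv`). In a bash script, such as the PBS scripts later in this talk, the equivalent is `export`, and the value needs quotes because an unquoted `;` ends the command. The sketch below also shows what the conversion amounts to for a single 4-byte value. `swap_bytes` is a hypothetical helper written for illustration, not something the Intel runtime provides.

```shell
# bash equivalent of the csh setenv line; the quotes protect the semicolon
export F_UFMTENDIAN="little;big:10,20"

# swap_bytes HEX: reverse the byte order of a hex string (2 hex digits per byte)
swap_bytes() {
  local hex=$1 out=""
  while [ -n "$hex" ]; do
    out="${hex:0:2}$out"   # prepend the next byte to the result
    hex=${hex:2}           # drop that byte from the input
  done
  printf '%s\n' "$out"
}

# The integer 1 stored in 4 bytes, big endian (as on the SP, Regatta, or Origin2000)
big=00000001
little=$(swap_bytes "$big")
echo "big endian: $big   little endian: $little"
```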

25 Batch System

26 Batch System
PBS
–different from LSF on the O2k’s, SP’s, and Regattas
there’s only one queue
–dque

27 qsub
job submission done through script
–script details will follow
qsub scriptname
returns job ID
in working directory
–std. out - scriptname.ojobid
–std. err - scriptname.ejobid
    [sondak@hn003 run]$ qsub corrun
    808.hn003.nerf.bu.edu
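Since qsub prints the job ID, a script can capture it and work out where the output will land. This is a minimal sketch using the job ID from the example above. It assumes that the number before the first dot is what gets appended to the script name, which is the usual PBS convention. On the cluster the first assignment would be `JOBID=$(qsub corrun)`.

```shell
# Job ID as returned by qsub in the example above
JOBID="808.hn003.nerf.bu.edu"
SCRIPT=corrun

# Keep only the sequence number before the first dot
SEQ=${JOBID%%.*}

# File names following the scriptname.ojobid / scriptname.ejobid pattern
STDOUT_FILE="$SCRIPT.o$SEQ"
STDERR_FILE="$SCRIPT.e$SEQ"
echo "stdout: $STDOUT_FILE  stderr: $STDERR_FILE"
```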

28 qstat
Check status of all your jobs
qstat lies about run time
–often (always?) zero
    [sondak@hn003 run]$ qstat
    Job id           Name             User             Time Use     S Queue
    ---------------- ---------------- ---------------- ------------ - --------
    808.hn003        corrun           sondak                      0 R dque

29 qstat (2)
S - job status
–Q - queued
–R - running
–E - exiting (finishing up)
qstat -f gives detailed status
    exec_host = nodem019/0+nodem018/0+nodem017/0+nodem016/0
to specify jobid
    qstat jobid
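The exec_host value is a list of node/CPU entries joined with `+`. A short sketch for turning it into a list of node names follows. The string is copied from the example above; on the cluster it would come from the `qstat -f jobid` output instead.

```shell
# exec_host value from the qstat -f example above
EXEC_HOST="nodem019/0+nodem018/0+nodem017/0+nodem016/0"

# Split on "+", strip the "/cpu" suffix, and drop duplicate node names
NODES=$(echo "$EXEC_HOST" | tr '+' '\n' | cut -d/ -f1 | sort -u)
NNODES=$(echo "$NODES" | wc -l | tr -d ' ')

echo "$NNODES nodes:"
echo "$NODES"
```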

30 Other PBS Commands
kill job
    qdel jobid
some less-important PBS commands
–qalter, qhold, qrls, qmsg, qrerun
–man pages are available for all commands

31 PBS Script
For serial runs
    #!/bin/bash
    # Set the default queue
    #PBS -q dque
    # ppn is cpu's per node
    #PBS -l nodes=1:ppn=1,walltime=00:30:00
    cd $PBS_O_WORKDIR
    myrun

32 PBS/MPI
For MPI, set up GM configuration file in PBS script
    test -d ~/.gmpi || mkdir ~/.gmpi
    GMCONF=~/.gmpi/conf.$PBS_JOBID
    /usr/local/xcat/bin/pbsnodefile2gmconf $PBS_NODEFILE > $GMCONF
    cd $PBS_O_WORKDIR
    NP=$(head -1 $GMCONF)
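As a cross-check on NP: standard PBS writes one line to `$PBS_NODEFILE` per allocated processor, so its line count should equal nodes x ppn. The sketch below uses a made-up node file in place of the real one so that it can be run anywhere. The node names and the `NODEFILE` variable are placeholders.

```shell
# Stand-in for $PBS_NODEFILE: 2 nodes with ppn=2, one line per processor
NODEFILE=$(mktemp)
printf '%s\n' nodem019 nodem019 nodem018 nodem018 > "$NODEFILE"

# Line count = number of processors allocated to the job
NP=$(wc -l < "$NODEFILE" | tr -d ' ')
# Distinct names = number of nodes
NNODES=$(sort -u "$NODEFILE" | wc -l | tr -d ' ')

echo "NP=$NP on $NNODES nodes"
rm -f "$NODEFILE"
```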

33 PBS/MPI (2)
To run MPI, end PBS script with (all on one line)
    mpirun.ch_gm --gm-f $GMCONF --gm-recv polling --gm-use-shmem --gm-kill 5 -np $NP PBS_JOBID=$PBS_JOBID myprog

34 PBS/MPI (3)
mpirun.ch_gm
–version of mpirun that uses Myrinet
--gm-f $GMCONF
–access configuration file constructed above
--gm-recv polling
–poll continually to check for completion of sends and receives
–most efficient for dedicated procs.
  That’s us!

35 PBS/MPI (4)
--gm-use-shmem
–enable shared-memory support
–may improve or degrade performance
–try your code with and without it
--gm-kill 5
–if one MPI process aborts, kill others after 5 sec.

36 PBS/MPI (5)
-np $NP
–run on NP procs as computed earlier in script
–equals “nodes x ppn” from PBS -l option
PBS_JOBID=$PBS_JOBID
–seems redundant redundant
–do it anyway
myprog
–run the darn code already!
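Putting the PBS script and PBS/MPI slides together, a complete MPI job script might look like the following. This is a sketch assembled from the fragments in the slides, not a script taken from the talk: the file name `mpijob.pbs` and the `nodes=2:ppn=2` request are assumptions chosen for illustration. The snippet writes the script to a file and syntax-checks it with `bash -n`, so it can be tried on any machine. On the cluster the script would then be submitted with `qsub mpijob.pbs`.

```shell
# Write the job script; the quoted 'EOF' keeps $VARIABLES unexpanded in the file
cat > mpijob.pbs <<'EOF'
#!/bin/bash
# Set the default queue
#PBS -q dque
# 2 nodes x 2 CPUs per node = 4 MPI processes (illustrative values)
#PBS -l nodes=2:ppn=2,walltime=00:30:00

# Build the GM configuration file from the PBS node list
test -d ~/.gmpi || mkdir ~/.gmpi
GMCONF=~/.gmpi/conf.$PBS_JOBID
/usr/local/xcat/bin/pbsnodefile2gmconf $PBS_NODEFILE > $GMCONF

cd $PBS_O_WORKDIR
# Per the slides, the first line of the GM file gives the process count
NP=$(head -1 $GMCONF)

mpirun.ch_gm --gm-f $GMCONF --gm-recv polling --gm-use-shmem --gm-kill 5 -np $NP PBS_JOBID=$PBS_JOBID myprog
EOF

# Parse the script without executing it
bash -n mpijob.pbs && echo "mpijob.pbs: syntax OK"
```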

37 Profiling

38 Portland Group
Compiler flag
–function level
    -Mprof=func
–line level
    -Mprof=lines
  –much larger file
creates pgprof.out file in working directory

39 PG (2)
At unix prompt, type pgprof command
–will pop up window with bar chart of timing results
can take file name argument in case you’ve renamed the pgprof.out file
    pgprof pgprof.lines

40 PG (3)
option to specify source directory
    pgprof -I sourcedir pgprof.lines
–can specify multiple directories with multiple -I flags
also can use GUI menu
–Options > Source Directory...

41 PG (4)

42 PG (5)
Calls - number of times routine was called
Time - time spent in specified routine
Cost - time spent in specified routine plus time spent in called routines

43 PG (6)
Lines profiling
–with optimization, may not be able to identify many (most?) lines in source code
  –reports results for blocks of code, e.g., loops
–without optimization, doesn’t measure what you really want
–initial screen looks like “func” screen
–double-click function/subroutine name to get line-level listing

44 PG (7)

45 Questions/Comments
Feel free to contact us directly with questions about the cluster or parallelization/optimization issues
Doug Sondak sondak@bu.edu
Kadin Tseng kadin@bu.edu

