Research Computing with Newton Gerald Ragghianti Newton HPC workshop Sept. 3, 2010.


1 Research Computing with Newton. Gerald Ragghianti, Newton HPC workshop, Sept. 3, 2010

2 What is the Newton Program?
Research computing support: infrastructure management, consultation, training
Research objectives: effectiveness, efficiency, capability
User applications
Computational environment (OS, cluster management, software)
Computing hardware
Computing infrastructure (space, network, power, cooling)
Community organization (policies, membership)

3 The Newton cluster
“Normal” Linux compute cluster
295 computers
2500 processors
5 TB RAM
40 Gbit/sec Infiniband
80 TB storage
[Diagram: head node, interactive nodes, compute nodes, storage servers, and Lustre storage, connected by an Infiniband network and an Ethernet network, with a link to the external network]

4 Newton cluster machines
[Rack layout diagram covering Rack 1 to Rack 8. The slot-by-slot positions did not survive extraction; the equipment labeled in the diagram is listed below by model.]
Dell R410: tao000 to tao119
Dell 1950: zeta00 to zeta31, gamma00 to gamma31
Sun X2200M2: alpha00, alpha02 to alpha11, lustre0 to lustre4, head
Dell R900: epsilon0, epsilon1
Dell 1850: isaac, admin, login0, login1
Dell R510: nfs-mrail0, lustre-mds, lustre-oss-0 to lustre-oss-3
SunFire X4500: thumper
SunFire X4540: thumper-spanier
C6100 chassis (multiple)
EMC CX300 SAN (multiple)
Infiniband switches: Qlogic IB 122000, Qlogic IB 123000, Cisco Infiniband
Ethernet switches: Dell 6248 Ethernet, PC 6248 switch, PC 3548 switch
Other: KVM, console, PDU
Legend: server, storage server, compute node, login, compute node, Infiniband switch, Ethernet switch, management, power distribution, empty

5 Getting started
SSH to login.newton.utk.edu using NetID
Transfer files with scp, sftp, or FileZilla
Display graphics with X11, xorg, or Xming (requires X11 “tunneling” through the SSH client)

    $ ssh gragghia@login.newton.utk.edu
    Password: ***************
    [gragghia@newton1 ~]$ ls
    Test.sge filename.txt
    [gragghia@newton1 ~]$ w
     10:36:49 up 32 days, 15:07, 20 users, load average: 1.98, 1.81, 1.88
    USER     TTY    FROM              LOGIN@   IDLE    JCPU   PCPU  WHAT
    gragghia pts/0  poltth            Tue05    1:05    1.39s  1.39s -bash
    mkzadd   pts/1  bkg.engr.utk.edu  Thu18    15:16m  0.06s  0.06s -bash
    Krrrccc  pts/2  ares.bio.utk.edu  03Aug10  3days   0.03s  0.03s -bash
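The login and file-transfer steps on this slide can be sketched as the commands below. This is only an illustration: `your_netid` and `input.dat` are placeholders, and the `show` helper is a hypothetical wrapper that prints each command instead of running it, so the sketch is safe to try on any machine. The `-X` option asks the SSH client to forward X11, which is the "tunneling" the slide refers to.

```shell
#!/bin/sh
# Placeholder values: replace with your own NetID and file names.
NETID="your_netid"
HOST="login.newton.utk.edu"

# Hypothetical helper: print a command line instead of executing it.
show() { printf '%s\n' "$*"; }

# Log in with X11 forwarding so graphical programs display locally.
show ssh -X "$NETID@$HOST"

# Copy a local file into the remote home directory.
show scp input.dat "$NETID@$HOST:~/"

# Copy a remote file back into the current local directory.
show scp "$NETID@$HOST:~/Test.sge" .
```

Dropping the word `show` from a line turns it into the real command.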

6 Environment management
Modules utility
Manages environment variables and aliases
User chooses applications and libraries to use
Allows multiple versions to be available
Example use:
See available modules: “module avail”
Load a module: “module add R”
Unload a module: “module unload R”
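Conceptually, loading a module mostly comes down to editing environment variables such as PATH for the current shell. The sketch below imitates that by hand. The functions `load_app` and `unload_app` and the prefix `/opt/R-demo` are made up for illustration and are not Newton paths or part of the Modules utility, which also handles aliases and other variables.

```shell
#!/bin/sh
# Hypothetical stand-in for "module add": put an application's bin
# directory at the front of PATH and remember the previous value.
load_app() {
    OLD_PATH="$PATH"
    PATH="$1/bin:$PATH"
    export PATH
}

# Hypothetical stand-in for "module unload": restore the saved PATH.
unload_app() {
    PATH="$OLD_PATH"
    export PATH
}

load_app /opt/R-demo          # made-up install prefix
echo "after load:   $PATH"
unload_app
echo "after unload: $PATH"
```

Because the change lives only in environment variables, several versions of the same application can be installed side by side and selected per shell session.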

7 Resource Management: The Grid Engine
1. Accepts job requests: executable to run, execution time, parallelization, RAM needed
2. Finds available resources (compute nodes)
3. Reserves and uses resources
4. Returns output

8 A simple job
1. Create a job request file (job.sge):
    #$ -q short*
    #$ -cwd
    #$ -N Test
    uname -a
    sleep 30
2. Submit job: $ qsub job.sge
3. Monitor job: $ qstat -g t
4. View result log files
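As a sketch of the four steps, the script below writes the job request file with comments on each directive and submits it only when `qsub` is actually present. The scratch directory is an addition for safety, not something the slides prescribe. By Grid Engine's default naming, output lands in files called `Test.o<jobid>` and `Test.e<jobid>`.

```shell
#!/bin/sh
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write the job request file. The quoted 'EOF' stops the shell from
# expanding anything inside the here-document.
cat > job.sge <<'EOF'
# Lines starting with "#$" are read by Grid Engine as qsub options.
# -q short*  : use any queue whose name starts with "short"
# -cwd       : run the job in the directory it was submitted from
# -N Test    : name the job "Test"
#$ -q short*
#$ -cwd
#$ -N Test
uname -a
sleep 30
EOF

# Submit and monitor only where Grid Engine is installed.
if command -v qsub >/dev/null 2>&1; then
    qsub job.sge
    qstat -g t
else
    echo "qsub not found: run this on a cluster login node" >&2
fi
```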

9 More Sophistication: Array jobs
Run the same job multiple times
1. Create data files (optional): $ ~gragghia/workshop/make_datafiles.sh
2. Create a job request file with “-t” option:
    #$ -q short*
    #$ -cwd
    #$ -N Array
    #$ -t 1-10
    md5sum data-$SGE_TASK_ID.dat
3. Submit job: $ qsub job.sge
4. Monitor job: $ qstat -g t
5. View result log files
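To see what each array task does, the task body can be tried locally before submitting. In the sketch below a loop plays the role of Grid Engine by setting `SGE_TASK_ID` for each task. The data files are stand-ins, since the contents of make_datafiles.sh are not shown in the slides, and the `Array.out.N` file names are made up for this local run.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for make_datafiles.sh: create data-1.dat ... data-3.dat.
for i in 1 2 3; do
    echo "sample data $i" > "data-$i.dat"
done

# On the cluster, "-t 1-10" starts one task per index and Grid Engine
# sets SGE_TASK_ID itself. Here a loop sets it so the task body can be
# checked without a cluster.
for SGE_TASK_ID in 1 2 3; do
    export SGE_TASK_ID
    md5sum "data-$SGE_TASK_ID.dat" > "Array.out.$SGE_TASK_ID"
done

cat Array.out.*
```

Each task sees a different `SGE_TASK_ID`, so one job file can process a whole series of input files.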

10 A parallel job: MPI
1. Download the software: $ wget http://newton.utk.edu/workshop/hello.tar
2. Extract the software: $ tar -vxf hello.tar
3. Select MPI version: $ module add openmpi/1.4.2/intel
4. Compile the application: $ cd hello $ make
5. Create a batch submit file:
    #$ -N Hello
    #$ -q short*
    #$ -cwd -V
    #$ -pe openmpi* 16
    mpirun hello
    sleep 30
6. Submit the job
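When the same MPI program is run at several sizes, the submit file can be generated rather than edited by hand. The helper `make_mpi_job` below is a hypothetical convenience, not part of Grid Engine or the workshop material. It prints a file with the same layout as the slide, with the job name and slot count as parameters. The `-V` option exports the submitting shell's environment to the job, which is how the settings from `module add` reach `mpirun`.

```shell
#!/bin/sh
# Hypothetical helper: print an MPI job request file.
#   $1 = job name, $2 = number of slots to request
make_mpi_job() {
    name=$1
    slots=$2
    # "\$" keeps a literal dollar sign in the unquoted here-document.
    cat <<EOF
#\$ -N $name
#\$ -q short*
#\$ -cwd -V
#\$ -pe openmpi* $slots
mpirun hello
sleep 30
EOF
}

# Reproduce the 16-slot file from the slide.
make_mpi_job Hello 16
```

The output can be redirected to a file and passed to `qsub`, for example `make_mpi_job Hello 16 > hello.sge`.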

11 Compiling and Installing Software
Example: Fractal generator
1. Find the software
2. Transfer to Newton
   Direct: wget http://newton.utk.edu/workshop/gmandel.tgz
   Indirect: download to workstation and scp (sftp)
3. Extract the source code
   Uncompressed: tar
   Compressed: gunzip or unzip
4. Configure the software: $ ./configure --prefix=$HOME/gmandel
5. Compile: $ make
6. Install: $ make install

    $ wget http://newton.utk.edu/workshop/gmandel.tgz
    $ tar -vzxf gmandel.tgz
    $ ./configure --prefix=$HOME/gmandel
    $ make install
    …
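The extraction step can be practiced without network access. The sketch below builds a tiny stand-in archive (`demo.tgz`, a made-up name, not the workshop's gmandel.tgz), lists its contents, and extracts it. The later configure, make, and install commands are shown only as comments because they need the real source tree.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small gzip-compressed tar archive to experiment with.
mkdir demo-src
echo "placeholder source file" > demo-src/README
tar -czf demo.tgz demo-src
rm -r demo-src

# List the contents first (t = list, z = gzip, f = archive file name).
tar -tzf demo.tgz

# Then extract (x = extract, v = print each name as it is unpacked).
tar -xzvf demo.tgz

# A typical source build continues inside the extracted directory:
#   ./configure --prefix="$HOME/gmandel"
#   make
#   make install
```

Listing before extracting shows which directory the archive will create, and therefore where to `cd` before running configure.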

12 Commercial Applications
Matlab
   Graphical (interactive)
   Batch mode (parallel): matlab -r
SAS
SPSS

    $ module load matlab
    $ matlab
    $ matlab -r 'TestFunction'
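Batch-mode Matlab can be combined with the job request format from the earlier slides. The sketch below is an assumption about how that might look, not a documented Newton recipe: the file name `matlab.sge` and job name `MatlabTest` are made up, the `short*` queue is borrowed from the other examples, and the matlab module is assumed to be loaded before submitting so that `-V` carries its settings into the job.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a job request file that runs a Matlab function without a GUI.
cat > matlab.sge <<'EOF'
#$ -q short*
#$ -cwd -V
#$ -N MatlabTest
# -nodisplay and -nosplash suppress the desktop and the splash screen.
# The trailing "exit" makes Matlab quit once TestFunction returns;
# without it the job would sit at the Matlab prompt.
matlab -nodisplay -nosplash -r "TestFunction; exit"
EOF

cat matlab.sge
```

On a login node this would be submitted with `module load matlab` followed by `qsub matlab.sge`.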

13 More Information
Newton Program website: http://newton.utk.edu/
   Program policies
   Documentation
   Meetings / support / consulting schedule
Research Computing Mailing List: USG_HPCC@listserv.utk.edu
Visit http://oit.utk.edu/workshops/eval/ Section ID: Newton_Cluster-5

