USING FLUENT FOR HPC IT MANAGEMENT OF FME, 3RD MARCH 2011
Important notes
This is not an introductory FLUENT course but an advanced course for existing Fluent users who want to run their models on the HPC.
Table of contents
    THE HPC FACILITY
    USING PUTTY AND WINSCP TO ACCESS THE SERVER
    SENDING FILES TO THE SERVER
    RUNNING JOBS
    MONITORING JOBS
    COPYING RESULTS BACK FROM THE SERVER
    USING CYGWIN TO ACCESS FLUENT
HPC @ CICT
    Sunfire: 8 CPUs x 6 nodes, Quad-Core AMD Opteron(tm) Processor 2376 HE (2.3 GHz)
    Interconnected using Infiniband and Ethernet
    Each node has 8 GB of memory
    Storage capacity: 100 GB at the moment
QUEUE SYSTEM
    Using Torque: torque-server-2.3.6-1cri.slc4 and torque-mom-2.3.6-1cri.slc4
    Scheduler: maui-server-3.2.6p21-snap.1224706197.2.slc4
    The current MPI installation does not use Infiniband. This will be fixed soon.
Installing Cygwin
Cygwin is open-source software that can be installed on your personal Windows XP or Windows 7 computer. It can be downloaded from www.cygwin.com. To use its X-server functionality, please visit http://x.cygwin.com/. The user guide is at http://x.cygwin.com/docs/ug/cygwin-x-ug.html
Running Cygwin
After installation, two programs are available: Cygwin and Cygwin/X. Run Cygwin/X to start the X server, and leave it running.
Accessing the system
Using putty.exe: http://fkm.utm.my/ftp/pub/Windows/putty/putty.exe
Using winscp.exe: http://fkm.utm.my/ftp/pub/Windows/winscp/winscp.exe
Log in to the system via PuTTY and WinSCP
Server: fkm.utm.my, Port: 2323
or
Server: ce.utmgrid.utm.my, Port: 22
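From a Cygwin terminal, the same login can also be made with a command-line SSH client instead of PuTTY. The sketch below assumes the OpenSSH client is installed in Cygwin (an assumption, it is not mentioned in these slides), and your_username is a placeholder. It only prints the command for each of the two servers listed above, so you can check it before running it yourself.

```shell
#!/bin/sh
# Build the ssh command line for one of the two servers listed above.
# -p selects the port; -X asks ssh to forward X11 so that graphical
# programs on the server can display on the local Cygwin/X server.
login_cmd() {
    # $1 = server, $2 = port, $3 = username (placeholder)
    printf 'ssh -X -p %s %s@%s\n' "$2" "$3" "$1"
}

login_cmd fkm.utm.my 2323 your_username
login_cmd ce.utmgrid.utm.my 22 your_username
```

Copy the printed line for the server you want and replace your_username with your own account name.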
SENDING FILES TO THE SERVER
1. Locate the files to be transferred in the left panel.
2. Create a new directory in the right panel.
3. Select the files in the left panel.
4. Copy them from left to right.
FLUENT
On the HPC, you can run Fluent interactively in the Fluent windowing environment, or submit it to the PBS queue if your job needs more than 30 minutes of CPU time. Interactive sessions are recommended only for pre- and postprocessing and for solving small problems. Larger problems should be run as batch jobs, as either serial or parallel runs of Fluent. The user may check the possible command line options by entering the command.
FLUENT
The Fluent interface can be accessed via Cygwin. Cygwin is open-source software that can provide an X server. The X server is used to display ANSYS Workbench. Run the following command in your PuTTY session:
/opt/exp_soft/share/istas/ansys_inc/v121/Framework/bin/Linux64/runwb2
Fluent
Type ‘pico model_journal’ and enter the following:
file/read-case your_input_file.cas
solve/init/initialize-flow
solve/iterate 400
file/binary-files n
file/confirm-overwrite n
file/write-data your_output_file.dat
exit y
Press Ctrl-O to save, press Enter to confirm, then press Ctrl-X to exit.
Fluent
To prepare the PBS script, type ‘pico pbs-script’ and enter the following:
#!/bin/sh
#PBS -q utm
#PBS -N istas.model3d
#PBS -l nodes=1:ppn=8
#PBS -M email@example.com
#PBS -m abe
nCPU=8
version=3d
journal=model_journal
cd $PBS_O_WORKDIR
/opt/exp_soft/share/istas/ansys_inc/v121/fluent/bin/fluent $version -t$nCPU -g -i $journal -mpi=openmpi -cnf=$PBS_NODEFILE -ssh
Press Ctrl-O to save, press Enter to confirm, then press Ctrl-X to exit.
Fluent
Finally, to submit the job, type ‘qsub pbs-script’.
Type ‘qstat’ to see the status of your job.
An email will be sent to let you know the job has started, and another when the job has ended.
You can then open WinSCP again to copy the output back to your PC.
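The submit-and-check steps above can also be scripted. This is only a sketch: the function names submit_job and wait_for_job are made up for illustration, and it assumes that ‘qstat’ followed by a job identifier exits with a non-zero status once the job is no longer known to the server. Some sites keep finished jobs listed for a while, so treat the result as a hint rather than a guarantee.

```shell
#!/bin/sh
# Submit a PBS script; qsub prints the job identifier on standard output.
submit_job() {
    # $1 = PBS script, e.g. pbs-script
    qsub "$1"
}

# Poll qstat until the job is no longer known to the server.
# Assumption: "qstat <jobid>" fails (non-zero exit status) once the job
# has left the queue.
wait_for_job() {
    # $1 = job identifier, $2 = seconds between checks (default 60)
    while qstat "$1" >/dev/null 2>&1; do
        sleep "${2:-60}"
    done
}

# Typical use on the cluster (not executed here):
#   jobid=$(submit_job pbs-script)
#   wait_for_job "$jobid" && echo "job $jobid has left the queue"
```

Polling once a minute is a deliberately gentle default; the email notifications requested in the PBS script remain the simpler way to learn that a job has ended.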
USEFUL PBS commands
qsub: Once a PBS job script is created, it is submitted to PBS via the qsub command. In its simplest form, qsub takes a single parameter, the name of the script file that you wish to submit.
qstat: The qstat command lets you view the contents of the PBS queue.
node1:~/test> qstat
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
147.node1        testjob          psmith                  0 R default
USEFUL PBS commands (cont.)
qdel: The qdel command takes a single argument, a job number, and aborts execution of that job. For example, ‘qdel 147’ would cancel execution of the job shown in the qstat example above.
qalter: The qalter command is helpful for altering the parameters of a job after it has been submitted. qalter takes two arguments: the PBS directive that you wish to change (like -l), and the job number that you want to change. For example, if you forgot to set the walltime that your job requires, you can change it after submission:
node1:~> qalter -l walltime=4:00:00 147
USEFUL PBS commands (cont.)
pbsnodes: The pbsnodes command, while mainly a PBS administration command, can also be informative to the PBS user. ‘pbsnodes -a’ lists all PBS nodes, their attributes, and job status. This is a useful way to get a list of valid machine properties for use in a #PBS -l directive.
node1:~> pbsnodes -a
node2
     state = free
     np = 2
     properties = gigabit,pcn,m2048,dual,p1800,athlon
     ntype = cluster
Exercise
Download the file http://hpc.fkm.utm.my/fluent.tar.gz and extract it into a folder. The archive contains four files: Makefile, model.cas, model_journal and pbs-script.
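If you prefer the command line, the archive can be extracted with tar, for example from a Cygwin terminal. This is a minimal sketch: the helper name extract_into and the folder name fluent are chosen here for illustration and are not part of the exercise files.

```shell
#!/bin/sh
# Extract a .tar.gz archive into its own folder.
extract_into() {
    # $1 = archive (e.g. fluent.tar.gz), $2 = destination folder
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Typical use after downloading the archive (not executed here):
#   extract_into fluent.tar.gz fluent
#   ls fluent
```

Extracting into a dedicated folder keeps the four exercise files together, which makes the later step of copying the whole directory to the server simpler.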
Exercise
1. Open WinSCP.
2. Click New.
3. Enter the hostname: ce.utmgrid.utm.my
4. Enter your username.
5. Enter your password.
6. Transfer the contents of the Fluent folder to the server.
Exercise Copy the whole directory from left to right.
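The same copy can be done without WinSCP by using scp, which copies a directory recursively with -r and takes the port with -P (capital P). This sketch assumes an scp client is available on your PC, for example from Cygwin; fluent and your_username are placeholders. It prints the command rather than running it.

```shell
#!/bin/sh
# Build the scp command that uploads a whole folder to your home
# directory on the server (the trailing ':' means "home directory").
upload_cmd() {
    # $1 = local folder, $2 = username (placeholder), $3 = server, $4 = port
    printf 'scp -r -P %s %s %s@%s:\n' "$4" "$1" "$2" "$3"
}

upload_cmd fluent your_username ce.utmgrid.utm.my 22
```

Swapping the last two arguments of the printed scp command (remote path first, local folder second) copies results back from the server in the same way.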
Exercise
1. Open PuTTY.
2. Enter the hostname: ce.utmgrid.utm.my
3. Click Open.
4. Enter your username.
5. Enter your password.
If everything is correct, you will get a Unix shell prompt:
istaz@ce ~/ $ _
Exercise
Common Unix Commands
Command      Description
ls -l        list directory
mkdir xxx    create directory xxx
rm xxx       delete file xxx
cd ..        go up one directory level
cd -         switch to the previous directory
exit         log off from the server
pwd          show the current directory
who          show who is currently logged in
cp aaa bbb   copy file aaa to a new file bbb
mv aaa bbb   move/rename file aaa to file bbb
clear        clear the screen
Exercise
Type ‘ls’ to list the contents of the directory. You will see the name of the directory that contains the Fluent files uploaded a moment ago. Type ‘cd ’ followed by that directory name to move into it, then type ‘ls’ again.
Exercise
Now we can look at the content of our model_journal file. Type ‘pico model_journal’. You will see the series of Fluent commands that will be executed when your model is run. Press Ctrl-X to exit.
Exercise
To edit pbs-script, type ‘pico pbs-script’. To exit without saving, press Ctrl-X and type ‘n’. To save your changes, press Ctrl-X and type ‘y’. Continue once you are satisfied with your settings.
Exercise
#!/bin/sh
#PBS -q utm
#PBS -N istas.model3d.16nodes
#PBS -l nodes=1:ppn=4
#PBS -M firstname.lastname@example.org
#PBS -m abe
nCPU=4
version=3d
journal=model_journal
cd $PBS_O_WORKDIR
# in one line...
/opt/exp_soft/share/istas/ansys_inc/v121/fluent/bin/fluent $version -t$nCPU -g -i $journal -cnf=$PBS_NODEFILE -ssh
Notes on the script:
    -q utm: queue name
    -N: job name
    -l nodes=...:ppn=...: nodes x ppn = number of CPUs required
    -M: email address
    -m abe: send email when the job starts, fails with an error, and finishes
    version: Fluent version used
    journal: the name of the journal file
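In the script above, nCPU has to be kept equal to nodes x ppn by hand. One possible alternative, sketched here and not part of the original script, is to count the lines of $PBS_NODEFILE, since Torque writes one line per allocated CPU slot into that file. The helper name count_slots and the demonstration node file are made up for illustration.

```shell
#!/bin/sh
# Count the CPU slots assigned to a job: the node file lists one
# hostname per allocated slot, so its line count equals nodes x ppn.
count_slots() {
    # $1 = path to a node file (normally "$PBS_NODEFILE")
    wc -l < "$1" | tr -d ' '
}

# Inside a running job you would write:
#   nCPU=$(count_slots "$PBS_NODEFILE")
# Demonstration with a hypothetical node file for nodes=1:ppn=4:
demo_nodefile=$(mktemp)
printf 'node1\nnode1\nnode1\nnode1\n' > "$demo_nodefile"
nCPU=$(count_slots "$demo_nodefile")
echo "nCPU=$nCPU"
rm -f "$demo_nodefile"
```

With this approach only the #PBS -l line needs to change when you request a different number of CPUs.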
Exercise
File “model_journal”
file/read-case model.cas
solve/init/initialize-flow
solve/iterate 400
file/binary-files n
file/confirm-overwrite n
file/write-data model.dat
exit y
Here model.cas is the input and model.dat is the output.
This file automates the running of Fluent commands during execution. The content of this file depends on the type of problem you are solving.
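When the same journal is reused for several models, it can be generated instead of edited by hand. The following sketch writes the journal commands shown above with the case file, data file and iteration count as parameters; the function name make_journal is hypothetical.

```shell
#!/bin/sh
# Write a Fluent journal like the one above to standard output.
make_journal() {
    # $1 = case file to read, $2 = data file to write, $3 = iterations
    cat <<EOF
file/read-case $1
solve/init/initialize-flow
solve/iterate $3
file/binary-files n
file/confirm-overwrite n
file/write-data $2
exit y
EOF
}

# Reproduce the exercise journal; redirect to a file to use it:
#   make_journal model.cas model.dat 400 > model_journal
make_journal model.cas model.dat 400
```

Only the three values that usually change between runs are parameters; any other journal command would still need to be edited in the function body.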
Exercise
Now it is time to run the job. All you need to do is run the following command:
qsub pbs-script
If nothing is wrong, your job will be placed in the queue, and if CPUs are available it will be executed immediately.
Exercise
You can monitor the progress of your job by typing ‘qstat’. If the list is too long, type ‘qstat | less’, and type ‘q’ to exit. An example of qstat output:
ce.utmgrid.utm.my:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID   NDS TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
149111.ce.utmgri     pgjsaeed utm      pgjsaeed.model3d  27086     1  --     -- 5500: R 392:4
149627.ce.utmgri     aliff    utm      Drop              16602     1  --     -- 5500: R 604:3
149802.ce.utmgri     smahmad3 utm      FJSSPHM111.sh      5690    --  --     -- 5500: R 136:0
149830.ce.utmgri     smahmad3 utm      FJSSPHM225.sh     15284    --  --     -- 5500: R 65:20
149831.ce.utmgri     smahmad3 utm      FJSSPHM226.sh     24896    --  --     -- 5500: R 57:29
149832.ce.utmgri     smahmad3 utm      FJSSPHM231.sh     16951    --  --     -- 5500: R 57:21
149833.ce.utmgri     smahmad3 utm      FJSSPHM232.sh        --    --  --     -- 5500: Q    --
149834.ce.utmgri     smahmad3 utm      FJSSPHM233.sh        --    --  --     -- 5500: Q    --
149835.ce.utmgri     smahmad3 utm      FJSSPHM234.sh        --    --  --     -- 5500: Q    --
Column guide:
    Job ID: job number
    Username: owner of the job
    Jobname: job name
    Req'd Time: maximum time allowed
    S: job state (Q = still waiting, R = running)
    Elap Time: elapsed time
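When the listing is long, a quick count of running and waiting jobs can be more useful than reading every line. The sketch below assumes the column layout of the example listing, where the state is the 10th whitespace-separated field; the function name count_states is made up for illustration.

```shell
#!/bin/sh
# Count running (R) and queued (Q) jobs in a qstat listing read from
# standard input. Job lines are recognised by a numeric job ID followed
# by a dot; the state is taken from the 10th column.
count_states() {
    awk '$1 ~ /^[0-9]+\./ { n[$10]++ }
         END { printf "running=%d queued=%d\n", n["R"], n["Q"] }'
}

# Demonstration on three lines copied from the example listing:
count_states <<'EOF'
149111.ce.utmgri pgjsaeed utm pgjsaeed.model3d 27086 1 -- -- 5500: R 392:4
149832.ce.utmgri smahmad3 utm FJSSPHM231.sh 16951 -- -- -- 5500: R 57:21
149833.ce.utmgri smahmad3 utm FJSSPHM232.sh -- -- -- -- 5500: Q --
EOF
```

On the server you would pipe the real listing into it, for example ‘qstat | count_states’, provided your qstat prints the same columns as the example.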
Exercise
Tips and tricks
1. Use a small number of CPUs so that your job starts earlier in the queue.
2. Start with a small problem.
3. Always delete old, unused files to keep storage space free.
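For the third tip, it helps to see which files are old before deleting anything. This sketch lists files that have not been modified for more than a given number of days; the function name list_old_files and the 30-day threshold are only examples.

```shell
#!/bin/sh
# List regular files under a directory that have not been modified for
# more than a given number of days, so they can be reviewed before
# being deleted by hand with rm.
list_old_files() {
    # $1 = directory to search, $2 = age threshold in days
    find "$1" -type f -mtime +"$2"
}

# Typical use on the server (not executed here):
#   du -sh "$HOME"              # total space used by your home directory
#   list_old_files "$HOME" 30   # files untouched for more than 30 days
```

The function only lists files and never deletes them, so you stay in control of what is removed.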