1 HPC for CST simulations
E. Bonanno, M. Husejko

2 Outline
- HPC and CST
- Initialization
- New Job example
- Shortcuts for new jobs
- Job and Task progress
- Further information on the Job/Task
- Occurred errors
- Performance
- Documentation and support

3 HPC
HPC: High Performance Computing cluster for technical software simulations.
Service portal for questions and support: https://cern.service-now.com/service-portal/function.do?name=windowsHPC

4 CST
CST Studio Suite is a commercial software package for 3D electromagnetic simulations.
Current version available on the cluster: CST 2016 SP2.
The local machine used to visualize the results should run the same version or, failing that, the closest one available.

5 Initialization
To run CST simulations on the cluster you first need to request access: https://cern.service-now.com/service-portal/function.do?name=windowsHPC
Request: fill in the request form.
Short description: Access request to HPC for CST simulations
Description: answer the following questions inline.
1) What are you trying to simulate, and where is it applied at CERN?
2) Who is the project leader?
3) What is the time frame for the project?
4) Which application do you plan to use to perform the simulations?
5) Have you already tried to run simulations on your local machine?
5a) If yes, report the wall-clock time of the simulation and the resources used: number of cores, amount of RAM consumed, and amount of disk space consumed by the solution.
5b) Please provide a log file for your most resource-demanding simulation.
6) Why do you need HPC resources?

6 Initialization
Once your request is approved, your project scratch space will be available on the DFS under this folder:
\\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\[username]

7 Initialization
Two options for job submission:
1. Front-end virtual machine: computer name EHPCREG. Membership in the cst-rsm-users e-group is required to use it.
2. Install the HPC Pack on your local machine (https://twiki.cern.ch/twiki/bin/view/PESgroup/HpcEngPack).

8 Initialization
Log in to the EHPCREG machine. Open a command window (cmd) from the START menu and execute this command in it:
cluscfg setcreds
You will be asked to provide your CERN login name together with the domain: if your login name is TEST, the complete login to cache is CERN\TEST.
Then start the HPC Job Manager.

9 New Job example
New Job

10 New Job example
- Notification at job termination is suggested.
- CST
- Arbitrary name to identify the job.
- JT-CST2socket for 16 cores, JT-CST1socket for 8 cores. Both give 128 GB of RAM.

11 New Job example
Add (at least) one task.

12 New Job example
Arbitrary name to identify the task.
Command line:
"C:\Program Files (x86)\CST STUDIO SUITE 2016\CST DESIGN ENVIRONMENT.exe" -t -tw \\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%\[restOfThePathIfAny]\[filename].cst
- Change [restOfThePathIfAny] and [filename].
- Avoid underscores and spaces: they appear to have caused crashes a few times in my case (when this happened, the task usually crashed within about 10 s).
- Don't change %USERNAME%.
- Keep the quotation marks around the .exe path.
- -t and -tw are the options for Particle Studio and the Wakefield Solver, respectively.

13 New Job example
Arbitrary name to identify the task.
Other command line options are listed in the CST 2016 Help, section General Features > Command Line Options.
Syntax:
"C:\Program Files (x86)\CST STUDIO SUITE 2016\CST DESIGN ENVIRONMENT.exe" [options] \\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%\[restOfThePathIfAny]\[filename].cst

14 New Job example
Many other command line options are listed in the CST 2016 Help, section General Features > Command Line Options.
Syntax:
"C:\Program Files (x86)\CST STUDIO SUITE 2016\CST DESIGN ENVIRONMENT.exe" [options] \\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%\[restOfThePathIfAny]\[filename].cst
Useful examples:
1. Replace [options] (including the brackets) with: -t -p
-t starts Particle Studio. -p starts a Parametric Sweep with the current solver (so the project should be closed with the Particle Studio solver you want for the sweep selected).
2. Replace [options] (including the brackets) with: -m -e
-m starts Microwave Studio. -e starts the Eigenmode Solver.
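The command line syntax above can be assembled programmatically to avoid typing mistakes. The following is a minimal sketch, assuming the executable path, scratch root and option pairs are exactly those shown on the slides. The helper name build_command_line and the option-set labels are hypothetical and are not part of CST or HPC Pack. The function only builds a string and submits nothing.

```python
# Hypothetical helper: it only assembles the command line string from the slides.
CST_EXE = r"C:\Program Files (x86)\CST STUDIO SUITE 2016\CST DESIGN ENVIRONMENT.exe"
SCRATCH_ROOT = r"\\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%"

# Option pairs quoted on the slides; the dictionary keys are illustrative labels.
OPTION_SETS = {
    "wakefield": ["-t", "-tw"],       # Particle Studio + Wakefield Solver
    "particle_sweep": ["-t", "-p"],   # Particle Studio + Parametric Sweep
    "eigenmode": ["-m", "-e"],        # Microwave Studio + Eigenmode Solver
}


def build_command_line(kind, sub_path, filename):
    """Return the task command line for one option set and one project file."""
    options = OPTION_SETS[kind]
    parts = [SCRATCH_ROOT]
    if sub_path:
        parts.append(sub_path.strip("\\"))
    parts.append(filename + ".cst")
    project = "\\".join(parts)
    # The executable path contains spaces, so it stays inside quotation marks.
    return '"{}" {} {}'.format(CST_EXE, " ".join(options), project)


print(build_command_line("wakefield", r"cavity\run1", "model"))
```

%USERNAME% is deliberately left unexpanded, as the slides instruct.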

15 New Job example
Working directory:
\\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%\[restOfThePathIfAny]
- Change [restOfThePathIfAny].
- Avoid underscores and spaces: they appear to have caused crashes a few times in my case (when this happened, the task usually crashed within about 10 s).
- Don't change %USERNAME%.

16 New Job example
Standard output:
\\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%\[restOfThePathIfAny]\std.txt
Standard error:
\\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%\[restOfThePathIfAny]\err.txt
- Change [restOfThePathIfAny].
- std.txt and err.txt are the default names for these output files. They are described later.
- Avoid underscores and spaces: they appear to have caused crashes a few times in my case (when this happened, the task usually crashed within about 10 s).
- Don't change %USERNAME%.
Leave blank.
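The working directory, standard output and standard error paths all share the same prefix, so they can be derived from a single sub-path and checked for the characters the slides advise against. This is a minimal sketch under that assumption. The names check_sub_path and task_paths are hypothetical, and the check covers only underscores and spaces because those are the only characters the slides mention.

```python
import re

SCRATCH_ROOT = r"\\cern.ch\DFS\Projects\HPC\SCRATCH\CSTTEMP\%USERNAME%"


def check_sub_path(sub_path):
    """Return the discouraged characters (space, underscore) found in sub_path."""
    return sorted(set(re.findall(r"[ _]", sub_path)))


def task_paths(sub_path):
    """Build the working directory, standard output and standard error paths."""
    if check_sub_path(sub_path):
        raise ValueError("avoid underscores and spaces in: {!r}".format(sub_path))
    suffix = "\\" + sub_path.strip("\\") if sub_path else ""
    work_dir = SCRATCH_ROOT + suffix
    return {
        "working_directory": work_dir,
        "standard_output": work_dir + r"\std.txt",
        "standard_error": work_dir + r"\err.txt",
    }


for name, value in task_paths(r"cavity\run1").items():
    print(name, value)
```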

17 New Job example
Set 16 and 16.
Finish task creation.

18 New Job example
Ensure you have the same settings (this is for JT-CST2socket; they can be slightly different for JT-CST1socket).

19 New Job example
Leave blank.

20 New Job example
Add a new entry with:
Name: HPC_CREATECONSOLE
Value: TRUE
…submit!

21 Shortcuts for new jobs
1. XML file: create a template for future use.

22 Shortcuts for new jobs
1. XML file: use a previously created template.

23 Shortcuts for new jobs
2. Copy Job: right click on an existing job to get a new one with most of the parameters already set.

24 Job and Task progress
Job progress: the progress bar refers to the number of tasks completed, so it does not take into account progress within the current task.
This means that a single-task job can only show 0% (queued or running) or 100% (failed, cancelled, finished). A 2-task job can also show 50%, once the cluster moves on to the second task.
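The behaviour described above amounts to a ratio of completed tasks to total tasks. The sketch below is only a model of that description, not the HPC Job Manager's actual code, and the function name job_progress is hypothetical.

```python
def job_progress(completed_tasks, total_tasks):
    """Percentage implied by the slide: completed tasks over total tasks."""
    if total_tasks <= 0:
        raise ValueError("a job needs at least one task")
    return 100 * completed_tasks // total_tasks


# A single-task job jumps from 0 to 100; a 2-task job can also show 50.
print(job_progress(0, 1), job_progress(1, 1), job_progress(1, 2))
```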

25 Job and Task progress
How to check progress within a task?
Check the std.txt file: it reports the messages that usually appear in the CST Progress window. For instance, it shows the parameter updates during sweep simulations.
It does not give a percentage of the task progress, but it is good real-time feedback and gives a rough idea of how far a single simulation has progressed.
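Since std.txt is a plain text file that grows during the run, its most recent messages can be read without opening the whole file in an editor. The following is a minimal sketch, assuming std.txt is reachable from the machine running the script. The function name last_lines is hypothetical.

```python
from collections import deque


def last_lines(path, count=10):
    """Return the last `count` lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Calling last_lines on the std.txt path set for the task, every few minutes, gives a view similar to the CST Progress window.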

26 Further information on the Job/Task
Check these tabs for further information on ongoing (or previous) simulations.

27 Occurred errors
1. Attempt of parallelization
Where/when? Usually in the last simulations of a parametric sweep.
Are the previous ones compromised? No.
How to detect? The job fails (not immediately). In std.txt:
…
WARNING: A fatal error occurred during matrix calculation. Error details: SplitMergeManagerMPI - Could not split project files
…
WARNING: Solver run aborted.
…

28 Occurred errors
1. Attempt of parallelization
How to detect? (part 2) In err.txt:
…
ERROR: Error in calculating solver matrix (parallelization enabled)
ERROR: Parallelization Control terminated abnormally.
…
Why? The HPC cluster does not yet support MPI (parallelization), and CST sometimes switches it on automatically. There is probably an option or a checkbox to prevent this (I have not looked for it so far).
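The messages quoted on these two slides can be searched for automatically once a job has failed. The sketch below assumes the log text contains those messages verbatim. The marker lists are copied from the slides, and the function names are hypothetical.

```python
# Message fragments copied from the slides; the real logs may vary slightly.
STD_MARKERS = [
    "SplitMergeManagerMPI - Could not split project files",
    "Solver run aborted",
]
ERR_MARKERS = [
    "Error in calculating solver matrix (parallelization enabled)",
    "Parallelization Control terminated abnormally",
]


def find_markers(text, markers):
    """Return the markers that occur in text, in list order."""
    return [marker for marker in markers if marker in text]


def looks_like_parallelization_failure(std_text, err_text):
    """True if either log contains one of the known parallelization messages."""
    return bool(find_markers(std_text, STD_MARKERS)
                or find_markers(err_text, ERR_MARKERS))
```

The inputs are the full contents of std.txt and err.txt read as strings.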

29 Occurred errors
1. Attempt of parallelization
Solution (after the failure): reduce the project sweep to as few simulations as possible while still including the missing ones. Restart the job (Copy Job is suggested).
The new simulations will be added to the existing results with a new runID, as if nothing had happened.
If some simulations are repeated, their results will overlap the existing ones, with no consequences.
std.txt and err.txt will be reset (unless you change their names).

30 Occurred errors
2. CST crashed on the computing server
How to detect? In the Activity Log (see the slide named Further information on the Job/Task), the task tries to start several times (3 in my case), failing after a few seconds each time.
The error is reported in the Error Message column. Double clicking on it…

31 Occurred errors
2. CST crashed on the computing server
Double clicking on the job shows its details. Clicking on the blue line then shows:
Error from node: P05799555K88117:HPC_CREATECONSOLE set to true but console is in use

32 Occurred errors
2. CST crashed on the computing server
Solution: contact support and explain the problem. They will restart the machine, after which it should work properly again.

33 Occurred errors
3. CST stuck somewhere for longer than expected
How to detect? From std.txt, when there are no changes for longer than expected. For instance, if one of many sweep simulations suddenly lasts far longer than the previous ones, check every once in a while whether new messages have appeared in std.txt and err.txt.
Solution: cancel and restart the job (see the solution for case 1). Here too, the previous sweep simulations (if any) should be correct.
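One way to notice that std.txt has stopped changing is to compare its last modification time with the duration of a typical simulation. This is a minimal sketch under several assumptions: the file's modification time on the DFS share is updated whenever CST writes to it, the expected duration is supplied by the user, and the factor of 2 is an arbitrary illustrative threshold. The function names are hypothetical.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Seconds elapsed since the file at path was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def is_stalled(path, expected_seconds, factor=2.0, now=None):
    """True if the file has been silent for longer than factor * expected_seconds."""
    return seconds_since_update(path, now) > factor * expected_seconds
```

A positive result is only a hint to inspect std.txt and err.txt by hand, since some solver phases may legitimately produce no output for a long time.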

34 Performance
Wakefield sweep simulations: 1 sequence of 5 values

Simulation   caevmsrv48   HPC
1st          5h14m30s     0h34m54s
2nd          5h24m58s     0h37m44s
3rd          6h18m35s     0h37m09s
4th          6h04m17s     0h35m49s
5th          5h59m32s     0h35m59s
Total        29h01m52s    3h01m35s
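The totals in the table can be checked, and the overall speedup derived, with a few lines of arithmetic. The sketch below uses only the ten per-simulation times from the slide. The helper name to_seconds is hypothetical.

```python
import re


def to_seconds(stamp):
    """Convert a duration such as '5h14m30s' to seconds."""
    hours, minutes, seconds = map(
        int, re.fullmatch(r"(\d+)h(\d+)m(\d+)s", stamp).groups())
    return 3600 * hours + 60 * minutes + seconds


caevmsrv48 = ["5h14m30s", "5h24m58s", "6h18m35s", "6h04m17s", "5h59m32s"]
hpc = ["0h34m54s", "0h37m44s", "0h37m09s", "0h35m49s", "0h35m59s"]

total_caevmsrv48 = sum(map(to_seconds, caevmsrv48))  # 104512 s = 29h01m52s
total_hpc = sum(map(to_seconds, hpc))                # 10895 s = 3h01m35s
speedup = total_caevmsrv48 / total_hpc

print("speedup: {:.1f}x".format(speedup))
```

For this sweep, the HPC cluster finished roughly 9.6 times faster than caevmsrv48.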

35 Documentation
Documentation: https://twiki.cern.ch/twiki/bin/view/PESgroup/HpcCSTBatch
Service portal for questions and support: https://cern.service-now.com/service-portal/function.do?name=windowsHPC
CST 2016 Help section for the command line: General Features > Command Line Options

