Summer Internship: Douglas Drobny, Idaho National Laboratory, High Performance Computing.


1 Summer Internship: Douglas Drobny, Idaho National Laboratory, High Performance Computing

2 Who I Worked For: Idaho National Laboratory, Idaho Falls. High Performance Computing group: manages ~4 different clusters; supports and maintains software for big research projects. User Support group.

3 Clusters: Fission (12,512 processors, 25 TBytes of memory); Icestorm (2048 processors, 4 TBytes of memory); Quark; Eos.

4 Compute Manager: Current job submissions are made from the command line. Goals: a web interface for the PBS scheduler; easy to use; behaves the same as current job submissions; improved error message handling.

5 Setup: Application Services (on the server head nodes): receives web requests; submits jobs. Compute Manager (on the web server): creates web forms; sends results to App. Services; displays results.

6

7

8 What I Did: Installed Compute Manager and AIF on Eos. Created test cases for PBS features. Created test cases for user inputs. Submitted feedback and bug reports with PBS. Documented the process for future implementations and troubleshooting.

9 Results (Good): Easy to create different application forms. Instant job monitoring. Restrict input values. Default input values. Secure file transfer.
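As a rough illustration of the "restrict input values" and "default input values" points, an application form can be thought of as a set of defaults plus allowed ranges that is validated before a PBS job script is produced. The sketch below is not how Compute Manager works internally; the field names, the CPU limit, and the `build_pbs_script` helper are all hypothetical. Only the `#PBS -N`, `#PBS -l select=` and `#PBS -l walltime=` directives and the `PBS_O_WORKDIR` variable are standard PBS Professional features.

```python
# Hypothetical form definition: defaults plus allowed values.
# None of these names or limits come from Compute Manager itself.
FORM_DEFAULTS = {"job_name": "test_job", "ncpus": 1, "walltime": "01:00:00"}
NCPUS_RANGE = (1, 32)  # assumed per-job limit, for illustration only


def build_pbs_script(command, **user_inputs):
    """Merge user inputs over the defaults, validate them, return a PBS job script."""
    unknown = set(user_inputs) - set(FORM_DEFAULTS)
    if unknown:
        # Reject fields the form does not define instead of silently ignoring them.
        raise ValueError(f"unknown form fields: {sorted(unknown)}")
    fields = {**FORM_DEFAULTS, **user_inputs}

    low, high = NCPUS_RANGE
    if not low <= fields["ncpus"] <= high:
        raise ValueError(f"ncpus must be between {low} and {high}")

    return "\n".join([
        "#!/bin/bash",
        f"#PBS -N {fields['job_name']}",
        f"#PBS -l select=1:ncpus={fields['ncpus']}",
        f"#PBS -l walltime={fields['walltime']}",
        "cd $PBS_O_WORKDIR",  # directory the job was submitted from
        command,
        "",
    ])
```

Calling `build_pbs_script("./run_sim input.dat", ncpus=4)` (a made-up command) fills in the default job name and walltime, while an out-of-range `ncpus` is rejected before anything reaches the scheduler.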

10 Results (Bad): Easy to put results in an insecure location. Always copies the input files. Missing a form entry can result in lost output files. Spams the sudo log. “Fixed in next version (Week after I leave)”

11 Updating HPC Wiki: MoinMoin wiki (Python), 1.8.8 to 1.9.4. Used a temporary virtual machine to test the update and fix issues. Added support for viewing reports. Deployed on hpcweb. Note: Learn what type of service monitoring is being used before taking down a system.

12 Wiki Reports: Automatically generate a visual report of an XML document each month. Created the XSL. Putting data into charts. Automation ('Right' way vs. working way). Editing to reduce transcription errors.
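The idea of turning an XML document into a chart can be sketched in a few lines. The slides describe doing this with XSL; the example below swaps in plain Python with the standard-library `xml.etree.ElementTree` parser purely for illustration. The `<report>`/`<item>` schema and the sample values are made up, since the real report format is not shown.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the real report schema is not shown in the slides.
SAMPLE_XML = """
<report month="example">
  <item name="a" value="3"/>
  <item name="b" value="5"/>
</report>
"""


def text_chart(xml_text, width=20):
    """Render each <item> as a row of '#' characters scaled to the largest value."""
    root = ET.fromstring(xml_text.strip())
    rows = [(el.get("name"), int(el.get("value"))) for el in root.iter("item")]
    peak = max((value for _, value in rows), default=0)
    lines = []
    for name, value in rows:
        # Scale the bar so the largest value fills the full width.
        bar = "#" * (round(width * value / peak) if peak else 0)
        lines.append(f"{name:<10} {bar} {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_chart(SAMPLE_XML))
```

Generating the chart directly from the XML, rather than copying numbers by hand, is also what removes the transcription step the slide mentions.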

13 XSL/XML: Goal: display XSL/XML pages inside a wiki page. Problems: MoinMoin uses an outdated XSL library; XSL can contain JavaScript (XSS). Solution: created a wiki macro that converts XML with a specific XSL stylesheet on the server.

14 Intel Compiler Issue (ICC): Compile times on Quark are much longer than on Fission (head nodes), even though Quark should be faster hardware-wise: 17 minutes on Quark vs. 8 minutes on Fission.

15 Intel Compiler Steps: Create test cases. Determine affected systems. Enable debugging. Strace. Wireshark. Hardware Test Environment.
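A repeatable test case for this kind of investigation can be as small as a script that times the same command several times on each system. The following is a minimal sketch, not the test harness used in the internship; `time_command` is a hypothetical helper, and the stand-in command exists only so the script runs anywhere.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv several times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,  # fail loudly if the command itself breaks
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. On a head node this would
    # be the real compile line, e.g. ["icc", "-c", "test.c"] (hypothetical file).
    times = time_command([sys.executable, "-c", "pass"])
    print(f"min {min(times):.3f}s  max {max(times):.3f}s")
```

Running the same script on both head nodes gives comparable numbers before reaching for heavier tools such as strace or Wireshark.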

16 ICC Solution: License files were resolved in this order: license manager; user's home directory; /opt/intel; /apps/intel/..../license. 'Errors' in the license file cause the system to check all of the sources.

17 ICC Solution: The /opt/intel license files pointed to the license manager. This caused additional requests to the license manager (which takes time). Quark's /opt/intel license files pointed to the license servers the most. *Removed the /opt/intel/license folder to fix the problem.
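The effect described on these two slides can be pictured with a toy model: when every source in the search order is checked, each source that points back to the license manager costs one more round trip. This is only an illustration of the reasoning, not Intel's actual license resolution logic; `manager_requests` and its parameters are hypothetical, and the elided path is kept exactly as it appears on the slide.

```python
# Search order as listed on the slide; the last path is elided in the source.
SEARCH_ORDER = [
    "license manager",
    "user's home directory",
    "/opt/intel",
    "/apps/intel/..../license",
]


def manager_requests(search_order, points_to_manager, check_all=True):
    """Count round trips to the license manager in this toy model.

    check_all=True mimics the 'errors cause every source to be checked' case.
    """
    requests = 0
    for source in search_order:
        if source in points_to_manager:
            requests += 1
            if not check_all:
                break  # a clean lookup would stop at the first answer
    return requests


if __name__ == "__main__":
    redirects = {"license manager", "/opt/intel"}
    before = manager_requests(SEARCH_ORDER, redirects)
    # Dropping /opt/intel from the search models removing its license folder.
    after = manager_requests([s for s in SEARCH_ORDER if s != "/opt/intel"], redirects)
    print(before, after)
```

In this model, removing the /opt/intel entry cuts the round trips per lookup from two to one, which matches the direction of the fix described on the slide.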

18 Things Learned: Python; XSL; creating and signing SSL keys; Unix permissions; strace; testing; refactoring; monitoring; Vim!

