1 Changhua Li, National Astronomical Observatory of China
Short Report on the Laohu GPU Cluster Usage at NAOC

2 Introduction of Laohu
Hardware configuration: 85 nodes + InfiniBand + 140 TB storage.
Node hardware: Lenovo R740, 2 Xeon E5520 CPUs, 24 GB memory, 500 GB disk, 2 NVIDIA Tesla C1060 GPU cards.
The Laohu GPU cluster was built in 2009; its peak single-precision performance is 160 TFLOPS.
Total cost: 6 million RMB in 2009 (4/1 Min. of Finance ZDYZ A06 / NAOC).

3 Tesla C1060: 240 cores, 4 GB memory, 933 GFLOPS.
In September we bought 59 K20 GPU cards for 59 nodes, at a cost of 1.18 million RMB.
The new Laohu configuration is therefore 59 hosts with one K20 GPU card each and 26 hosts with 3 C1060 GPU cards each.
In theory, the peak single-precision performance is 280 TFLOPS.
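
These peak figures are consistent with the published per-card numbers; a rough check, assuming 3.52 TFLOPS single precision per K20 and 0.933 TFLOPS per C1060:

  # rough single-precision peak estimates from the assumed per-card figures
  echo "85*2*0.933" | bc -l              # 2009 system: ~159 TFLOPS (quoted as 160)
  echo "59*3.52 + 26*3*0.933" | bc -l    # upgraded system: ~280 TFLOPS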

4

5 Platform LSF (Load Sharing Facility) is a suite of distributed resource management products that:
- connects computers into a cluster (or grid)
- monitors the load of the systems
- distributes, schedules and balances the workload
- controls access and load by policies
- analyzes the workload
It provides a High Performance Computing (HPC) environment.
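
For day-to-day use, a few standard LSF commands cover most interactions with the cluster (a minimal sketch; these are generic LSF commands, not specific to laohu):

  lsid                 # show the cluster name and LSF master host
  bhosts               # list hosts and their job slot status
  bqueues              # list queues with their limits and current load
  bsub < job.lsf       # submit a job script to a queue
  bjobs                # show your pending and running jobs
  bkill <jobid>        # kill a job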

6 GPU queues
gpu_16: K20 hosts, max cores 16, min cores 4, total core limit 32
gpu_8: K20 hosts, max cores 8, min cores 2, total core limit 24
gpu_k20_test: K20 hosts, only 2 cores per job, total core limit 3
gpu_c1060: C1060 hosts, max cores 30, min cores 2, total core limit 66
gpu_c1060_test: C1060 hosts, only 3 cores per job, total core limit 9
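
The test queues are convenient for short trial runs before moving to the production queues. A minimal sketch (the executable name is hypothetical):

  # submit a 2-core trial job to the K20 test queue
  bsub -q gpu_k20_test -n 2 -o trial.out ./my_gpu_test
  # inspect the configured limits of a queue
  bqueues -l gpu_k20_test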

7 CPU queues
cpu_32: nodes with 7/5 CPU cores per node, up to 192 cores per job; at most two such jobs may run. Maximum running time: 1 week.
cpu_large: nodes with 7/5 CPU cores per node (total: 48 cores); many jobs allowed. Maximum running time: 1 week.
cpu_small: nodes with 7/5 CPU cores per node per single job; jobs may run until they fill 8 nodes / 48 CPU cores. Maximum running time: 1 week.
cpu_test: nodes with 7/5 CPU cores per node (total: 30 cores); jobs may run until they fill 5 nodes / 30 CPU cores. Maximum running time: 3 hours.
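
For short runs, the cpu_test queue with its 3-hour limit is the natural starting point. A minimal sketch (the executable name is hypothetical; -W sets an explicit run limit in hours:minutes):

  bsub -q cpu_test -n 6 -W 3:00 -o test.out ./my_cpu_test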

8 Sample 1: cpujob.lsf
#!/bin/sh
#BSUB -q cpu_32                              # job queue, modify according to user
#BSUB -a openmpi-qlc
#BSUB -R 'select[type==any] span[ptile=6]'   # resource requirement of host
#BSUB -o out.test                            # output file
#BSUB -n 132                                 # the maximum number of CPUs
mpirun.lsf --mca "btl openib,self" Gadget2wy WJL.PARAM   # modify for the user's program
Exec method: bsub < cpujob.lsf
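
After submission, the job can be followed with standard LSF commands (a sketch; the job ID shown is hypothetical):

  bsub < cpujob.lsf    # prints e.g. "Job <12345> is submitted to queue <cpu_32>"
  bjobs -l 12345       # detailed status of the job
  bpeek 12345          # peek at the job's stdout while it runs
  bkill 12345          # cancel it if necessary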

9 Sample 2: gpujob.lsf
#!/bin/sh
#BSUB -q gpu_32                              # job queue
#BSUB -a openmpi-qlc
#BSUB -R 'select[type==any]'                 # resource requirement of host
#BSUB -o out.test                            # output file
#BSUB -e out.err
#BSUB -n 20                                  # the maximum number of CPUs
mpirun.lsf --prefix "/usr/mpi/gcc/openmpi-qlc" -x "LD_LIBRARY_PATH=/export/cuda/lib:/usr/mpi/gcc/openmpi-qlc/lib64" ./phi-GRAPE.exe   # modify for the user's program
Exec method: bsub < gpujob.lsf
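
To confirm that a GPU is visible on a compute node before running a full job, a short interactive check can be sent to one of the test queues (a sketch; -I requests an interactive job):

  bsub -I -q gpu_k20_test nvidia-smi    # print the GPU(s) seen on the allocated node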

10

11

12 CUDA 4.0 / CUDA 5.0
OpenMPI / Intel MPI, etc.
GCC 4.1 / GCC 4.5, Intel Compiler
Math libraries: BLAS, GSL, CFITSIO, FFT, …
Gnuplot, PGPLOT
Gadget
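
A typical build of a mixed CUDA + MPI code on this software stack might look as follows (a sketch only; the source file names are hypothetical, the CUDA library path is taken from the gpujob.lsf sample above, and the compute-capability flag sm_35 matches the K20 and requires CUDA 5.0):

  # compile the GPU kernels with nvcc (CUDA 5.0 for the K20 nodes)
  nvcc -O2 -arch=sm_35 -c gpu_kernels.cu -o gpu_kernels.o
  # link the MPI host code against the CUDA runtime
  mpicxx -O2 main.cpp gpu_kernels.o -L/export/cuda/lib -lcudart -o myprog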

13

14 Usage in 2012 (average 74%)

15 Usage in 2013 (average 64%)

16 1. N-body simulations (NBODY6++, phiGPU; galactic nuclei, star clusters)
2. N-body simulations (Gadget2; galactic dynamics)
3. Correlator (test only)
4. Gravitational microlensing
5. Local spiral formation through major mergers
6. Dark energy survey
7. TREND, Monte Carlo simulation of extremely high energy Extensive Air Showers (EAS)
8. Parallelization of the Herschel Interactive Processing Environment
9. HII region and PDR modelling based on the CLOUDY code
10. Reconstructing the primordial power spectrum and the dark energy equation of state
…

17 Rainer, Peter: Astrophysical Supercomputing with Green GPU Clusters in Jülich and Beijing, 03/2012, http://inside.hlrs.de/pdfs/inSiDE_spring2012.pdf
Wang, J.; Hammer, F.; Athanassoula, E.; Puech, M.; Yang, Y.; Flores, H.: Loops formed by tidal tails as fossil records of a major merger, 02/2012, http://adsabs.harvard.edu/abs/2012A%26A...538A.121W
Long, R. J.; Mao, Shude; Shen, Juntai; Wang, Yougang: Made-to-measure galaxy models - III. Modelling with Milky Way observations, 09/2012, http://adsabs.harvard.edu/abs/2012arXiv
Long, R. J.; Mao, Shude: Made-to-measure galaxy models - II. Elliptical and lenticular galaxies, 04/2012, http://adsabs.harvard.edu/abs/2012MNRAS
Wang, Yougang; Zhao, Hongsheng; Mao, Shude; Rich, R. M.: A New Model for the Milky Way Bar, 09/2012, http://adsabs.harvard.edu/abs/2012arXiv
Liu, Shang-Fei; Guillochon, James; Lin, Douglas N. C.; Ramirez-Ruiz, Enrico: On the Survivability and Metamorphism of Tidally Disrupted Giant Planets: the Role of Dense Cores, 11/2012, http://adsabs.harvard.edu/abs/2012arXiv
Li, Shuo; Liu, F. K.; Berczik, Peter; Chen, Xian; Spurzem, Rainer: Interaction of Recoiling Supermassive Black Holes with Stars in Galactic Nuclei, 03/2012, http://adsabs.harvard.edu/abs/2012ApJ
Berczik, P.; Nitadori, K.; Zhong, S.; Spurzem, R.; Hamada, T.; Wang, X. W.; Berentzen, I.; Veles, A.; Ge, W.: High Performance massively parallel direct N-body simulations on large GPU clusters, Proceedings of the International Conference on High Performance Computing
Amaro-Seoane, P.; Miller, M. C.; Kennedy, G. F.: Tidal disruptions of separated binaries in galactic nuclei, Monthly Notices of the Royal Astronomical Society
Just, A.; Yurin, D.; Makukov, M.; Berczik, P.; Omarov, C.; Spurzem, R.; Vilkoviskij, E. Y.: Enhanced Accretion Rates of Stars on Supermassive Black Holes by Star-Disk Interactions in Galactic Nuclei, The Astrophysical Journal
Taani, A.; Naso, L.; Wei, Y.; Zhang, C.; Zhao, Y.: Modeling the spatial distribution of neutron stars in the Galaxy, Astrophysics and Space Science
Olczak, C.; Spurzem, R.; Henning, T.; Kaczmarek, T.; Pfalzner, S.; Harfst, S.; Portegies Zwart, S.: Dynamics in Young Star Clusters: From Planets to Massive Stars, Advances in Computational Astrophysics: Methods, Tools, and Outcome
Spurzem, R.; Berczik, P.; Zhong, S.; Nitadori, K.; Hamada, T.; Berentzen, I.; Veles, A.: Supermassive Black Hole Binaries in High Performance Massively Parallel Direct N-body Simulations on Large GPU Clusters, Advances in Computational Astrophysics: Methods, Tools, and Outcome
Khan, F. M.; Preto, M.; Berczik, P.; Berentzen, I.; Just, A.; Spurzem, R.: Mergers of Unequal-mass Galaxies: Supermassive Black Hole Binary Evolution and Structure of Merger Remnants, The Astrophysical Journal
Li, S.; Liu, F. K.; Berczik, P.; Chen, X.; Spurzem, R.: Interaction of Recoiling Supermassive Black Holes with Stars in Galactic Nuclei, The Astrophysical Journal
…

18

19 Portal (GridWay), Globus Toolkit, Internet

20

21

22

23

