J.J. Keijser Nikhef Amsterdam Grid Group MyFirstMic experience Jan Just Keijser 26 November 2013.


1 MyFirstMic experience
Jan Just Keijser
26 November 2013

3 What have we got?
Supermicro server 'pleedo'
Dual E5-2620 @ 2.00 GHz
64 GB RAM
Two Xeon Phi (aka 'Intel MICs') model 5110 @ 1052631 kHz (about 1.05 GHz)
Each card has
◦ 60 cores with 4 threads each
◦ 8 GB GDDR5 RAM
◦ PCI Express v2 x16 interface @ 5.0 GT/s
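
As a rough sanity check on the hardware figures above, the following Python sketch converts the reported clock to GHz, counts the hardware threads per card, and estimates the theoretical one-direction PCI Express bandwidth. The 8b/10b line encoding is a general property of PCI Express v2 that the slide does not mention, and all function and constant names are illustrative only.

```python
# Back-of-the-envelope figures for one Xeon Phi card, using the slide's numbers.
CLOCK_KHZ = 1052631       # clock as reported for the card
CORES = 60
THREADS_PER_CORE = 4
PCIE_GT_PER_S = 5.0       # PCI Express v2 signalling rate per lane
PCIE_LANES = 16
PCIE_ENCODING = 8 / 10    # assumption: 8b/10b encoding, not stated on the slide


def clock_ghz(khz):
    """Convert a clock frequency from kHz to GHz."""
    return khz / 1e6


def hardware_threads(cores, threads_per_core):
    """Total hardware threads on one card."""
    return cores * threads_per_core


def pcie_gbytes_per_s(gt_per_s, lanes, encoding):
    """Theoretical one-direction bandwidth in GB/s (8 bits per byte)."""
    return gt_per_s * encoding * lanes / 8


print("clock [GHz]:", round(clock_ghz(CLOCK_KHZ), 3))
print("hardware threads:", hardware_threads(CORES, THREADS_PER_CORE))
print("PCIe peak [GB/s]:",
      round(pcie_gbytes_per_s(PCIE_GT_PER_S, PCIE_LANES, PCIE_ENCODING), 1))
```

The bandwidth figure is a theoretical upper bound; real transfers to the card will be slower.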

4 What can we do with it?
Massively parallel computing:
◦ Manycore applications
◦ OpenMP and/or MPI jobs
Runs Linux
◦ Kewl! 'cat /proc/cpuinfo' lists 240 logical CPUs per card (60 cores × 4 threads)
Can be reached via minicom and ssh
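
The count reported by /proc/cpuinfo can be checked with a few lines of Python. This is a minimal sketch, assuming the usual Linux layout in which every logical CPU has its own "processor : N" entry; the function name is hypothetical.

```python
def count_logical_cpus(cpuinfo_text):
    """Count the 'processor' entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("logical CPUs:", count_logical_cpus(handle.read()))
    except OSError:
        # Not a Linux system, or /proc is not mounted.
        print("/proc/cpuinfo is not available here")
```

Run on the card itself, this should print the same number that counting the entries by hand gives. That number is the product of cores and threads per core, not the number of physical cores.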

5 How do we do that?
1. Recompile code using Intel C/C++ and/or FORTRAN compiler suite
2. Copy file to Xeon Phi (or use NFS to exchange data)
3. Run code natively on the Xeon Phi
4. Copy results back (or again, use NFS)
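
The four steps above can be sketched as a list of shell commands, built here in Python so they can be inspected before anything is run. Several details are assumptions rather than content from the slides: the Intel compiler switches -mmic (build a native Xeon Phi binary) and -openmp as they existed in compiler releases of that period, the host name mic0 for the first card, and the hypothetical file names hello.c, hello.mic and results.dat.

```python
import shlex


def native_run_commands(source, binary, card="mic0", results="results.dat"):
    """Return the commands for the four steps as argument lists."""
    return [
        # 1. Cross-compile on the host (assumed flags: -mmic, -openmp).
        ["icc", "-mmic", "-openmp", source, "-o", binary],
        # 2. Copy the binary to the card.
        ["scp", binary, card + ":"],
        # 3. Run the code natively on the card.
        ["ssh", card, "./" + binary],
        # 4. Copy the results back to the host.
        ["scp", card + ":" + results, "."],
    ]


# Dry run: print the commands instead of executing them.
for command in native_run_commands("hello.c", "hello.mic"):
    print(" ".join(shlex.quote(arg) for arg in command))
```

With a shared NFS directory, steps 2 and 4 drop out and only the compile and ssh commands remain.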

6 Sounds too easy...
It does
◦ Recompiling code does not optimize it for the new architecture
◦ The Intel compiler and the gcc compiler are almost compatible
◦ The Intel compiler runs on the host machine, not on the Xeon Phi, so we're cross-compiling
◦ (Some HEP build frameworks do not like this)

7 Openssl speed test
Completely useless but very handy test:
◦ openssl speed -evp aes-256-cbc -multi <N>
Test was extended to 30 seconds (default=3) and was run 3 times for each value of <N>
Advantages:
◦ Compare results to regular CPUs
◦ Scales very well with the number of cores/threads
◦ Embarrassingly parallel
◦ Low memory usage
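
A test series like the one described above could be generated as follows. This is a minimal sketch: the list of -multi values is a hypothetical example, and because the slide does not say how the test duration was extended to 30 seconds, that part is not reproduced.

```python
def openssl_speed_runs(multi_values, repeats=3):
    """Return one openssl command (as an argument list) per run."""
    runs = []
    for n in multi_values:
        for _ in range(repeats):
            runs.append(
                ["openssl", "speed", "-evp", "aes-256-cbc", "-multi", str(n)]
            )
    return runs


# Hypothetical sweep over the number of parallel processes.
EXAMPLE_VALUES = [60, 120, 180, 240]

for command in openssl_speed_runs(EXAMPLE_VALUES):
    print(" ".join(command))
```

Each printed line can be executed as-is on the card or on a regular CPU. Repeating every value three times gives a rough idea of the run-to-run spread.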

8 Results
[Figure: results plotted against # cores, with axis ticks at 60, 120 and 180]

9 Versus E5-2697v2 @ 2.7/3.5 GHz
[Figure: results plotted against # cores]
That's only a factor of 15
Even when correcting for optimisation the difference is still a factor of 5

10 Quantum chemistry code
“Real life” use case
C & FORTRAN
OpenMP
Optimised for “normal” Xeons
Low memory usage

11 Results (Courtesy Mark Somers @ Leiden University)
Do not be fooled by this plot: the E5-2670 is still a factor of 10 faster and can access all system memory

12 Issues
Initially the Xeon Phis were very unstable
Core temperature when idle was 91 °C
During testing the temperature went up to 98 °C, after which the card shut down and a hardware reset was necessary
◦ Fixed by setting chassis fans to always be on
Intel compiler suite requires a license. For academic use a free license can be used, valid for 1 year
'root' code has a compile target for the Xeon Phis that does not work out of the box

13 What's next?
Cross-compile 'openjdk'
Get 'root' running
Examine “dips” in performance
Gain more experience in debugging and tuning multi/many core applications

