Empowering efficient HPC with Dell Martin Hilgeman HPC Consultant EMEA.

Presentation transcript:

1 Empowering efficient HPC with Dell Martin Hilgeman HPC Consultant EMEA

2 Amdahl’s Law (Global HPC Group, CHPC conference 2013)
Gene Amdahl (1967): "Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities". AFIPS Conference Proceedings (30): 483–485.
“The effort expended on achieving high parallel processing rates is wasted unless it is accompanied by achievements in sequential processing rates of very nearly the same magnitude”
$a = \frac{1}{(1 - p) + \frac{p}{n}}$
a: speedup, n: number of processors, p: parallel fraction

3 Amdahl’s Law limits maximal speedup (a: speedup, n: number of processors, p: parallel fraction)
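To make the limit concrete, here is a minimal sketch that evaluates the usual form of Amdahl's Law, a = 1 / ((1 - p) + p/n), using the slide's symbols a, n and p. The function names and the parallel fraction p = 0.95 are illustrative assumptions, not values taken from the presentation.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Speedup a for parallel fraction p on n processors (Amdahl's Law)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


def max_speedup(p: float) -> float:
    """Limit of the speedup as n grows without bound: 1 / (1 - p)."""
    if p >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, chosen only for illustration
    for n in (1, 8, 64, 512, 4096):
        print(f"n={n:5d}  speedup={amdahl_speedup(p, n):6.2f}")
    print(f"upper bound for p={p}: {max_speedup(p):.2f}")
```

However many processors are added, the speedup stays below 1 / (1 - p), so the serial fraction alone sets the ceiling.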

4 Amdahl’s Law and Efficiency. Diminishing returns: tension between the desire to use more processors and the associated “cost”.
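The diminishing returns can be sketched numerically. The example below assumes the common definition of parallel efficiency as speedup divided by processor count, a / n; the slide does not state which definition it plots, and the value p = 0.95 is again only illustrative.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Speedup for parallel fraction p on n processors (Amdahl's Law)."""
    return 1.0 / ((1.0 - p) + p / n)


def parallel_efficiency(p: float, n: int) -> float:
    """Speedup per processor, a / n (assumed definition of efficiency)."""
    return amdahl_speedup(p, n) / n


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, not a value from the slides
    for n in (1, 2, 4, 8, 16, 32, 64, 128):
        a = amdahl_speedup(p, n)
        e = parallel_efficiency(p, n)
        print(f"n={n:4d}  speedup={a:6.2f}  efficiency={e:6.1%}")
```

Each doubling of n buys less additional speedup while the per-processor efficiency keeps falling, which is the "cost" side of the tension.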

5 The Real Moore’s Law: the clock speed plateau, the power ceiling, and the IPC limit.

6 Moore’s Law vs Amdahl’s Law: “Too Many Cooks in the Kitchen”. Industry is applying Moore’s Law by adding more cores. Meanwhile, Amdahl’s Law says that you cannot use them all efficiently.

7 What levels do we have?
Challenge: sustain the performance trajectory without massive increases in cost, power, real estate, and unreliability.
Solutions: no single answer; we must intelligently turn the “Architectural Knobs”.
[Figure: hardware performance versus what you really get, with knobs numbered 1 to 5]

8 Turning the knobs 1 - 4
Knob 1: Frequency is unlikely to change much. Thermal/power/leakage challenges.
Knob 2: Moore’s Law still holds: 130 -> 22 nm. LOTS of transistors.
Knob 3: Number of sockets per system is the easiest knob. Challenging for power/density/cooling/networking.
Knob 4: IPC still grows. FMA3/4, AVX, FPGA implementations for algorithms. Challenging for the user/developer.

9 Meanwhile… traditional IT is swimming in performance
Traditional IT server utilization rates remain low.
New µServers are emerging, x86 and ARM.
Further movement from 4 -> 2 -> 1 socket systems as their capabilities expand.
What to do with all the capacity? Software-defined everything…

10 Scaling sockets, power and density
ARM/ATOM: potential to disrupt the perf/$$, perf/Watt model.
Shared infrastructure evolving.
Highest efficiency for power and cooling.
Extending design to facility.
Modularized compute/storage optimization.
2000 nodes, 30 PB storage, 600 kW in 22 m²

11 Which leaves knob 5: get your hands dirty!

Original loop nest (92 seconds):

    DO it=1,noprec
      DO itSub=1,subNoprec
        ix = ir(1,it,itSub)
        iy = ir(2,it,itSub)
        iz = ir(3,it,itSub)
        idx = idr(1,it,itSub)
        idy = idr(2,it,itSub)
        idz = idr(3,it,itSub)
        sum = 0.0
        testx = 0.0
        testy = 0.0
        testz = 0.0
        DO ilz=-lsz,lsz
          irez = iz + ilz
          ! Bounds are tested inside every loop level
          IF (irez.ge.k0z.and.irez.le.klz) THEN
            DO ily=-lsy,lsy
              irey = iy + ily
              IF (irey.ge.k0y.and.irey.le.kly) THEN
                DO ilx=-lsx,lsx
                  irex = ix + ilx
                  IF (irex.ge.k0x.and.irex.le.klx) THEN
                    sum = sum + field(irex,irey,irez) &
                              * diracsx(ilx,idx) &
                              * diracsy(ily,idy) &
                              * diracsz(ilz,idz) * (dx*dy*dz)
                    testx = testx + diracsx(ilx,idx)
                    testy = testy + diracsy(ily,idy)
                    testz = testz + diracsz(ilz,idz)
                  END IF
                END DO
              END IF
            END DO
          END IF
        END DO
        rec(it,itSub) = sum
      END DO
    END DO   ! closing END DO for the "it" loop is missing from the transcript

Optimized loop nest (17 seconds):

    ! Loop order swapped so that "it" is the inner index of the pair
    DO itSub=1,subNoprec
      DO it=1,noprec
        ix = ir(1,it,itSub)
        iy = ir(2,it,itSub)
        iz = ir(3,it,itSub)
        idx = idr(1,it,itSub)
        idy = idr(2,it,itSub)
        idz = idr(3,it,itSub)
        sum = 0.0
        ! Clamp the loop ranges once instead of testing bounds per iteration
        startz = MAX(iz-lsz,k0z)
        starty = MAX(iy-lsy,k0y)
        startx = MAX(ix-lsx,k0x)
        stopz = MIN(iz+lsz,klz)
        stopy = MIN(iy+lsy,kly)
        stopx = MIN(ix+lsx,klx)
        DO irez = startz, stopz
          ilz = irez - iz
          ! Skip planes that contribute nothing
          IF (diracsz(ilz,idz) .EQ. 0.d0) THEN
            CYCLE
          END IF
          ! Hoist loop-invariant products out of the inner loops
          dirac_tmp1 = diracsz(ilz,idz)*(dx*dy*dz)
          DO irey = starty, stopy
            ily = irey - iy
            dirac_tmp2 = diracsy(ily,idy) * dirac_tmp1
            DO irex = startx, stopx
              ilx = irex - ix
              sum = sum + field(irex,irey,irez) &
                        * diracsx(ilx,idx) &
                        * dirac_tmp2
            END DO
          END DO   ! END DO for irey is missing from the transcript
        END DO     ! END DO for irez is missing from the transcript
        rec(it,itSub)=sum
      END DO
    END DO         ! END DO for itSub is missing from the transcript
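One of the transformations on this slide, replacing the per-iteration bounds test with loop limits clamped once via MAX and MIN, can be shown in isolation. The following is a one-dimensional Python sketch under simplifying assumptions: the function names, the 0-based list bounds (standing in for k0x and klx) and the sample data are all hypothetical, and Python timings would not reflect the 92 versus 17 seconds measured for the Fortran code.

```python
def weighted_sum_checked(field, weights, center, half_width):
    """Baseline: visit every offset and test the bounds inside the loop."""
    total = 0.0
    for offset in range(-half_width, half_width + 1):
        i = center + offset
        if 0 <= i < len(field):
            total += field[i] * weights[offset + half_width]
    return total


def weighted_sum_clamped(field, weights, center, half_width):
    """Same sum with the loop range clamped once, so the body has no branch."""
    start = max(center - half_width, 0)
    stop = min(center + half_width, len(field) - 1)
    total = 0.0
    for i in range(start, stop + 1):
        total += field[i] * weights[i - center + half_width]
    return total


if __name__ == "__main__":
    field = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical sample data
    weights = [0.25, 0.5, 0.25]         # hypothetical stencil, half_width = 1
    for c in range(len(field)):
        print(c, weighted_sum_checked(field, weights, c, 1),
              weighted_sum_clamped(field, weights, c, 1))
```

Both versions visit the same in-range elements in the same order, so they return identical results; the clamped form simply removes the conditional from the innermost loop, which tends to help compilers vectorize the equivalent Fortran.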

12 Efficiency optimization also applies across nodes

13 Global HPC Group 13 CHPC conference 2012

