
1 Moore vs. Moore
Rainer Schwemmer, LHCb Computing Workshop 2015

2 Moore vs. Moore's Law
[Plot: trigger decisions/s of the actual Online farm hardware, compared with a fit of average CPU FLOPS since ~1970 and with the FLOPS of the actual Online farm hardware. Tests done with the 2014 farm tender benchmark; dates are the release dates of the CPUs.]

3 Moore vs. Moore's Law
Approximately 80% of the available compute power is currently unused (relative to the 2007 baseline).

4 Moore vs. Moore's Law
Approximately 80% of the available compute power is currently unused (relative to the 2007 baseline).
That is ~1.8 MCHF for the latest batch of farm nodes alone.

5 Moore vs. Moore's Law
Approximately 80% of the available compute power is currently unused (relative to the 2007 baseline).
~1.8 MCHF for the latest batch of farm nodes alone; no power, no cooling, no grid resources.

6 Yes, but …
Why compare to floating point?
– Contrary to common belief, our code does contain a significant amount of floating-point operations (at least, that is what I'll claim until someone proves me wrong …)
– At least for Xeon CPUs there is a direct correlation to Moore's law (the transistor-density one); Moore's law itself is very generic and does not apply to individual CPUs, whereas FLOPS are a direct measure of performance for a particular CPU
– Take a look at what CPU vendors are selling
– Except to the cloud people ("Do you want ARM? Because this is how you get ARM") – this is why ATOM exists

7 What happens if this trend continues?

8 Trigger Decisions per GFLOP
By 2019 we will need to buy 1 GFLOP for 1 trigger decision … where we used to need 0.05 in 2007.
HLT "Moore's law": approximately ×1.24 per year.
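As a rough consistency check (a sketch that simply takes the slide's own figures at face value), compounding the 0.05 GFLOP per decision of 2007 by the quoted ×1.24 per year gives

\[
0.05\ \text{GFLOP} \times 1.24^{\,2019-2007}
  = 0.05 \times 1.24^{12}
  \approx 0.05 \times 13.2
  \approx 0.7\ \text{GFLOP per decision},
\]

i.e. of the order of the 1 GFLOP per decision quoted for 2019 (the growth factor is a fit, so the numbers only agree roughly).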

9 Trigger Decisions per GFLOP
By 2019 we will need to buy 1 GFLOP for 1 trigger decision, and that is at current event complexity (the cost grows roughly with n² in event size).
We paid 5.3 CHF per GFLOP in 2014 (GPUs are much cheaper, by the way); the price will probably fall to 1/3 to 1/4 of that by 2019.
→ Exercise: calculate the cost of the Run 3 farm.
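One possible back-of-envelope answer to that exercise, under loudly labelled assumptions: the input rate of roughly 30 MHz for the Run 3 software trigger is an assumption of this sketch, not a number from the slides; "1 GFLOP per decision" is read as 1 GFLOPS of sustained compute per decision per second; and the growth in event complexity is ignored, so the estimate is if anything optimistic.

\[
\text{price}_{2019} \approx \tfrac{1}{3}\text{–}\tfrac{1}{4} \times 5.3\ \tfrac{\text{CHF}}{\text{GFLOPS}} \approx 1.3\text{–}1.8\ \tfrac{\text{CHF}}{\text{GFLOPS}}
\]
\[
\text{cost} \approx 30\times 10^{6}\ \tfrac{\text{decisions}}{\text{s}} \times 1\ \tfrac{\text{GFLOPS}}{\text{decision/s}} \times 1.3\text{–}1.8\ \tfrac{\text{CHF}}{\text{GFLOPS}} \approx 40\text{–}55\ \text{MCHF}
\]

i.e. of the order of tens of MCHF for compute alone under these assumptions.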

10 What contributes to this decay?
Vectorization
– We are not using it, and a significant part of the available computing-power increase comes via this route (see the data-layout sketch below)
Excessive use of encapsulation
– You cannot write high-performance code without knowing what the other side of an interface (or the hardware) is doing → memory issues (speed and occupancy)
No multi-threading (yes, I know, it's hard)
– Independent program instances are trampling over each other in the caches → more memory issues
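A minimal, self-contained sketch of the vectorization and memory-layout point (illustrative only, not LHCb code; all names are invented for the example): the same reduction over a structure-of-arrays layout streams contiguously through the cache and is a straightforward candidate for auto-vectorization, while the array-of-structures version drags unused fields through the cache with every hit.

#include <cstddef>
#include <cstdio>
#include <vector>

// Array-of-structures: each hit drags its unused fields through the cache,
// and the large stride between successive 'charge' values usually stops the
// compiler from vectorizing the loop.
struct Hit { float x, y, z, charge; int detectorId; };

float sumChargeAoS(const std::vector<Hit>& hits) {
    float sum = 0.f;
    for (const Hit& h : hits) sum += h.charge;
    return sum;
}

// Structure-of-arrays: the charges are contiguous, so the loop below uses
// every byte of every cache line it touches and auto-vectorizes readily
// (e.g. with -O3 on gcc/clang).
struct Hits {
    std::vector<float> x, y, z, charge;
    std::vector<int> detectorId;
};

float sumChargeSoA(const Hits& hits) {
    float sum = 0.f;
    for (std::size_t i = 0; i < hits.charge.size(); ++i) sum += hits.charge[i];
    return sum;
}

int main() {
    std::vector<Hit> aos(1000, Hit{1.f, 2.f, 3.f, 0.5f, 7});
    Hits soa;
    for (const Hit& h : aos) {
        soa.x.push_back(h.x); soa.y.push_back(h.y); soa.z.push_back(h.z);
        soa.charge.push_back(h.charge); soa.detectorId.push_back(h.detectorId);
    }
    std::printf("AoS: %.1f  SoA: %.1f\n", sumChargeAoS(aos), sumChargeSoA(soa));
}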

11 "Memory Issues"
With every generation: cache misses are slightly reduced, but the clock cycles lost to them are still increasing → memory latency is increasing (DDR3 → DDR4, plus contention).
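A small, self-contained sketch of how this latency can be made visible (illustrative only, nothing LHCb-specific): a pointer chase through a random single-cycle permutation defeats both the prefetcher and out-of-order execution, so once the array is much larger than the last-level cache each iteration pays roughly one full DRAM round trip, typically on the order of 100 ns.

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

int main() {
    // 32 M entries * 4 bytes = 128 MB: far larger than any last-level cache.
    const std::size_t n = std::size_t(1) << 25;
    std::vector<std::uint32_t> next(n);
    std::iota(next.begin(), next.end(), 0u);

    // Sattolo's algorithm: build a random permutation that is a single cycle,
    // so the chase below visits every element exactly once.
    std::mt19937 rng{42};
    for (std::size_t i = n - 1; i > 0; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(next[i], next[pick(rng)]);
    }

    // Every load depends on the result of the previous one, so the core
    // cannot overlap or prefetch them: we measure raw memory latency.
    const auto t0 = std::chrono::steady_clock::now();
    std::uint32_t idx = 0;
    for (std::size_t i = 0; i < n; ++i) idx = next[idx];
    const auto t1 = std::chrono::steady_clock::now();

    const double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
    std::printf("average dependent-load latency: %.1f ns (idx=%u)\n",
                ns / n, static_cast<unsigned>(idx));
}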

12 Things that work in our favour
Different detector
– Might be more efficient – how much?
Reject more events earlier in the processing chain than now
– Need on average less compute power per event
Time
– We still have time to make our software more efficient

13 Things working against us
New detector
– Might be less efficient from a computing point of view
More complex events
– ~n² in event size (×2 → ×4 in processing power)
The majority of the computing world buys floating-point operations
– And that's where CPU manufacturers are heading

14 Conclusion
Putting effort into optimization is worth it
– A lot can still be gained by better organization of memory accesses
– Incidentally, this is a must for accelerators, and research done there can be backported
We should try to jump back onto the floating-point bandwagon
– If you think our software is not floating-point sensitive, maybe we should make sure it is
– We would profit a lot more from the improvements in CPUs

15 Discussion

