Published by Charles Nutt. Modified over 2 years ago.

IIAA

GPMAD: A beam dynamics code using Graphics Processing Units

H. Rafique, S. Alexander, R. Appleby, H. Owen

GPMAD (GPU Processed Methodical Accelerator Design) utilises Graphics Processing Units (GPUs) to perform beam dynamics simulation. Understanding modern particle accelerators requires simulating the transport of charged particles through the machine elements. These simulations can be time consuming because many-particle transport is computationally expensive. Modern GPUs can run such simulations with a significant increase in performance at an affordable price; here the NVidia CUDA architecture is used. The speed gains of GPMAD have been documented in [1]. Building on this, GPMAD was upgraded and a space charge algorithm was included. GPMAD is benchmarked against MAD-X [2] and ASTRA [3], using test cases of the DIAMOND Booster-to-Storage lattice at RAL and the ALICE transfer line at Daresbury. It is found that particle transport and space charge calculations are suitable for the GPU and that large performance increases are possible in both cases.

Particles are treated as 6-vectors in phase space:

x = transverse horizontal position
p_x = transverse horizontal momentum
y = transverse vertical position
p_y = transverse vertical momentum
τ = time of flight relative to ideal reference particle
p_t = ΔE/(p_s c)

where

ΔE = energy relative to ideal reference particle
p_s = nominal momentum of an on-energy particle

GPMAD uses TRANSPORT [4] maps, as used in MAD-X, to transport particles through magnetic elements. The full Taylor expansion is truncated at second order. First-order (R) terms are represented by 6x6 matrices. This method assumes that particles do not interact with each other, which holds in the ultra-relativistic limit.
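As a rough illustration of the first-order part of such a map, the sketch below applies a 6x6 R matrix for a drift to a small bunch. The function names, the sample values and the choice to drop the longitudinal drift term (taken as negligible in the ultra-relativistic limit) are assumptions made for illustration, not GPMAD's own code, and the second-order terms are omitted.

```python
# Sketch only: first-order (R matrix) transport of 6-vectors through a drift.
# Coordinate order follows the text: (x, p_x, y, p_y, tau, p_t).
# All names here are hypothetical; second-order terms are not included.

def drift_matrix(length):
    """6x6 first-order drift map, ultra-relativistic limit assumed."""
    R = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    R[0][1] = length  # x -> x + L * p_x
    R[2][3] = length  # y -> y + L * p_y
    return R

def apply_map(R, particle):
    """Multiply one 6-vector by the 6x6 matrix R."""
    return [sum(R[i][j] * particle[j] for j in range(6)) for i in range(6)]

def transport_bunch(R, bunch):
    """Apply the same map to every particle; each particle is independent."""
    return [apply_map(R, p) for p in bunch]

# Two sample particles (illustrative values only).
bunch = [
    [1.0e-3, 2.0e-4, 0.0, -1.0e-4, 0.0, 0.0],
    [-5.0e-4, 0.0, 2.0e-3, 3.0e-4, 0.0, 1.0e-3],
]
transported = transport_bunch(drift_matrix(2.0), bunch)
```

Because every particle is multiplied by the same matrix and no particle depends on another, the loop in `transport_bunch` is the part that would map naturally onto one GPU thread per particle.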
When operating at ultra-relativistic energies, space charge forces may be omitted. In this case GPMAD performs the complete transport of all particles in a single Kernel function, thus minimising the time taken for memory copies to and from the GPU. When operating at lower energies (of the order of the particle mass) the space charge algorithm may be included. In this case three Kernel functions are launched per magnetic element, as shown in the flow diagram:

Copy from host to device
Loop over magnetic elements:
    Half Matrix Kernel
    Space Charge Kernel
    Half Matrix Kernel
Copy from device to host

Particle transport and space charge effects are performed on the GPU (device) via ‘Kernel functions’; the remaining part of the code operates on the CPU (host). In order to handle memory efficiently, two sets of particle data are used on the GPU, denoted by the superscripts {1} and {2}. The poster illustrates the transport of N particles through a simple drift element followed by a quadrupole element, where the superscripts 0, 1 and 2 denote the initial (0) particle data and the particle data after transport through 1 or 2 magnetic elements.

Figure 1: TWISS parameters for the DIAMOND BTS, GPMAD compared to MAD-X (no space charge)
Figure 2: Run times for the DIAMOND BTS, GPMAD compared to MAD-X (no space charge)
Figure 3: Stability under magnetic element splitting, GPMAD with space charge for a 1 m quadrupole
Figure 4: Transverse beam emittance for the ALICE transfer line, GPMAD compared to ASTRA (with space charge)

Without space charge, the TWISS parameters (optical parameters that characterise the bunch of particles) are identical to those from MAD-X; in fact the raw particle data are identical to 10 significant figures. Figure 2 compares the performance of GPMAD to that of MAD-X: GPMAD offers an accurate particle tracking code with a considerable improvement in performance over the MAD-X tracking algorithm.
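The three-kernel sequence per magnetic element described above, together with the two sets of particle data, can be sketched on the host side as follows. This is a minimal sketch under stated assumptions: plain Python functions stand in for the GPU kernels, every element is treated as a drift, only the four transverse coordinates are kept, and the linear kick is a placeholder rather than GPMAD's space charge model. All names are hypothetical.

```python
# Sketch only: half map, space charge kick, half map for each element,
# alternating between two particle buffers (the {1} and {2} data sets).
# Particles are (x, p_x, y, p_y) tuples; the kick model is a placeholder.

def half_drift_kernel(src, dst, length):
    """Stand-in for the half matrix kernel: drift through half the element."""
    half = 0.5 * length
    for i, (x, px, y, py) in enumerate(src):
        dst[i] = (x + half * px, px, y + half * py, py)

def space_charge_kernel(src, dst, strength, length):
    """Placeholder kick: momentum change proportional to transverse offset."""
    for i, (x, px, y, py) in enumerate(src):
        dst[i] = (x, px + strength * length * x, y, py + strength * length * y)

def track(bunch, element_lengths, strength):
    buf1 = list(bunch)          # stands in for the copy from host to device
    buf2 = [None] * len(bunch)  # second particle data set
    for length in element_lengths:
        half_drift_kernel(buf1, buf2, length)
        space_charge_kernel(buf2, buf1, strength, length)
        half_drift_kernel(buf1, buf2, length)
        buf1, buf2 = buf2, buf1  # newest data is always in buf1
    return list(buf1)           # stands in for the copy from device to host

# One particle starting off-axis, one element of length 2 (illustrative values).
result = track([(1.0, 0.0, 0.0, 0.0)], [2.0], 0.1)
# With zero kick strength, two elements reduce to a drift of their total length.
no_kick = track([(1.0e-3, 1.0e-4, 0.0, 0.0)], [1.0, 2.0], 0.0)
```

Each kernel reads one buffer and writes the other, so no kernel overwrites data it is still reading, and the particle data stay on the device until the loop over elements has finished.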
Figure 3 shows that GPMAD’s space charge algorithm is stable under magnetic element splitting: here a 1 metre quadrupole element is split into 10, 20 and 100 parts. We can infer from Figure 4 that GPMAD’s space charge algorithm gives emittance growth behaviour similar to that of ASTRA for identical initial particle distributions. Figure 5 illustrates the performance benefits of GPMAD over ASTRA; this is even more apparent in Figure 6, where a logarithmic scale shows that GPMAD is around 100 times faster than ASTRA.

The performance gains of GPMAD scale with the GPU that is used; newer models offer more processors and thus better performance. GPMAD is a proof of principle: for parallel problems such as many-particle transport, the GPU offers an affordable and mobile solution. Here we have implemented an algorithm which exploits the parallel nature of the GPU, and in doing so offers performance comparable with HPC at a substantial monetary saving.

[1] M. D. Salt, R. B. Appleby, D. S. Bailey, Beam Dynamics using Graphical Processing Units, EPAC08, TUPP085
[2] Methodical Accelerator Design, mad.web.cern.ch/mad/
[3] A Space charge Tracking Algorithm, www.desy.de/~mpyflo/
[4] K. Brown, A First- and Second-Order Matrix Theory for the Design of Beam Transport Systems and Charged Particle Spectrometers, SLAC-75

CPU

The CPU operates sequentially. To transport N particles through a single magnetic element, each particle is transported one after the other. The time taken scales like ~ N^2. In reality N is large, so simulating this requires either a long run time or expensive hardware (HPC).

GPU

The GPU is a parallel processor. Using the Single Instruction Multiple Data (SIMD) framework, N particles can be transported through the same magnetic element simultaneously. The time taken scales like ~ N. This allows parallel problems such as particle tracking to be performed quickly and inexpensively.
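As a baseline for the element-splitting test of Figure 3, it may help to note that without space charge the linear map of a quadrupole composes exactly: the product of n maps of length L/n equals the map of length L, so any dependence on the number of slices comes from the space charge step. The sketch below checks this for the 2x2 horizontal block of a focusing quadrupole. The strength value and all function names are assumptions chosen for illustration and are not taken from GPMAD.

```python
# Sketch only: a 1 m focusing quadrupole split into 10, 20 and 100 slices,
# linear optics without space charge. The strength k is illustrative.
import math

def quad_matrix(k, length):
    """2x2 horizontal map of a focusing quadrupole with strength k > 0."""
    w = math.sqrt(k)
    c, s = math.cos(w * length), math.sin(w * length)
    return [[c, s / w], [-w * s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def split_map(k, length, n):
    """Map of the same quadrupole built from n equal slices."""
    slice_map = quad_matrix(k, length / n)
    total = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        total = matmul(slice_map, total)
    return total

full = quad_matrix(1.5, 1.0)
deviations = {}
for n in (10, 20, 100):
    m = split_map(1.5, 1.0, n)
    # Largest element-wise difference between split and unsplit maps.
    deviations[n] = max(abs(m[i][j] - full[i][j])
                        for i in range(2) for j in range(2))
```

The entries of `deviations` are expected to sit at the level of floating-point round-off, which gives a reference point when judging how much variation a space charge step introduces under splitting.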
Diagram labels: Particle; Magnetic Element; Bunch of Particles

Figure 5: Run times for the ALICE Transfer Line, GPMAD compared to ASTRA (with space charge)
Figure 6: Run times for the ALICE Transfer Line, GPMAD compared to ASTRA (with space charge)
