OCAPIE project
(Ottimizzazione di CAlcolo Parallelo Intensivo Applicate a problematiche in ambito Energetico: Optimization of Intensive Parallel Computing Applied to Energy-related Problems)
W. Borreani (INFN) & G. Ciaccio (DIBRIS-Unige)
CCR Workshop – May, LNGS

Outline
- Project description and objectives
- Codes to be used (some)
- KNL architecture
- KNL Genova Cluster
- Performance tests and scalability

OCAPIE collaboration: P. Saracco, W. Borreani, A. Brunengo, V. Calvelli, G. Ciaccio (Unige), M. Corosu, S. Farinon, G. Lomonaco (Unige), R. Musenich, M. Osipenko, M. Ripani, M. Taiuti (Unige) + PoliTo, PoliMi, ANN (Ansaldo Nucleare spa), ASG Superconductors
Project description and objectives
Financed in 2016 by "Compagnia di San Paolo" (Torino), following a public call and selection.
Amount of the project: 200.000 € (180.000 € financed), used to buy servers and to fund 2 AdR positions (W. Borreani, V. Calvelli), plus travel.
Objectives: "the study and the development of parallel computing technologies useful to the description of multiscale/multiphysics phenomena in the broad field of energy production and distribution: complex systems modelization usually is grounded on a variety of physical models, because of the wide differences in the scales at which different physical phenomena occur."
"The traditional approach to describe such complex systems relies on the sequential use of simulation tools specifically tailored for each subset of the physical problems involved: (…) Realization of a full and reliable model usually requires many iterations of this computational process; then completion of the calculation is computationally expensive and requires large amount of human work…"
The goal is to develop and optimize an intermediate approach: parallelization and (partial) integration of existing, reliable computational/simulation codes, using the newest hardware and software technologies. This should reduce both the computational demands and the time needed for the full R&D process, thanks to the (at least partial) use of computational methodologies optimized for the different physical processes involved, and to removing the need for a validation phase, which has already been carried out over decades for each of the codes.
Codes to be used
- Transport codes: mainly MCNP6 (OpenMP, MPI), but we also plan to use GEANT4, Tripoli, Fluka and Serpent 2 (which integrates thermohydraulics, at least partially). Typical run times for realistic configurations on a single CPU: ∼ month(s).
- OpenFOAM v. 1612+ (MPI) for 3D thermohydraulics modelling. Typical run times for realistic configurations on a single CPU: steady state ∼ day(s), transients ∼ month(s).
- Transport and thermohydraulics must be iterated until convergence (see the sketch after this list); different solvers and calculation meshes. Use of validated codes for industrial design.
- In-house developed code (OpenMP+MPI), dedicated to quench propagation in superconducting devices (see later on). Typical run times for realistic configurations on a single CPU: days or more.
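The iteration between transport and thermohydraulics is, at its core, a fixed-point loop over exchanged fields. Below is a minimal sketch of such a coupling, assuming hypothetical wrapper functions run_transport() and run_thermohydraulics() (the real codes are external executables; the toy feedback formulas are illustrative only, not project code):

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Hypothetical stand-ins for the external codes (e.g. MCNP6, OpenFOAM);
    // a real run would wrap code execution and field exchange on a mesh.
    std::vector<double> run_transport(const std::vector<double>& T) {
        std::vector<double> p(T.size());
        for (size_t i = 0; i < T.size(); ++i)
            p[i] = 1.0 / (1.0 + 1e-3 * T[i]);      // toy temperature feedback on power
        return p;
    }
    std::vector<double> run_thermohydraulics(const std::vector<double>& power) {
        std::vector<double> T(power.size());
        for (size_t i = 0; i < power.size(); ++i)
            T[i] = 300.0 + 100.0 * power[i];       // toy heating law
        return T;
    }

    double max_diff(const std::vector<double>& a, const std::vector<double>& b) {
        double m = 0.0;
        for (size_t i = 0; i < a.size(); ++i)
            m = std::max(m, std::fabs(a[i] - b[i]));
        return m;
    }

    // Picard iteration: alternate the two physics until the temperature field
    // stops changing. Each outer iteration costs one full run of each code,
    // which is what makes the whole process so expensive on a single CPU.
    std::vector<double> couple(std::vector<double> T, double tol, int max_iter) {
        for (int it = 0; it < max_iter; ++it) {
            std::vector<double> power = run_transport(T);          // neutronics
            std::vector<double> T_new = run_thermohydraulics(power); // CFD
            if (max_diff(T, T_new) < tol) return T_new;            // converged
            T = T_new;
        }
        return T;   // best effort after max_iter iterations
    }

    int main() {
        std::vector<double> T0(1000, 300.0);        // initial temperature guess
        std::vector<double> T = couple(T0, 1e-6, 50);
        std::printf("converged T[0] = %f\n", T[0]);
        return 0;
    }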
KNL package
OmniPath is available on some models only.
KNL internal mesh
KNL tile
KNL MCDRAM
KNL clustering modes
MKL @ KNL and Xeon
MKL dcsrgemv() @ KNL and Xeon
MKL @ KNL: hyperthreading
Peak performance is reached with #threads = #cores; hyperthreading brings no improvement. Possible reasons:
- BLAS level 3 (CPU-bound): the code is so well optimized that one thread is enough to saturate a core.
- BLAS level 2 and sparse (memory-bound): 64 threads are enough to saturate the DRAM bandwidth.
But we had no time to investigate this (do we need to?). A thread-scaling measurement of this kind is sketched below.
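A minimal sketch of how such a scaling curve can be measured, assuming MKL is available; the matrix size and the thread counts (up to 4-way hyperthreading on a 64-core KNL) are illustrative:

    #include <mkl.h>
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 4096;                      // assumed problem size
        std::vector<double> A(n * n, 1.0), B(n * n, 1.0), C(n * n, 0.0);
        for (int t : {16, 32, 64, 128, 256}) {
            mkl_set_num_threads(t);              // cap MKL's thread pool
            auto t0 = std::chrono::steady_clock::now();
            cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                        n, n, n, 1.0, A.data(), n, B.data(), n, 0.0, C.data(), n);
            std::chrono::duration<double> dt = std::chrono::steady_clock::now() - t0;
            double gflops = 2.0 * n * n * n / dt.count() / 1e9;
            std::printf("%3d threads: %7.1f GFLOP/s\n", t, gflops);
        }
        return 0;
    }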
KNL: notes
- quad/cache and a2a/cache look like the best modes overall.
- a2a/flat-mcdram is good too, but limited to small jobs (16 GB max), so not quite practical.
- Weird performance of BLAS level 2 and sparse on snc4/cache; maybe because such a NUMA-like configuration requires suitable data placement in memory.
- Sparse matrix × vector: MKL shows a bare 1.3× speedup over a handcrafted kernel written in C (CSR format, Intel compiler) when the size is large (>100M). A sketch of such a handcrafted kernel follows.
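For reference, a handcrafted CSR sparse matrix-vector product of the kind compared against MKL's dcsrgemv() would look roughly like this (a sketch; the OpenMP parallelization and the names are assumptions, not the project's actual code):

    #include <omp.h>

    // y = A*x, with A in CSR format: row_ptr has n+1 entries,
    // col_idx and val have row_ptr[n] entries (the nonzeros).
    void csr_spmv(int n, const int *row_ptr, const int *col_idx,
                  const double *val, const double *x, double *y)
    {
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; ++i) {
            double sum = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
                sum += val[k] * x[col_idx[k]];   // gather from x, memory-bound
            y[i] = sum;
        }
    }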
superconducting magnet
An electromagnet made of windings of a cable containing a superconducting wire inside.
Zero resistance: a huge current (… Ampere) can flow indefinitely, with no power supply during service and no heat dissipation.
Intense magnetic field (of the order of 10 Tesla), not feasible with ordinary conductors.
A cooling system (liquid He) is necessary to keep the superconducting state.
the cable
applications: magnetic resonance
applications: nuclear physics
quench
Loss of superconductivity in a region of the wire. Various causes: a defective wire, mechanical stress, electric malfunction, cooling failure, a swarm of nuclear particles. More likely during the initial charging of the magnet, but also possible during normal operation.
The resistive region dissipates energy as heat; the heat spreads to nearby regions, triggering further losses of superconductivity; energy dissipation increases, more heat is generated... a thermal runaway (a toy illustration follows).
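A toy 1D sketch of the runaway just described (not QUEPS itself): explicit finite-difference heat diffusion along a wire, with Joule heating switched on wherever the local temperature exceeds the critical one. All material constants are illustrative placeholders:

    #include <cstdio>
    #include <vector>

    int main() {
        const int    N   = 1000;   // cells along the wire
        const double dx  = 1e-3;   // cell size [m] (assumed)
        const double dt  = 1e-5;   // time step [s], small enough for stability
        const double D   = 1e-4;   // thermal diffusivity [m^2/s] (assumed)
        const double Tc  = 9.0;    // critical temperature [K], NbTi-like
        const double Top = 4.2;    // operating temperature [K], liquid He
        const double q   = 5e3;    // Joule heating rate in the normal zone [K/s] (assumed)

        std::vector<double> T(N, Top), Tn(N);
        for (int i = 495; i < 505; ++i) T[i] = 12.0;   // initial hot spot

        for (int step = 0; step < 20000; ++step) {
            for (int i = 1; i < N - 1; ++i) {
                double diff  = D * (T[i - 1] - 2 * T[i] + T[i + 1]) / (dx * dx);
                double joule = (T[i] > Tc) ? q : 0.0;  // resistive only above Tc
                Tn[i] = T[i] + dt * (diff + joule);
            }
            Tn[0] = Top; Tn[N - 1] = Top;              // cold boundaries
            T.swap(Tn);
            if (step % 5000 == 0) {                    // watch the normal zone grow
                int normal = 0;
                for (double Ti : T) if (Ti > Tc) ++normal;
                std::printf("t=%.2f s  normal-zone cells: %d\n", step * dt, normal);
            }
        }
        return 0;
    }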
quench
Superconducting magnets are expensive toys... safety devices must detect and absorb the quench before permanent damage occurs. Correctly dimensioning the safety devices requires a model of the quench behaviour in space and time; hence the possible use of a simulator.
quench simulators
Some quench simulators are available, but:
- simplified model of the cable (single material);
- source code not available, so impossible to customize;
- slow execution; acceptable run times are achieved at the expense of accuracy, rather than through HPC techniques.
So there is room for improvement, especially with parallel computing!
QUEPS: QUEnch Parallel Simulator
A new quench simulator, written from scratch; work in progress. Start simple, but with a look ahead:
- born as a parallel program: C++, MPI+OpenMP
- multiple-material cable
- solenoids only, no iron (for now)
- cooling: adiabatic, convection
- discretization: finite volume
- solver: BiCGSTAB + preconditioner (a minimal solver sketch follows)
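A minimal, unpreconditioned and serial BiCGSTAB sketch, to illustrate the solver named on this slide (the actual QUEPS solver is preconditioned and parallel). It reuses the csr_spmv() kernel sketched earlier, which must be linked in:

    #include <cmath>
    #include <vector>

    void csr_spmv(int, const int*, const int*, const double*,
                  const double*, double*);   // kernel sketched earlier

    static double dot(const std::vector<double>& a, const std::vector<double>& b) {
        double s = 0.0;
        for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
        return s;
    }

    // Solves A x = b (initial guess x = 0), A given in CSR format.
    // Returns the iteration count on convergence, -1 otherwise.
    int bicgstab(int n, const int* row_ptr, const int* col_idx, const double* val,
                 const std::vector<double>& b, std::vector<double>& x,
                 double tol = 1e-8, int max_iter = 1000)
    {
        std::vector<double> r = b, rhat = b, p(n, 0.0), v(n, 0.0), s(n), t(n);
        x.assign(n, 0.0);
        double rho = 1.0, alpha = 1.0, omega = 1.0;
        const double bnorm = std::sqrt(dot(b, b));

        for (int it = 1; it <= max_iter; ++it) {
            double rho_new = dot(rhat, r);
            double beta = (rho_new / rho) * (alpha / omega);
            for (int i = 0; i < n; ++i) p[i] = r[i] + beta * (p[i] - omega * v[i]);
            csr_spmv(n, row_ptr, col_idx, val, p.data(), v.data());
            alpha = rho_new / dot(rhat, v);
            for (int i = 0; i < n; ++i) s[i] = r[i] - alpha * v[i];
            csr_spmv(n, row_ptr, col_idx, val, s.data(), t.data());
            omega = dot(t, s) / dot(t, t);
            for (int i = 0; i < n; ++i) x[i] += alpha * p[i] + omega * s[i];
            for (int i = 0; i < n; ++i) r[i] = s[i] - omega * t[i];
            if (std::sqrt(dot(r, r)) < tol * bnorm) return it;   // converged
            rho = rho_new;
        }
        return -1;   // not converged within max_iter
    }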
QUEPS @ KNL and Xeon
Test with a solenoid from the literature:
- 30 layers, 24 turns per layer
- thin cable: NbTi + Cu + outer insulation
- adiabatic cooling
- simulated time: 1.2 s (quench inception at t = 0)
- mesh: 3.8 million cells
Tests on a single KNL node (OpenMP, MPI, hybrid; a minimal hybrid setup is sketched below), compared with a dual Xeon E5 node, 2 × 8 cores @ 2.6 GHz.
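For context, a minimal sketch of the hybrid MPI+OpenMP setup used in tests like these; the funneled threading level is an illustrative assumption:

    #include <mpi.h>
    #include <omp.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        int provided, rank;
        // Ask MPI for a threading level where only the main thread calls MPI.
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel
        {
            // Each MPI rank spawns its own OpenMP team over its local cells.
            #pragma omp single
            std::printf("rank %d: %d threads\n", rank, omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }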
QUEPS @ KNL and Xeon: MPI
MCDRAM configured in cache mode; execution times in minutes.

KNL:        64 processes   128 processes   256 processes
a2a         125            81              86
quad        134            80.5            85
snc4        125.5          78.5            82

Xeon:       16 processes   32 processes (hyperthreading)
dual Xeon   59.5           77
QUEPS @ KNL and Xeon: OpenMP
MCDRAM configured in cache mode; execution times in minutes.

KNL:        64 threads   128 threads   256 threads
a2a         55.5         39            35.5
quad        53           34            –
snc4        58           32            –

Xeon:       16 threads   32 threads (hyperthreading)
dual Xeon   65           –
OpenFOAM tests
OpenFOAM (MPI) single node tests: icoFoam. (A sketch of the parallel decomposition setup follows.)
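OpenFOAM runs like these are distributed over MPI ranks by decomposing the mesh; the settings live in system/decomposeParDict. A minimal sketch, where the subdomain count (one rank per KNL core) and the choice of the scotch partitioner are assumptions, not the settings actually used in these tests:

    // system/decomposeParDict (illustrative)
    FoamFile
    {
        version     2.0;
        format      ascii;
        class       dictionary;
        object      decomposeParDict;
    }

    numberOfSubdomains  64;      // one MPI rank per KNL core (assumed)
    method              scotch;  // graph partitioner; no manual layout needed

The case is then split with decomposePar and run in parallel, e.g. with mpirun -np 64 icoFoam -parallel.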
OpenFOAM tests
OpenFOAM (MPI) single node tests: pimpleFoam [transient, incompressible, LES], 4M cells.
MCNP tests
MCNP (MPI+OMP) single node tests: OMP only, and MPI only.
MCNP tests
MCNP (MPI+OMP) 2-node tests. With 2× hyperthreading the speed-up is about 1.54. Better performance is obtained with an optimal ratio of MPI processes to OMP threads.