A Skyrme QRPA code for deformed nuclei
J. Engel and J.T., Univ. of North Carolina at Chapel Hill
We now have a working and tested code. We are speeding it up prior to starting systematic calculations.
Assumed symmetry: parity and axial symmetry (2D).
HFB ground state: time-reversal invariant.
Volume-type pairing interaction used for tests.
Density-dependent pairing also implemented.
What are the aspects of our science that require high-performance computing?
All our science requires high-performance computing, e.g.:
Calculating energies and transition strengths of excited states in many nuclei to test the predictive power of energy functionals.
Calculating beta decay for the r-process and understanding resonances near the neutron drip line.
What has been done since the last Pack Forest meeting:
1. Integration with coordinate-space HFB code
2. Parallelization
3. Test of separation of translational spurious component
4. Improvement of efficiency (still in progress, nearing completion)
5. Distribution of spherical QRPA code to Livermore group for reaction studies
What are the main accomplishments?
We expect an efficiency-improved version to be completed in a month. We should be able to begin systematic calculations shortly thereafter.
Integration with coordinate-space HFB code (previous version of code used HFBTHO)
This enables an efficient description of nuclei near the drip lines and makes separating the translational spurious state easier than in the harmonic-oscillator basis.
Symmetry of the wave-function space is important.
Parallelization
a. Speeding up
   Division of interaction matrix elements among individual processors
   ScaLAPACK diagonalization
b. Memory management
   Distribution of quasiparticle wave functions
   Distribution of Hamiltonian matrix
Our QRPA code can handle an arbitrarily large basis as long as more than a few processors are available.
Parallelization
[Diagram: the quasiparticle wave functions are read by processor 1, processor 2, …, processor 4. The interaction matrix elements V_1, …, V_n, each involving 4 qp states a, b, c, and d, are divided among processors 1 to 4.]
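The division of matrix elements among processors shown in the diagram can be sketched as follows. This is a minimal illustration, not the authors' code: the round-robin assignment, the helper name `assign_matrix_elements`, and the toy sizes are assumptions made for the example.

```python
def assign_matrix_elements(n_elements, n_procs):
    """Round-robin split of matrix-element indices among processors.

    Hypothetical helper: element i goes to processor i % n_procs, so the
    number of elements per processor differs by at most one.
    """
    return [list(range(rank, n_elements, n_procs)) for rank in range(n_procs)]


# Toy example (assumed sizes): 10 matrix elements shared by 4 processors.
work = assign_matrix_elements(10, 4)
for rank, indices in enumerate(work):
    print(f"processor {rank + 1}: matrix elements {indices}")
```

Each processor then only needs the quasiparticle wave functions that enter its own matrix elements, which is what allows the wave functions and the Hamiltonian matrix to be distributed as well.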
Separation of spurious translational mode
[Figure: 26Mg. Absolute value of transition matrix elements (e fm^3) between the ground state and K^π = 0^- states, without and with the correction operator. The energies of the two impulses are slightly shifted horizontally.]
A correction operator subtracts translational spurious components from the matrix elements.
Good test of code.
Higher-energy region
[Figure: 26Mg. Transition matrix elements (e fm^3) without and with the correction operator.]
Computational Issues (before speedup)
26Mg: 4×10^8 interaction matrix elements (size of H = … by 40000).
0.72 processor sec per matrix element (me).
In a test on one processor, a typical matrix element takes 0.36 sec, so about half the time is not spent in computation.
https://www.nersc.gov/nusers/status/jobs/?hostname=franklin
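The figures on this slide imply a total cost that can be worked out directly. The short calculation below is an illustration only; it assumes that 0.72 processor-seconds is the average cost per matrix element in the parallel run and that 0.36 seconds is the pure computation time per element.

```python
n_matrix_elements = 4e8   # number of interaction matrix elements (from the slide)
sec_per_element = 0.72    # processor-seconds per matrix element in the parallel run
sec_compute_only = 0.36   # seconds per matrix element in the one-processor test

total_proc_seconds = n_matrix_elements * sec_per_element
total_proc_hours = total_proc_seconds / 3600.0
compute_fraction = sec_compute_only / sec_per_element

print(f"total cost: {total_proc_hours:.0f} processor-hours")
print(f"fraction of time spent computing: {compute_fraction:.2f}")
```

Under these assumptions the matrix-element calculation costs about 80,000 processor-hours, of which roughly half is overhead rather than computation.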
Improvement of efficiency of algorithm
1. B-spline + Gauss-Legendre → factor of 2 speedup
First version:
   100x100 mesh for wave functions
   4-point formula for differentiation
   integration by Simpson rule
New version:
   70x70 B-splines for wave functions
   analytical formula of B-spline for differentiation
   integration by Gauss-Legendre using 70x70 mesh
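The switch from Simpson to Gauss-Legendre integration can be illustrated with a small one-dimensional comparison. This is a sketch under stated assumptions, not the code's actual integrals: the integrand r^2 exp(-r^2), the interval [0, 10], and the point counts are chosen only for illustration.

```python
import numpy as np


def simpson(f, a, b, n):
    """Composite Simpson rule on n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    return h / 3.0 * (y[0] + y[-1] + 4.0 * y[1:-1:2].sum() + 2.0 * y[2:-1:2].sum())


def gauss_legendre(f, a, b, n):
    """n-point Gauss-Legendre rule mapped from [-1, 1] to [a, b]."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    x = 0.5 * (b - a) * nodes + 0.5 * (b + a)
    return 0.5 * (b - a) * np.dot(weights, f(x))


def toy_integrand(r):
    """Smooth, localized toy function (an assumption, not from the slides)."""
    return r**2 * np.exp(-r**2)


# Exact integral over [0, inf) is sqrt(pi)/4; the tail beyond r = 10 is negligible.
exact = np.sqrt(np.pi) / 4.0

err_simpson = abs(simpson(toy_integrand, 0.0, 10.0, 20) - exact)
err_gl = abs(gauss_legendre(toy_integrand, 0.0, 10.0, 20) - exact)
print(f"Simpson, 21 points:        error = {err_simpson:.2e}")
print(f"Gauss-Legendre, 20 points: error = {err_gl:.2e}")
```

An n-point Gauss-Legendre rule integrates polynomials up to degree 2n-1 exactly, whereas Simpson's rule is exact only up to degree 3, so for smooth integrands a coarser mesh can reach the same accuracy.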
Improvement of efficiency of algorithm
2. Canonical basis (being implemented) → factor of 4 speedup
Definition: eigenfunctions ψ(r) of the density matrix; eigenvalue = v^2, the occupation probability.
The conversion of the QRPA formulation from the quasiparticle basis to the canonical-BCS basis is obtained by [equation]
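The definition of the canonical basis can be illustrated numerically: diagonalizing the density matrix gives the canonical states as eigenvectors and the occupation probabilities v^2 as eigenvalues. The sketch below uses a small made-up density matrix; the matrix size, the occupation values, and the random orthogonal transformation are assumptions for illustration and do not come from the QRPA code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy density matrix (assumption): rho = Q diag(v^2) Q^T, built from a random
# orthogonal matrix Q and chosen occupation probabilities v^2 in [0, 1].
v2_true = np.array([0.98, 0.85, 0.40, 0.05, 0.01])
q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
rho = q @ np.diag(v2_true) @ q.T

# Canonical basis: eigenvectors of rho; the eigenvalues are the occupations v^2.
v2, canonical_states = np.linalg.eigh(rho)

# Sort by decreasing occupation for readability.
order = np.argsort(v2)[::-1]
v2 = v2[order]
canonical_states = canonical_states[:, order]

print("occupation probabilities v^2:", np.round(v2, 6))
```

In a real calculation the density matrix would come from the HFB ground state on the coordinate-space mesh rather than from a constructed example.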
The 4 integrals are all closely related in the canonical basis, so only one integral needs to be computed.
More on speedup
1. B-spline + Gauss-Legendre → factor of 2
2. Canonical basis (being implemented) → factor of 4
Total: factor of 8
Further speedup will come (we hope) from ADLB.
Expected: reduction of dead time and of the number of communications between processors.
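The expected reduction of dead time from dynamic load balancing can be illustrated with a toy scheduling simulation. This sketch does not use ADLB or its API; the task costs, the worker count, and both scheduling functions are hypothetical and only contrast a fixed assignment with a work-queue model in which the next idle worker takes the next task.

```python
import heapq
import random


def static_schedule(task_times, n_workers):
    """Fixed round-robin assignment; returns the finishing time of each worker."""
    finish = [0.0] * n_workers
    for i, t in enumerate(task_times):
        finish[i % n_workers] += t
    return finish


def dynamic_schedule(task_times, n_workers):
    """Each task goes to whichever worker becomes idle first (work-queue model)."""
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    for t in task_times:
        busy_until, w = heapq.heappop(heap)
        heapq.heappush(heap, (busy_until + t, w))
    return sorted(busy_until for busy_until, _ in heap)


def dead_time(finish):
    """Total idle time spent waiting for the slowest worker to finish."""
    return sum(max(finish) - f for f in finish)


random.seed(1)
# Hypothetical task costs: most matrix elements are cheap, a few are expensive.
tasks = [random.choice([0.1, 0.1, 0.1, 2.0]) for _ in range(400)]

for name, schedule in (("static", static_schedule), ("dynamic", dynamic_schedule)):
    finish = schedule(tasks, 4)
    print(f"{name:8s} makespan = {max(finish):7.2f}  dead time = {dead_time(finish):7.2f}")
```

In the work-queue model no worker finishes earlier than one task length before the last one, so the total dead time is bounded by (number of workers - 1) times the longest task, independent of how unevenly the task costs are distributed.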
For a K^π = 1^- calculation of A ≈ 30, the estimated charge is 5,000 Mpp·h (without ADLB, on franklin.nersc.gov).
We need more resources to calculate lots of nuclei.
What nuclei should be chosen for testing energy functionals if resources are limited?
Access to resources? INCITE? Other options?
Summary
What has been done since the last Pack Forest meeting:
1. Integration with coordinate-space HFB code
2. Parallelization
3. Test of separation of translational spurious component
4. Two improvements of efficiency
5. Distribution of spherical QRPA code to Livermore group
6. Comparison with J-scheme calculation for a spherical nucleus
7. A few other K^π calculations
The code is very close to completion, though we'd hoped to be done by now.