Hartree-Fock Benchmark
John Feo
Center for Adaptive Supercomputing Software

Self-Consistent Field Method
Obtain variational solutions to the electronic Schrödinger equation by expressing the system's one-electron orbitals as the dot product of the system's eigenvectors and some set of Gaussian basis functions.
[Equations not recovered. Labeled terms: wavefunction, energy, eigenvector, Gaussian basis function]

… reduces to
The solution to the electronic Schrödinger equation reduces to the self-consistent eigenvalue problem.
[Equation not recovered. Labeled terms: Fock matrix, eigenvectors, density matrix, overlap matrix, eigenvalues. Labeled contributions: one-electron forces, two-electron Coulomb forces, two-electron exchange forces]
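The equation on this slide did not survive extraction. The labels are consistent with the standard Roothaan-Hall form of the problem, so the following is offered as a likely reading rather than a transcription. The symbols are assumptions chosen to match the deck's array names: F is the Fock matrix (g_fock), C the eigenvectors (g_orbs), D the density matrix (g_dens), S the overlap matrix, ε the eigenvalues, h the one-electron term, and (ij|kl) the two-electron integral g(i, j, k, l).

```latex
F(D)\,C = S\,C\,\varepsilon,
\qquad
F_{ij} = h_{ij}
       + \sum_{k,l} D_{kl}\,(ij|kl)
       - \tfrac{1}{2}\sum_{k,l} D_{kl}\,(ik|jl),
\qquad
D_{ij} = 2\sum_{k=1}^{n_{\mathrm{occ}}} C_{ik}\,C_{jk}
```

The problem is self-consistent because F depends on D, and D is built from the eigenvectors C of F. The first sum is the Coulomb term and the second the exchange term; the expression for D is the same arithmetic as the makden loop shown later in the deck.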

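Logical flow
[Flowchart, reconstructed from its box labels]
Inputs: Initial Density Matrix, Initial Orbital Matrix, Schwarz Matrix
1. Construct Fock Matrix: a) one electron, b) two electron
2. Compute Orbitals
3. Compute Density Matrix
4. Damp Density Matrix
5. Test convergence: if not converged, return to step 1; if converged, Output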

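Modules
[Same flowchart, with each box replaced by its module name]
Inputs: denges, makeob, makesz
1. Construct Fock Matrix: a) oneel, b) twoel
2. diagon
3. makden: a) damp, b) dendif
4. Test convergence: if not converged, return to step 1; if converged, Output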

Matrix flow
[Diagram: arrays passed between Compute Fock Matrix (oneel, twoel), diagon, makden, dendif, Damp Density Matrix, and Output]
Arrays shown: g_schwarz, h, g, g_fock, g_orbs, g_dens, g_work, eigv
Legend: constant, current iteration, previous iteration

Memory footprint
Ten 2-D arrays of size nbfn x nbfn
Two 1-D arrays of size nbfn
h is size (nbfn)^2
g is size (nbfn)^4
Direct method: compute g each time
Conventional method: compute g once and store
nbfn = number of basis functions = 15 * number of atoms
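To put numbers on these sizes, here is a small sketch of the arithmetic. The function names are hypothetical, and it assumes every element is a double (8 bytes on most platforms) and that g is stored in full with no symmetry savings; the deck states neither.

```c
#include <assert.h>
#include <stddef.h>

/* Number of basis functions, using the slide's rule of 15 per atom. */
size_t scf_nbfn(size_t natoms) {
    assert(natoms > 0);
    return 15 * natoms;
}

/* Bytes for the ten nbfn x nbfn arrays plus the two length-nbfn arrays.
 * Assumption: every element is a double. */
double scf_matrix_bytes(size_t natoms) {
    double n = (double)scf_nbfn(natoms);
    return (10.0 * n * n + 2.0 * n) * (double)sizeof(double);
}

/* Bytes needed to store all nbfn^4 two-electron integrals, as the
 * conventional method would without exploiting symmetry. */
double scf_g_bytes(size_t natoms) {
    double n = (double)scf_nbfn(natoms);
    return n * n * n * n * (double)sizeof(double);
}
```

For 4 atoms (nbfn = 60), the matrices need about 0.29 MB while a fully stored g needs about 104 MB. The quartic growth of g is what makes the choice between the direct and conventional methods matter.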

Typical loop structures
oneel (the middle of this loop was lost in extraction):
    for (i = 0; i … tol) ? h(i,j) : 0.0; } }
makden:
    for (i = 0; i < nbfn; i++) {
        for (j = 0; j < nbfn; j++) {
            double p = 0.0;
            for (k = 0; k < nocc; k++)
                p += g_orbs[i][k] * g_orbs[j][k];
            g_work[i][j] = 2.0 * p;
        }
    }
dendif (the middle of this loop was lost in extraction):
    for (i = 0; i … denmax) denmax = xdiff; } }
nocc = 2 * number of atoms
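Because the dendif loop is incomplete in this transcript, the sketch below shows one plausible reading alongside makden. The makden arithmetic follows the slide. The dendif body is an assumption: it takes xdiff to be the absolute element-wise difference between the previous density and the newly built one, which matches the surviving `denmax = xdiff` fragment but is not confirmed by the source. The function names and the flat row-major array layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* makden as on the slide: work[i][j] = 2 * sum over occupied k of
 * orbs[i][k] * orbs[j][k]. Arrays are n x n, stored row-major. */
void makden_sketch(size_t n, size_t nocc, const double *orbs, double *work) {
    assert(nocc <= n);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double p = 0.0;
            for (size_t k = 0; k < nocc; k++)
                p += orbs[i * n + k] * orbs[j * n + k];
            work[i * n + j] = 2.0 * p;
        }
    }
}

/* ASSUMED dendif: largest absolute element-wise change between the
 * previous density (dens) and the new one (work). */
double dendif_sketch(size_t n, const double *dens, const double *work) {
    double denmax = 0.0;
    for (size_t idx = 0; idx < n * n; idx++) {
        double d = dens[idx] - work[idx];
        double xdiff = (d < 0.0) ? -d : d;
        if (xdiff > denmax)
            denmax = xdiff;
    }
    return denmax;
}
```

Under this reading, the convergence test in the flow compares the value returned by dendif_sketch against a tolerance.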

The BIG loop
twoel:
    for (l = 0; l < nbfn; l++) {
        for (k = 0; k < nbfn; k++) {
            for (j = 0; j < nbfn; j++) {
                for (i = 0; i < nbfn; i++) {
                    if (g_schwarz[i][j] * schwmax < tol2e) { icut1++; continue; }
                    if (g_schwarz[i][j] * g_schwarz[k][l] < tol2e) { icut2++; continue; }
                    icut3++;
                    double gg = g(i, j, k, l);
                    g_fock[i][j] += gg * g_dens[k][l];          /* Coulomb */
                    g_fock[i][k] -= 0.5 * gg * g_dens[j][l];    /* exchange */
                }
            }
        }
    }
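The screening logic in this loop can be exercised without a real integral routine. The sketch below is a toy under stated assumptions: `pair_weight` and `g_toy` are hypothetical stand-ins for the Schwarz factors and for g, chosen in product form so that the Schwarz bound |(ij|kl)| <= sqrt((ij|ij)) * sqrt((kl|kl)) holds with equality. The basis size N and the counter struct are also illustrative. The Coulomb update is written as a product with the density element, the standard form.

```c
#include <assert.h>
#include <stddef.h>

#define N 6  /* toy basis size (hypothetical) */

typedef struct { long icut1, icut2, icut3; } cut_counts;

/* Hypothetical pair factor that decays as i and j move apart. */
static double pair_weight(int i, int j) {
    double d = (double)(i - j);
    return 1.0 / (1.0 + d * d * d * d);
}

/* Hypothetical stand-in for g(i,j,k,l); NOT a real two-electron integral. */
static double g_toy(int i, int j, int k, int l) {
    return pair_weight(i, j) * pair_weight(k, l);
}

/* Screened two-electron accumulation, same control flow as the slide. */
cut_counts twoel_sketch(double fock[N][N], double dens[N][N], double tol2e) {
    double schwarz[N][N];
    double schwmax = 0.0;
    cut_counts c = {0, 0, 0};
    assert(tol2e >= 0.0);

    /* For this toy, sqrt(g(i,j,i,j)) is exactly pair_weight(i,j). */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            schwarz[i][j] = pair_weight(i, j);
            if (schwarz[i][j] > schwmax) schwmax = schwarz[i][j];
        }

    for (int l = 0; l < N; l++)
        for (int k = 0; k < N; k++)
            for (int j = 0; j < N; j++)
                for (int i = 0; i < N; i++) {
                    /* (i,j) is negligible against every possible (k,l). */
                    if (schwarz[i][j] * schwmax < tol2e) { c.icut1++; continue; }
                    /* This particular quartet is negligible. */
                    if (schwarz[i][j] * schwarz[k][l] < tol2e) { c.icut2++; continue; }
                    c.icut3++;
                    double gg = g_toy(i, j, k, l);
                    fock[i][j] += gg * dens[k][l];        /* Coulomb */
                    fock[i][k] -= 0.5 * gg * dens[j][l];  /* exchange */
                }
    return c;
}
```

The three counters always sum to N^4, so their split shows how much work the screening removes. With a tolerance of zero nothing is skipped and every quartet reaches the integral evaluation.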

Optimizations
Tiling to improve locality and data reuse, e.g. twoel
Module collapse to increase thread size and reduce memory operations, e.g. makden and dendif
Inline g and hoist loop-invariant subexpressions
Exploit 8-way symmetry of g: precompute and save
Conventional method (compute h and g once and store)
  - Probably enough memory
  - Probably not enough latency tolerance or power
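To illustrate what the 8-way symmetry buys, the sketch below counts the index quartets that remain once symmetric duplicates are dropped. It assumes the usual symmetries of real-valued two-electron integrals, namely that g(i,j,k,l) is unchanged by swapping i with j, k with l, or the pair (i,j) with the pair (k,l); the deck names the symmetry but does not spell it out. The function name and the canonical ordering are illustrative choices.

```c
#include <assert.h>
#include <stddef.h>

/* Count quartets in canonical order: i >= j, k >= l, and the compound
 * index of (i,j) is >= the compound index of (k,l). */
size_t unique_quartets(size_t n) {
    size_t count = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j <= i; j++) {
            size_t ij = i * (i + 1) / 2 + j;
            /* k > i would force kl > ij, so k only needs to run up to i. */
            for (size_t k = 0; k <= i; k++) {
                for (size_t l = 0; l <= k; l++) {
                    size_t kl = k * (k + 1) / 2 + l;
                    if (kl <= ij)
                        count++;
                }
            }
        }
    }
    return count;
}
```

For one atom's worth of basis functions (n = 15) this gives 7260 unique quartets against 15^4 = 50625 in the full loop, close to a factor of seven. The ratio approaches eight as n grows, because quartets with repeated indices become a smaller share of the total.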
