CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Order-N DFT calculations with the Conquest code Michael J. Gillan 1,2,

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Order-N DFT calculations with the Conquest code Michael J. Gillan 1,2, T. Miyazaki 3 and D. Bowler 1,2 1 Physics and Astronomy Department, University College London, U.K. 2 London Centre for Nanotechnology, University College London, U.K. 3 Computational Materials Science Center, NIMS, Tsukuba, Japan  Summary of O(N) DFT: the Conquest code  Summary of practical operation of Conquest: linear scaling, parallel scaling  Validation against standard plane-wave codes  Work in progress on Ge/Si hut-clusters

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Si/Ge nanostructures Epitaxial deposition of Ge on the Si (001) surface. There is a strain mismatch of a few percent, and the strain is relieved by formation of missing-row trenches. Evolution of surface structure with increasing coverage... Early 2 x N Late 2 x N M x N ‘Hut’ clusters

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Why traditional methods scale as N 2 or worse There are N Kohn-Sham orbitals Each Kohn-Sham orbital extends over the entire volume V, so information in each is proportional to N. Hence, amount of information to be calculated is proportional to N 2. In standard methods, we have to calculate all overlap integrals Hence, number of operations is proportional to N 3. Prefactor of N 3 is small, so practical dependence is N 2, except for very large systems.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Density matrix is key to O(N) But E tot in DFT can be expressed entirely in terms of, so we have a variational principle: minimize E tot with respect to density matrix, subject to conditions that : (i) is symmetric; (ii) is idempotent; (iii) gives the correct number of electrons. It is enough to require that be weakly idempotent: its eigenvalues lie between 0 and 1. Density matrix is defined in terms of Kohn-Sham eigenfunctions as: with f n orbital occupation numbers: at T = 0, f n = 0 or 1. The operator is idempotent: its eigenvalues are 0 or 1 (it is a projector). Now as. So we get an upper bound to E tot if we minimize E tot with the additional constraint that for.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Density matrix and localised orbitals In practice, we cannot work directly with, since it is a function of two vector variables. with localized orbitals that vanish outside ‘localization regions’. In practice, take the localization regions spherical, radius R reg and centred on the atoms. Label specifies different localized orbitals on given atom i. Instead, express as: Matrix is the density matrix in the (non-orthogonal) representation of. Search for ground state by minimizing E tot with respect to and subject to idempotency and correct electron number.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Enforcing idempotency In orthogonal tight-binding, the equivalent problem is to minimize: subject to weak idempotency of K, and fixed electron number. The same scheme is applied in CONQUEST, but in a non-orthogonal version: where An effective method is that of Li, Nunes and Vanderbilt, in which: with L the ‘auxiliary density matrix’. This works because of the properties of the polynomial

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Summary of CONQUEST Ground-state search is done as three nested loops: 1.inner: minimization with respect to auxiliary density matrix, with cut-off distance R L. 2.middle: self-consistency 3.outer: minimization with respect to with cut-off distance R reg Minimization with respect to at fixed Kohn-Sham potential and fixed is equivalent to non-orthogonal tight-binding; done by a combination of McWeeny purification and Li-Nunes-Vanderbilt. Self-consistency: reduction of density residual to zero: done by ‘Guaranteed Reduction Pulay’. Localized orbitals represented by B-splines, or by pseudo-atomic orbitals; minimization by conjugate gradients. Written as a parallel code from an early stage.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Parallel operation Main aspects of CONQUEST parallel coding: Formation of overlap and Hamiltonian matrix elements by integration over grid points: integration grid is divided ito ‘domains’, with one processor responsible for a domain. Matrix operations: atoms divided into ‘primary sets’, with one processor responsible for rows of matrices associated with one primary set. Fourier transformation (Hartree potential etc...) is done in parallel.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Practical linear scaling Tests of convergence of total energy with respect to L-matrix cut-off R L and localization-region radius R reg. Bulk silicon.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Practical parallel scaling Parallel scaling: increasing number of processors, constant number of atoms per processor O(N) scaling: constant number of processors, increasing number of atoms

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Relaxation of Si (001) surface MethodBasisBond length (A) Bond angle SIESTA/NSCSZ2.5015.9 CONQUEST/NSCSZ2.5014.5 CONQUEST/NSC O(N) SZ2.4215.0 CONQUEST/SCB-spline2.3722.8 SIESTA/SCDZP2.4019.9 VASPPW2.4119.7 Comparison of relaxed structure of Si (001) surface from non-self-consistent and self-consistent SIESTA and CONQUEST calculations using different basis sets with plane-wave results (VASP). Quantities compared are bond length and tilt angle of surface dimer pair.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Structure and energetics of Ge/Si hut clusters Energetics of hut clusters with O(N) electronic- structure calculations: aims of the investigation: The smallest stable hut-clusters contain ~10,000 atoms of Ge. Previous investigations have used continuum elasticity theory combined with DFT calculations. Energetics of faces and edges is important. Combination of DFT and elasticity theory is not easy, and it is not known how large the clusters must be for this approach to be valid. Our aim: to use the hierarchy of electronic- structure methods to test previous calculations. We are investigating the energetics using: tight- binding; non-self-consistent DFT tight-binding, and self-consistent DFT tight-binding. In the future, also full DFT with plane-wave accuracy. The faces of the hut-clusters are Ge (105) surfaces

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Ge(105) : surface energy by different methods Ge(105): surface energy (LDA) empirical TB : 74.3 meV/Å 2 NSC-AITB : 76.5 meV/Å 2 SC-AITB : 81.5 meV/Å 2 full DFT(blip) : 74.8 meV/Å 2 STATE(planewave): 70.0 meV/Å 2 Ge/Si(105): surface energy (GGA) STATE(12.25 Ry): 43.7 meV/Å 2 STATE( 36 Ry ): 45.6 meV/Å 2 by Hashimoto et al.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Ge(105): cutoff of (auxiliary) density matrix L Ge(105) : NSC-AITB R L (bohr) E tot (Ha/atom ) F max (Ha)E surf (eV/A 2 ) 15.4-4.8476350.00270.0801 20.4-4.8504380.00140.0753 25.4-4.8514200.00070.0752 30.4-4.8518110.00040.0755 diag-4.8520480.00010.0765 R L =20.4 (or 25.4) bohr is enough.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Strategy for calculations on hut clusters Si (001)  Ge : total energy per 1 Ge atom of hut cluster and its dependence on size of hut cluster: a spacings of hut clusters: b thickness of Si substrate: d  Ge of 2xN Ge/Si(001) surface d  Ge = (E tot -E sub )/(# of Ge hut atoms) (empirical) TB DFT calculations a b Ge hut cluster

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Size: 28x28 on 32x32, thickness 8 layer –22,746 atoms Calculations are done on the Earth Simulator using 64 nodes (512 processors, 1/10 of the ES) –DMM : ~ 10min. –Initial forces are already small. –Structure Optimisation within NSC-AITB level 174Å 48Å

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Ge hut cluster v.s. 2xN on Si(001): Tight-binding O(N) Type 2 ‘hut’ is more stable than type 1 for small coverage. crossing of  (Ge) between 2xN and hut? substrate of hut cluster: p2x2 (without dimer vacancies) thicker substrates will reduce  (Ge) of larger hut clusters. -> hut should be more stable Similar calculations by order-N DFT in progress....

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Conquest O(N) DFT on Ge/Si hut-clusters Type 1 ‘hut’ is more stable than type2. Further calculations being done now. Self-consistent DFT has also been achieved for this size of system. Minimal-basis, non-self-consistent DFT. Size of system: 22,746 atoms

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Summary and continuation...  Conquest O(N) DFT has been tested on simple systems, and shows satisfactory agreement with calculations using pseudo- atomic orbitals (Siesta) and with plane-wave calculations.  For Ge (105) surface, O(N) DFT calculations with different density-matrix cut-off radius R L show rapid convergence with respect to R L. Full blip-function basis in Conquest (equivalent to plane waves) gives good agreement with plane-wave results for surface formation energy.  Ge/Si hut clusters: O(N) tight-binding calculations on systems of 20,000 – 30,000 atoms are straightforward. O(N) self-consistent DFT calculations on this size of system are feasible.  Preliminary results on Ge/Si hut-clusters suggest cross-over to stable hut-clusters at coverage of about 3 monolayers. However, much more calculation and analysis is still needed.

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Credits National Institute for Materials Science, Tsukuba International Center for Young Scientists, NIMS, Tsukuba MEXT grant for Scientific Research in Priority Areas “Development of New Quantum Simulators and Quantum Design” (No. 17064017) Earth Simulator Center, Yokohama UK Engineering and Physical Sciences Research Council Royal Society

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Order-N DFT calculations with the Conquest code Michael J. Gillan 1,2,

Similar presentations

Presentation on theme: "CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Order-N DFT calculations with the Conquest code Michael J. Gillan 1,2,"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Order-N DFT calculations with the Conquest code Michael J. Gillan 1,2,

Similar presentations

Presentation on theme: "CCP6 Conference on Computational Physics, Gyeongju, Republic of Korea, August 2006 Order-N DFT calculations with the Conquest code Michael J. Gillan 1,2,"— Presentation transcript:

Similar presentations

About project

Feedback