
Feb 23, 2010, Tsukuba-Edinburgh Computational Science Workshop, Edinburgh Large-Scale Density-Functional calculations for nano-meter size Si materials.


1 Feb 23, 2010, Tsukuba-Edinburgh Computational Science Workshop, Edinburgh. Large-Scale Density-Functional calculations for nano-meter size Si materials. Jun-Ichi Iwata, Center for Computational Sciences, University of Tsukuba

2 Outline
・ Quantum Mechanical (First-Principles) Simulation in Solid-State Physics: Density-Functional Theory, W. Kohn (Nobel Prize in 1998)
・ Density-Functional simulations for large systems: Real-Space DFT program code for Parallel Computation (RSDFT)
・ Applications of RSDFT for Si nano materials: >10,000-atom system

3 First-Principles Calculation in Material Physics. We describe material properties from the behavior of electrons and ions (ions → classical, electrons → quantum). We solve the Schrödinger equation for the electronic ground state. Density-functional theory is a powerful tool for this purpose.

4 Density-Functional Theory
・ The energy functional of the electron density is minimized: we get stable atomic & electronic structures.
・ Minimizing the energy functional leads to the Kohn-Sham equation, which contains a potential → We have to solve this equation self-consistently (nonlinear eigenvalue problem).
P. Hohenberg and W. Kohn, Phys. Rev. 136 (1964) B864. W. Kohn and L. J. Sham, Phys. Rev. 140 (1965) A1133.
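The equations on this slide did not survive in the transcript. For orientation, the Kohn-Sham equations in their standard textbook form (atomic units) are sketched below; the symbols v_ion, v_H and v_xc are the usual names for the ionic, Hartree and exchange-correlation potentials and are not taken from the slide itself.

```latex
\[
\left[ -\tfrac{1}{2}\nabla^{2} + v_{\mathrm{ion}}(\mathbf{r})
       + v_{\mathrm{H}}[n](\mathbf{r}) + v_{\mathrm{xc}}[n](\mathbf{r}) \right]
\phi_i(\mathbf{r}) = \varepsilon_i \, \phi_i(\mathbf{r}),
\qquad
n(\mathbf{r}) = \sum_{i}^{\mathrm{occ}} \left| \phi_i(\mathbf{r}) \right|^{2}
\]
```

Because the potentials depend on the density n, which is itself built from the orbitals, the eigenvalue problem is nonlinear and has to be iterated to self-consistency.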

5 Performance of DFT with a simple approximation: Si (in diamond structure), exchange functional in the Local-Density Approx.
                         DFT calc.    Expt.
Lattice Constant (Å)     5.37         5.41
Bulk Modulus (Mbar)      0.977        0.988
M. T. Yin and M. L. Cohen, Phys. Rev. B 26, 5668 (1982).
Quantitatively good results; various properties are described correctly.

6 Everybody wants to apply DFT to large systems. Usually, we treat 10- to 1000-atom systems by DFT. However, we need to treat larger systems:
・ to study large objects (nano structures, proteins)
・ to make the atomic model more realistic
Examples: proteins (cytochrome c oxidase), ~30,000 atoms; nano structures (Si pyramid), ~100,000 atoms, A. Ichimiya et al., Surf. Sci. 493, 555 (2001).

7 Real-Space DFT program code (RSDFT)
・ Solves the Kohn-Sham equation (eigenvalue problem) → computational cost ~ O(N^3)
・ Developed for parallel computers
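To make the O(N^3) statement concrete, here is a minimal sketch of how the cost grows with system size. It assumes a pure power law with no prefactor or lower-order terms, which is an idealization; the function name `relative_cost` is only illustrative.

```python
def relative_cost(n_atoms, n_ref, exponent=3):
    """Cost ratio between two system sizes, assuming cost ~ N**exponent."""
    return (n_atoms / n_ref) ** exponent


# Going from a 1,000-atom to a 10,000-atom model under cubic scaling:
print(relative_cost(10_000, 1_000))  # 1000.0
```

Under this assumption, a tenfold increase in the number of atoms costs a thousand times more, which is why parallel computers are needed for the >10,000-atom systems discussed later.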

8 Real-Space Method ( ⇔ Reciprocal-Space (Plane-Wave) Method )
・ Continuous space → discrete space (discretize)
・ Function → column vector
・ Laplacian → higher-order finite difference
・ Typical number of grid points: 10,000 ~ 1,000,000
Higher-order finite difference pseudopotential method: J. R. Chelikowsky et al., Phys. Rev. B, (1994)
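The idea of replacing the Laplacian by a higher-order finite difference can be sketched in one dimension. The weights below are the standard central-difference coefficients for the second derivative; the function name, the zero treatment outside the grid, and the 1D setting are assumptions made for illustration, not details of RSDFT.

```python
import math

# Central-difference weights for d2/dx2, keyed by the stencil half-width m.
# Entry [0] is the centre weight, entry [k] the weight of the points at +-k*h.
COEFFS = {
    1: [-2.0, 1.0],
    2: [-5.0 / 2.0, 4.0 / 3.0, -1.0 / 12.0],
    3: [-49.0 / 18.0, 3.0 / 2.0, -3.0 / 20.0, 1.0 / 90.0],
}


def second_derivative(f, h, m):
    """Apply the half-width-m stencil to samples f on a uniform grid of spacing h.

    Values outside the grid are treated as zero (an assumption of this sketch).
    """
    c = COEFFS[m]
    n = len(f)
    out = []
    for i in range(n):
        acc = c[0] * f[i]
        for k in range(1, m + 1):
            left = f[i - k] if i - k >= 0 else 0.0
            right = f[i + k] if i + k < n else 0.0
            acc += c[k] * (left + right)
        out.append(acc / (h * h))
    return out


# Check against sin(x), whose second derivative is -sin(x).
n = 41
h = math.pi / (n - 1)
f = [math.sin(i * h) for i in range(n)]
for m in (1, 2, 3):
    error = abs(second_derivative(f, h, m)[n // 2] + 1.0)
    print(m, error)
```

The error at the grid midpoint shrinks rapidly as the stencil widens, which is the reason a higher-order formula can work with a comparatively coarse grid.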

9 RSDFT: suitable for parallel first-principles calculation
・ Real-space finite difference → sparse matrix
・ FFT free (FFT is inevitable in the conventional plane-wave code)
・ MPI (Message Passing Interface) library
Kohn-Sham eq. (finite-difference): the 3D grid is divided into several regions for parallel computation (figure: regions assigned to CPU0 to CPU8).
・ Higher-order finite difference: MPI_ISEND, MPI_IRECV
・ Integration: MPI_ALLREDUCE
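The communication pattern on this slide can be imitated without MPI. The sketch below is a serial, one-dimensional stand-in: `exchange_halos` copies the boundary points that neighbouring regions would send each other (the role of MPI_ISEND and MPI_IRECV), and the final sum of per-region partial sums plays the role of MPI_ALLREDUCE. All function names and the 1D layout are assumptions of this example, not RSDFT code.

```python
def apply_stencil(padded, m, coeffs):
    """Apply a symmetric stencil to the interior of an array padded by m points per side."""
    n = len(padded) - 2 * m
    out = []
    for i in range(m, m + n):
        acc = coeffs[0] * padded[i]
        for k in range(1, m + 1):
            acc += coeffs[k] * (padded[i - k] + padded[i + k])
        out.append(acc)
    return out


def split(values, nparts):
    """Split a list into nparts contiguous regions of nearly equal size."""
    n = len(values)
    bounds = [n * p // nparts for p in range(nparts + 1)]
    return [values[bounds[p]:bounds[p + 1]] for p in range(nparts)]


def exchange_halos(chunks, m):
    """Pad every region with m points from each neighbour (zeros at the outer edges)."""
    padded = []
    for p, chunk in enumerate(chunks):
        left = chunks[p - 1][-m:] if p > 0 else [0.0] * m
        right = chunks[p + 1][:m] if p < len(chunks) - 1 else [0.0] * m
        padded.append(left + chunk + right)
    return padded


COEFFS = [-5.0 / 2.0, 4.0 / 3.0, -1.0 / 12.0]  # 4th-order second-derivative weights
M = 2
values = [float(i * i % 7) for i in range(12)]

# Reference: stencil applied to the whole grid at once.
reference = apply_stencil([0.0] * M + values + [0.0] * M, M, COEFFS)

# "Parallel" version: each region works only on its own points plus halos.
chunks = split(values, 3)
distributed = [y for chunk in exchange_halos(chunks, M)
               for y in apply_stencil(chunk, M, COEFFS)]

# Integration: local partial sums combined into one global value.
total = sum(sum(chunk) for chunk in chunks)

print(distributed == reference, total == sum(values))
```

Only the thin halo layers cross region boundaries, so the amount of data exchanged grows with the surface of each region rather than its volume.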

10 Real-Space Density-Functional Theory code (RSDFT)
・ Based on the finite-difference pseudopotential method (J. R. Chelikowsky et al., PRB 1994)
・ Highly tuned for massively parallel computers
Computations are done on the massively parallel cluster PACS-CS at the University of Tsukuba (theoretical peak performance = 5.6 GFLOPS/node) with our recently developed code “RSDFT”: Iwata et al., J. Comp. Phys. (2010).
Massively parallel computing: convergence behavior for Si10701H1996, the largest system in the present study.
・ Grid points = 3,402,059
・ Bands = 22,432
・ Computational time (with 1024 nodes of PACS-CS): 6781 sec. × 60 iteration steps = 113 hours
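The quoted total follows directly from the per-step time. A short check of the arithmetic, using only the numbers on the slide (the node-hour figure is a derived quantity that the slide does not state):

```python
seconds_per_step = 6781   # one iteration step on 1024 nodes of PACS-CS
steps = 60                # iteration steps quoted on the slide
nodes = 1024

total_hours = seconds_per_step * steps / 3600
node_hours = total_hours * nodes

print(round(total_hours), round(node_hours))
```

The wall-clock time rounds to 113 hours, matching the slide.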

11 Flow chart
Atomic structure optimization: input initial configuration of ions → calc. ionic potentials → electronic structure optimization → Hellmann-Feynman force → convergence check (yes: done; otherwise move ions and repeat).
Electronic structure optimization (algorithm → subspace iteration method (Rayleigh-Ritz method)): conjugate-gradient method, Gram-Schmidt orthonormalization, subspace diagonalization, density and potentials update.
Total computational cost ~ O(N^3); costs labelled in the chart: O(N^3), O(N), O(N^2).
Electronic structure optimization must be performed in each atomic optimization step.

12 Algorithm 1 → Subspace Iteration Method (Rayleigh-Ritz Method)
・ Problem: M-dimensional eigenvalue problem; we need the smallest N (≪ M) eigenpairs
・ Initial guess → wave function update: minimize the Rayleigh quotients by the conjugate-gradient method
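Minimizing a Rayleigh quotient can be shown on a toy problem. The sketch below uses plain fixed-step steepest descent rather than the conjugate-gradient method named on the slide, and a single vector rather than N of them; the Hamiltonian is an assumed 1D model (three-point kinetic operator, unit grid spacing, zero boundary values) whose lowest eigenvalue is known analytically.

```python
import math


def apply_h(x):
    """H = tridiag(-1/2, 1, -1/2): three-point -1/2 d2/dx2 with unit spacing."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(x[i] - 0.5 * (left + right))
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def rayleigh_quotient(x):
    return dot(x, apply_h(x)) / dot(x, x)


def minimize_rayleigh(x, step=0.5, iterations=500):
    """Lower the Rayleigh quotient by repeated steps along the negative residual."""
    for _ in range(iterations):
        norm = math.sqrt(dot(x, x))
        x = [v / norm for v in x]
        hx = apply_h(x)
        rho = dot(x, hx)
        # The residual H x - rho x is half the gradient of the quotient at unit norm.
        residual = [hv - rho * v for hv, v in zip(hx, x)]
        x = [v - step * r for v, r in zip(x, residual)]
    return rayleigh_quotient(x), x


n = 8
rho, _ = minimize_rayleigh([1.0] * n)
exact = 1.0 - math.cos(math.pi / (n + 1))  # lowest eigenvalue of this tridiagonal matrix
print(rho, exact)
```

A conjugate-gradient update would reach the same minimum in fewer iterations; steepest descent is used here only because it keeps the example short.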

13 Algorithm 2
・ Gram-Schmidt orthogonalization: O(MN^2) → used as a basis set
・ Calc. matrix elements: O(MN^2)
・ Subspace diagonalization: O(N^3)
・ Ritz vectors: O(MN^2) ← initial guess for the next iteration
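The Gram-Schmidt step can be written in a few lines. This is a generic modified Gram-Schmidt sketch in pure Python, not the RSDFT routine; the function names and the linear-dependence tolerance are assumptions.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def gram_schmidt(vectors):
    """Return an orthonormal basis spanning the same space as the input vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components along every vector already in the basis.
        for q in basis:
            overlap = dot(q, w)
            w = [wi - overlap * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis


basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for i, qi in enumerate(basis):
    print([round(dot(qi, qj), 12) for qj in basis])
```

For N vectors of length M, the double loop performs on the order of N^2 dot products of length M, which is the O(MN^2) cost quoted on the slide.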

14 Gram-Schmidt orthogonalization: algorithm of GS
Part of the calculations can be performed as a Matrix × Matrix operation! ~ Active use of Level 3 BLAS in the O(N^3) computation ~
Time & performance for Gram-Schmidt (theoretical peak performance = 5.6 GFLOPS/node):
                  Time (sec)    GFLOPS/node
Old algorithm     661 (710)     0.70 (0.65)
New algorithm     111 (140)     4.30 (3.50)
The O(N^3) part can be computed at 80% of the theoretical peak performance!
→ Collaboration with computer scientists greatly improves the performance of RSDFT!
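One common way to turn Gram-Schmidt into matrix × matrix work is to process the vectors in blocks: a whole block is projected against the already orthonormal vectors with two matrix products, and only the small remainder inside the block is handled vector by vector. The sketch below illustrates that general idea; it is not claimed to be the algorithm used in RSDFT, and the pure-Python `matmul` merely stands in for a Level 3 BLAS call.

```python
import math


def matmul(a, b):
    """Triple-loop product of row-major matrices (placeholder for a BLAS matrix product)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def project_out(q_rows, w_rows):
    """W <- W - (W Q^T) Q for orthonormal rows Q: two matrix-matrix products."""
    overlaps = matmul(w_rows, transpose(q_rows))
    correction = matmul(overlaps, q_rows)
    return [[w - c for w, c in zip(wr, cr)] for wr, cr in zip(w_rows, correction)]


def orthonormalize_block(rows):
    """Vector-by-vector Gram-Schmidt inside one block."""
    out = []
    for row in rows:
        w = list(row)
        for q in out:
            overlap = sum(a * b for a, b in zip(q, w))
            w = [wi - overlap * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        out.append([wi / norm for wi in w])
    return out


def block_gram_schmidt(vectors, block_size):
    basis = []
    for start in range(0, len(vectors), block_size):
        block = [list(v) for v in vectors[start:start + block_size]]
        if basis:
            block = project_out(basis, block)
        basis.extend(orthonormalize_block(block))
    return basis


vectors = [[1.0, 2.0, 0.0, 1.0, 0.0],
           [0.0, 1.0, 1.0, 0.0, 2.0],
           [2.0, 0.0, 1.0, 1.0, 1.0],
           [1.0, 1.0, 1.0, 1.0, 1.0]]
basis = block_gram_schmidt(vectors, block_size=2)
print(len(basis))
```

With larger blocks, a growing share of the arithmetic sits in `project_out`, which is the part that an optimized matrix-product routine can run close to peak speed.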

15 Elapsed time for 1 step of iteration on PACS-CS (5.6 GFLOPS/node), O(N^2) part and O(N^3) part. With 256 nodes → the times for the O(N^2) part and the O(N^3) part become comparable.

16 Application 1 Nano-meter size Si quantum dots

17 The Si quantum dot is a promising material for several device applications:
・ Memory
・ Single-electron transistor
・ Optical device
Clarifying the relation between the “dot size” and the “band gap” is important for controlling the device properties.
Are first-principles calculations useful for such studies? → Yes, but … the system size is very large!
A model of the Si quantum dot of 6.6 nm diameter (Si7055H1596)

18 Band Gaps (eV), from 300 atoms to >10,000 atoms. Experimental fit curve from STS measurement: B. Zaknoon et al., Nano Letters 8, 1689 (2008). The ΔSCF gap seems to be closer to the ΔKS gap …
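The slide does not define the two gaps. In the usual convention, which is assumed here, the ΔKS gap is the difference of Kohn-Sham eigenvalues and the ΔSCF gap is built from total energies of the N-electron system and its charged states:

```latex
\[
\Delta_{\mathrm{KS}} = \varepsilon_{\mathrm{LUMO}} - \varepsilon_{\mathrm{HOMO}},
\qquad
\Delta_{\mathrm{SCF}} = E(N+1) + E(N-1) - 2\,E(N)
\]
```

The ΔSCF gap therefore needs three self-consistent total-energy calculations per dot, whereas the ΔKS gap comes from a single one.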

19 Application 2 Si nanowires

20 Samsung Si nanowire devices
                  IEDM2005        IEDM2006
Diameter of NW    10 nm           8 nm
Gate length       30 nm           15 nm
Vdd               1.0 V
I_on (n)          2.64 mA/µm      1.4 mA/µm
I_on (p)          1.11 mA/µm      1.94 mA/µm
I_off (n)         3.1 nA/µm       2.0 nA/µm
I_off (p)         0.0056 mA/µm    1.0 nA/µm

21 Several sizes of Si nanowires: 4 nm diameter (425 atoms), 10 nm diameter (2341 atoms), 20 nm diameter (8941 atoms). There may be an optimum diameter in the region of 10 nm ~ 20 nm.

22 Band Structure (Γ to X) and DOS of SiNW (d=1nm): Si21H20 (41 atoms), Eg=2.60eV (LDA bulk: 0.53eV)

23 Band Structure (Γ to X) and DOS of SiNW (d=4nm): Si341H84 (425 atoms), Eg=0.81eV (LDA bulk: 0.53eV)

24 Band Structure (Γ to X) and DOS of SiNW (d=8nm): Si1361H164 (1525 atoms), Eg=0.61eV. Band structure (Γ to X) of bulk Si: Eg=0.53eV

25 Si nanowire with surface roughness: Si12822H1544 (14,366 atoms), top view and side view
・ 10nm diameter, 3.3nm height, (100)
・ Grid spacing: 0.45Å (~14Ry)
・ # of grid points: 4,718,592
・ # of bands: 29,024
・ Memory: 1,022GB ~ 2,044GB
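The order of magnitude of the memory figure can be checked by counting the wave-function array alone. The sketch assumes one real double-precision value (8 bytes) per grid point per band and binary gigabytes; the slide does not say how its own figures were obtained, so this is only a plausibility check.

```python
def wavefunction_memory_gib(grid_points, bands, bytes_per_value=8):
    """Memory for a grid_points x bands array, in GiB (2**30 bytes)."""
    return grid_points * bands * bytes_per_value / 2**30


print(wavefunction_memory_gib(4_718_592, 29_024))  # 1020.375
```

The result is close to the lower figure on the slide, which suggests that storing the wave functions dominates the memory requirement.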

26 d=10nm (with roughness): Si12822H1544 (14,366 atoms), Eg=0.57eV. DOS of SiNW with roughness compared with DOS of bulk Si.
PACS-CS, 1024 nodes (peak performance: 5.6 GFLOPS/node):
・ Subspace diagonalization: 4600 sec.
・ Gram-Schmidt: 2300 sec.
・ Conjugate-Gradient Method: 3700 sec.
・ Total Energy calc.: 1200 sec.
・ Total (1 step): 12,000 sec.

27 Application 3 Si divacancy

28 Structure of Si divacancy. Small yellow balls: vacancies (no atoms). Green balls: Si atoms with dangling bonds. There are two possibilities for the structure of the Si divacancy: Resonant-Bond type and Large-pairing type. What is the stable structure?
・ EPR experiment (Watkins & Corbett, 1965): Large-Pairing type
・ LDA calculation (Saito & Oshiyama, 1994): Resonant-Bond type is stable (Large-Pairing type was not found)
・ More recent LDA calculation (Öğüt et al., 1999): both “Large-pairing” and “Resonant-Bond” structures were found; the Large-Pairing type is the most stable (the RB type is a local minimum)
Model size ~ 60 atoms, model size ~ 300 atoms → model-size dependence?

29 Si divacancy (figure: d_ac, d_ab (Å) versus model size (# of atoms), for Large-pairing, Resonant-Bond and Small-Pairing structures)
・ Structures converge at the 998-atom model.
・ The LP structure appears at 510-atom or larger models.
・ The RB structure is most stable, but the energy difference is very small (<10 meV).
J.-I. Iwata et al., Phys. Rev. B 77 (2008) 115208
Structure of Si divacancy. Small yellow balls: vacancies (no atoms). Green balls: Si atoms with dangling bonds. There are two possibilities for the structure of the Si divacancy: Resonant-Bond type and Large-pairing type.

30 Summary
・ We have developed a Real-Space DFT program code for large systems by utilizing massively parallel computers.
・ Collaboration with computer scientists greatly improved the performance of RSDFT (especially the O(N^3)-part calculation with Level 3 BLAS).
・ By using a few hundred ~ 1000 CPUs, we have achieved first-principles calculations for Si 1000-atom systems with atomic structure optimization and self-consistent electronic structures of Si 10,000-atom systems.
・ By using large atomic models → eliminate the model-size dependence.
・ We have applied RSDFT to nano-meter scale Si materials (SiNW, SiQD).
・ I think RSDFT will become a useful tool for future device development.

