# Parallel Finite-Difference Time-Domain Computations Aided by Modal Decomposition

Dmitry A. Gorodetsky, Philip A. Wilsey



Outline

- Introduction
- FDTD
- Distributed Computation
- Model Order Reduction
- Conclusion
- References

Introduction

- FDTD: an evolutionary algorithm that solves Maxwell's equations by time marching.
- Some typical problems:
  - Aircraft radar cross section
  - Microwave ICs, high-speed electronic circuits
  - Optical pulse propagation
  - Antennas
  - Bioelectromagnetic systems (retina, EM hyperthermia cancer therapy)
  - Bodies of revolution

Figure: Computed surface electric currents induced on a prototype military jet fighter plane by a radar beam at 100 MHz. The incident plane wave propagates from left to right, head-on to the airplane. The surface currents re-emit electromagnetic energy, which can be used to create RCS plots [1].

Simulation Complexity

- Example of a 2nd-order FDTD evolution:

  H_z(n+1/2) = k_1[ΔE_x(n)] + k_2[ΔE_y(n)] + k_3·H_z(n-1/2)

- Grid size as well as the number of time steps can make the simulation prohibitive.
- The computational burden grows as ~N^(4/3) [1].

Figure: A single FDTD cell.
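As a minimal sketch, the H_z update above can be written out on a staggered 2D grid. The coefficients k1, k2, k3 and the grid sizes here are illustrative assumptions, not values from the slides:

```python
import numpy as np

def update_hz(hz, ex, ey, k1, k2, k3):
    """Advance H_z by one half step from spatial differences of
    E_x and E_y (central differences on a staggered Yee grid)."""
    d_ex = ex[1:, :] - ex[:-1, :]   # ΔE_x along y
    d_ey = ey[:, 1:] - ey[:, :-1]   # ΔE_y along x
    return k1 * d_ex + k2 * d_ey + k3 * hz

# Zero-initialized fields on a small 10x10 H_z grid
ex = np.zeros((11, 10))
ey = np.zeros((10, 11))
hz = np.zeros((10, 10))
hz = update_hz(hz, ex, ey, 0.5, -0.5, 1.0)
```

Each H_z cell depends only on its own previous value and the neighboring E components, which is the locality that the distributed-computation discussion below exploits.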

Reducing Simulation Time

Methods to improve simulation time:

- Distributed computation [1-3]
  - Domain decomposition
  - Synchronization
  - Load balancing
- Model order reduction
  - State transition matrix (modal approach) [4-7]: exact
    - Entire domain
    - Subdomain
  - Linear estimation methods [1, 8]: approximate
    - Prony's method (complex exponentials)
    - System identification technique

Distributed Computation

- FDTD requires knowledge of the state of adjacent points to compute the current point.
- Hence it exhibits fine-grain parallelism, and its speedup is limited by the surface-to-volume ratio.
- The surface-to-volume ratio of FDTD partitions is, in effect, the communication-to-computation ratio.

Figure 1. Speedup efficiency of parallel FDTD [9].
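The surface-to-volume argument can be made concrete with a back-of-the-envelope sketch, assuming a cubic partition of side s cells (a simplifying assumption; real partitions need not be cubic):

```python
def comm_to_comp_ratio(s):
    """Communication-to-computation ratio for a cubic FDTD
    partition of side s cells: boundary cells exchanged per step
    divided by cells updated per step. Equals 6/s."""
    volume = s ** 3          # cells updated per step (computation)
    surface = 6 * s ** 2     # boundary cells exchanged (communication)
    return surface / volume

# Halving the partition side doubles the relative communication cost,
# which is why very fine-grained partitions become communication-bound.
print(comm_to_comp_ratio(10))   # 0.6
print(comm_to_comp_ratio(20))   # 0.3
```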

Outline

- Introduction
- Model Order Reduction
  - State Transition Matrix (exact)
    - Entire Domain
      - Expensive Setup
      - Cheap Iteration
      - Setup Parallelization
    - Sub Domain (Macromodel)
- Conclusion
- References

Entire Domain

After Chen [10], we can express the FDTD update equations as:

  E(n) = D_1·H(n-1/2) + G_1·E(n-1)
  H(n+1/2) = D_2·E(n) + G_2·H(n-1/2)

With these equations, the state transition matrix is formed (block matrix shown on the slide).
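As a sketch of how the state transition matrix follows from the two update equations: substituting E(n) into the H update combines both stages into one step Q(n) = A·Q(n-1) with Q = [E; H]. Small random matrices stand in for the real FDTD operators here; the block structure, not the values, is the point:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4
D1, G1, D2, G2 = (rng.standard_normal((m, m)) for _ in range(4))

# H(n+1/2) = D2·(D1·H + G1·E) + G2·H gives the lower block row:
A = np.block([[G1,      D1],
              [D2 @ G1, D2 @ D1 + G2]])

# One combined step agrees with the two-stage update:
E0, H0 = rng.standard_normal(m), rng.standard_normal(m)
E1 = D1 @ H0 + G1 @ E0
H1 = D2 @ E1 + G2 @ H0
Q1 = A @ np.concatenate([E0, H0])
assert np.allclose(Q1, np.concatenate([E1, H1]))
```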

Entire Domain (2)

- FDTD can then be expressed as:

  Q(n) = A·Q(n-1),   (1)

  where Q represents the present state.
- Every step takes N^2 multiplications.
- If we assume that the system starts out from Q(0) = a_1·v_1 + a_2·v_2 + … + a_N·v_N, then (1) can be written as:

  Q(n) = a_1·λ_1^n·v_1 + a_2·λ_2^n·v_2 + … + a_N·λ_N^n·v_N,   (2)

  where v_i are the eigenvectors and λ_i the eigenvalues of A.
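A minimal numerical sketch of eq. (2): once the eigenpairs of A are known, the state at any step n is evaluated directly, without marching through step n-1 first. A small random A stands in for the real state transition matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
A = rng.standard_normal((N, N))
Q0 = rng.standard_normal(N)

lam, V = np.linalg.eig(A)     # eigenvalues λ_i, eigenvectors v_i
a = np.linalg.solve(V, Q0)    # modal amplitudes: Q(0) = V·a

n = 5
Qn_modal = (V * lam**n) @ a                    # eq. (2), direct jump to step n
Qn_step = np.linalg.matrix_power(A, n) @ Q0    # conventional marching
assert np.allclose(Qn_modal, Qn_step)
```

Because each λ_i^n is computed independently, different time steps (or different modes) can be assigned to different processors with no communication between them.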

Entire Domain (3): Cheap Iteration

- The advantage of the modal method for FDTD is that time-stepping is decoupled (see eq. 2).
- The solution can be obtained at any time step without knowledge of the previous time step.
- Time-stepping can be parallelized and does not require communication.

Entire Domain (4): Expensive Setup

- The matrix A is sparse, diagonally dominant, and banded.
- With standard techniques (LAPACK), obtaining the eigendecomposition of A is an O(N^3) process.
- LAPACK uses QR iteration to obtain the Schur form and hence is not easy to parallelize.

Entire Domain (5): Setup Parallelization

- We can take advantage of the modal make-up of the A matrix because, in practice, we do not need all the modes [10, 11].
- One alternative method is spectral divide and conquer (SDC) [12].
- SDC computes sign(A - bI), where b represents the x-coordinate of a vertical line in the complex plane.
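A sketch of the matrix sign function at the heart of SDC, using the simple Newton iteration S ← (S + S⁻¹)/2. The 3x3 diagonal test matrix is an illustrative assumption: its eigenvalues -2, -1, 3 straddle the split line Re(z) = b = 0, and the projector (I + S)/2 isolates the modes to the right of the line:

```python
import numpy as np

def matrix_sign(M, iters=50):
    """Newton iteration for the matrix sign function; converges
    when M has no eigenvalues on the imaginary axis."""
    S = M.astype(complex)
    for _ in range(iters):
        S = 0.5 * (S + np.linalg.inv(S))
    return S

A = np.diag([-2.0, -1.0, 3.0])
b = 0.0
S = matrix_sign(A - b * np.eye(3))

# (I + S)/2 projects onto the invariant subspace with Re(λ) > b,
# splitting the spectrum so each half can be handled in parallel.
P = 0.5 * (np.eye(3) + S)
```

Each application of the sign function splits the spectrum in two, and the two halves can then be deflated and processed independently, which is what makes the approach parallelizable.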

Entire Domain (6): Setup Parallelization (SDC)

Advantages:
- Computes only the needed eigenvalues.
- Easy to parallelize.
- Computation time is kN^3, but k depends on the number of eigenvalues.

Disadvantages:
- Requires several iterations before the sign function converges.
- Requires knowledge of where the eigenvalues do not lie; otherwise the sign function may not converge quickly.

Entire Domain (7): Setup Parallelization (Alternatives)

Iterative techniques: simultaneous iteration, Arnoldi, and Lanczos [12, 13].

Advantages:
- Exploit sparsity.
- Can be parallelized.

Disadvantages:
- Require computation of all eigenvalues.
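As a sketch of one of these alternatives, simultaneous (subspace) iteration needs only matrix-vector products, so it exploits the sparsity of A. The diagonal test matrix and iteration count below are illustrative assumptions:

```python
import numpy as np

def simultaneous_iteration(A, k, iters=500):
    """Simultaneous iteration: repeatedly multiply a k-column
    orthonormal block by A and re-orthogonalize with QR; the block
    converges to the k dominant eigenvectors."""
    rng = np.random.default_rng(2)
    Q = np.linalg.qr(rng.standard_normal((A.shape[0], k)))[0]
    for _ in range(iters):
        Q, _ = np.linalg.qr(A @ Q)
    # Rayleigh quotient gives the eigenvalue estimates
    return np.sort(np.diag(Q.T @ A @ Q))

A = np.diag([5.0, 3.0, 1.0, 0.5])   # stand-in for the sparse FDTD matrix
approx = simultaneous_iteration(A, 2)
```

Only A @ Q products touch the matrix, so A can stay in sparse storage and the products can be distributed across processors.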

Conclusion

- The setup time of this method is expensive, but for good reason.
- Accuracy remains very good even after eigenmodes are discarded.
- Setup and time-stepping can be parallelized and need not be limited by communication, as conventional FDTD is.
- Improvement = function(#steps × #CPUs)

References

1. A. Taflove, Computational Electrodynamics: The Finite-Difference Time-Domain Method, Norwood, MA: Artech House, 1995.
2. N. P. Chrisochoides, E. Houstis, and J. Rice, "Mapping algorithms and software environment for data parallel PDE iterative solvers," Journal of Parallel and Distributed Computing, special issue on Data-Parallel Algorithms and Programming, vol. 21, no. 1, pp. 75-95, April 1997.
3. N. P. Chrisochoides and J. R. Rice, "Partitioning heuristics for PDE computations based on parallel hardware and geometry characteristics," in Advances in Computer Methods for Partial Differential Equations VII (R. Vichnevetsky, D. Knight, and G. Richter, eds.), IMACS, New Brunswick, NJ, pp. 127-133, 1992.
4. Z. Chen, "Analytic Johns matrix and its application in TLM diakoptics," IEEE MTT-S Digest, vol. 2, pp. 777-780, 1995.
5. W. J. Hoefer, "The discrete time domain Green's function or Johns matrix: a new powerful concept in transmission line modeling (TLM)," Int. J. Num. Modeling, vol. 2, pp. 215-225, 1989.
6. P. B. Johns and K. Akhtarzad, "Time domain approximations in the solution of fields by time domain diakoptics," Int. J. Num. Methods Eng., vol. 18, pp. 1361-1373, 1982.
7. P. B. Johns and K. Akhtarzad, "The use of time domain diakoptics in time discrete models of fields," Int. J. Num. Methods Eng., vol. 17, pp. 1-14, 1981.

References (2)

8. W. Kumpel and I. Wolff, "Digital signal processing of time domain field simulation results using the system identification method," IEEE Trans. Microwave Theory Tech., vol. 42, no. 4, pp. 667-671, 1994.
9. D. A. Gorodetsky and P. A. Wilsey, "Innovative approaches to parallelizing finite-difference time-domain computations," IEEE Workshop on Direct and Inverse Problems in Electrodynamics, 2005.
10. Z. Chen and P. P. Silvester, "Analytic solutions for the finite-difference time-domain and transmission-line-matrix methods," Microwave and Optical Technology Letters, vol. 7, no. 1, pp. 5-8, 1994.
11. D. A. Gorodetsky and P. A. Wilsey, "Reduction of FDTD simulation time with modal methods," Progress in Electromagnetics Research Symposium, 2006, in press.
12. J. W. Demmel, M. T. Heath, and H. A. van der Vorst, "Parallel numerical linear algebra," in Acta Numerica 1993, Cambridge: Cambridge University Press, 1993, pp. 111-197.
13. Z. Bai, "Progress in the numerical solution of the nonsymmetric eigenvalue problem," Journal of Numerical Linear Algebra with Applications, vol. 2, pp. 219-234, 1995.
