Math 18.335: Poisson's Equation, Jacobi, Gauss-Seidel, SOR, FFT. Plamen Koev (based on CS267 Lecture 24, "Solving Linear Systems arising from PDEs - I", J. Demmel, Spring 1999)

Presentation transcript:

Math 18.335: Poisson's Equation, Jacobi, Gauss-Seidel, SOR, FFT
Plamen Koev

Outline
°Review Poisson's equation
°Overview of methods for the Poisson equation
°Jacobi's method
°Gauss-Seidel
°Red-black SOR method
°FFT

Math Koev, Fall 2003 Poisson’s equation arises in many models °Heat flow: Temperature(position, time) °Diffusion: Concentration(position, time) °Electrostatic or Gravitational Potential: Potential(position) °Fluid flow: Velocity,Pressure,Density(position,time) °Quantum mechanics: Wave-function(position,time) °Elasticity: Stress,Strain(position,time)

Math Koev, Fall 2003 Poisson’s equation in 1D T = 2 Graph and “stencil”

2D Poisson's equation
°Similar to the 1D case, but the matrix T now has 4 on the diagonal and -1 for each of the four nearest grid neighbors (the 5-point stencil).
°3D is analogous (a 7-point stencil, with 6 on the diagonal).
[Figure: the 2D mesh graph and its 5-point stencil]
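
A small NumPy sketch (my own, using the standard Kronecker-product identity rather than anything stated on the slide): the 2D matrix on an n-by-n grid is kron(I, T) + kron(T, I), with T the 1D matrix from above.

import numpy as np

def poisson_2d(n):
    # n^2-by-n^2 2D Poisson matrix assembled from the 1D matrix T
    T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    I = np.eye(n)
    return np.kron(I, T) + np.kron(T, I)

A = poisson_2d(3)
print(A.shape)       # (9, 9)
print(A.diagonal())  # all 4's, as on the slide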

Composite mesh from a mechanical structure [figure]

Converting the mesh to a matrix [figure]

Irregular mesh: NASA Airfoil in 2D (direct solution) [figure]

Math Koev, Fall 2003 Jacobi’s Method °To derive Jacobi’s method, write Poisson as: u(i,j) = (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1) + b(i,j))/4 °Let u(i,j,m) be approximation for u(i,j) after m steps u(i,j,m+1) = (u(i-1,j,m) + u(i+1,j,m) + u(i,j-1,m) + u(i,j+1,m) + b(i,j)) / 4 °I.e., u(i,j,m+1) is a weighted average of neighbors

Successive Overrelaxation (SOR)
°Similar to Jacobi: u(i,j,m+1) is computed as a linear combination of neighbors
°Numeric coefficients and update order are different
°Based on 2 improvements over Jacobi:
  - Use the "most recent values" of u that are available, since these are probably more accurate
  - Update the value of u(m+1) "more aggressively" at each step
°First, note that while evaluating sequentially
  u(i,j,m+1) = (u(i-1,j,m) + u(i+1,j,m) + …
some of the values for m+1 are already available:
  u(i,j,m+1) = (u(i-1,j,latest) + u(i+1,j,latest) + …
where latest is either m or m+1

Gauss-Seidel
°Use updated values instead of old ones, so it converges in about half the time:
  for i = 1 to n
    for j = 1 to n
      u(i,j,m+1) = (u(i-1,j,m+1) + u(i+1,j,m) + u(i,j-1,m+1) + u(i,j+1,m) + b(i,j)) / 4
°To do this in parallel: use a "red-black" ordering
  forall black points u(i,j)
    u(i,j,m+1) = (u(i-1,j,m) + …
  forall red points u(i,j)
    u(i,j,m+1) = (u(i-1,j,m+1) + …
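
A serial NumPy sketch of the red-black sweep (my own code and names, not from the slides): one color is updated from the previous values, then the other color sees the freshly updated neighbors.

import numpy as np

def redblack_gs_sweep(u, b):
    # One red-black Gauss-Seidel sweep, updating u in place.
    # The first color uses the old values; the second uses the new ones.
    i, j = np.indices(u.shape)
    interior = np.zeros(u.shape, dtype=bool)
    interior[1:-1, 1:-1] = True
    for parity in (0, 1):                       # black points, then red points
        mask = interior & ((i + j) % 2 == parity)
        avg = np.zeros_like(u)
        avg[1:-1, 1:-1] = (u[:-2, 1:-1] + u[2:, 1:-1] +
                           u[1:-1, :-2] + u[1:-1, 2:] +
                           b[1:-1, 1:-1]) / 4.0
        u[mask] = avg[mask]
    return u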

Successive Overrelaxation (SOR)
°To motivate the next improvement, write the basic step of the algorithm as:
  u(i,j,m+1) = u(i,j,m) + correction(i,j,m)
°If "correction" is a good direction to move, then one should move even further in that direction by some factor w > 1:
  u(i,j,m+1) = u(i,j,m) + w * correction(i,j,m)
°Called successive overrelaxation (SOR)
°Can prove w = 2/(1 + sin(π/(n+1))) for best convergence
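
Continuing the sketch above (again my own code, not from the slides), SOR simply scales the red-black Gauss-Seidel correction by w, here using the slide's formula for an n-by-n interior grid:

import numpy as np

def sor_sweep(u, b, w):
    # One red-black SOR sweep: u_new = u + w * (gauss_seidel_value - u).
    i, j = np.indices(u.shape)
    interior = np.zeros(u.shape, dtype=bool)
    interior[1:-1, 1:-1] = True
    for parity in (0, 1):
        mask = interior & ((i + j) % 2 == parity)
        gs = np.zeros_like(u)
        gs[1:-1, 1:-1] = (u[:-2, 1:-1] + u[2:, 1:-1] +
                          u[1:-1, :-2] + u[1:-1, 2:] +
                          b[1:-1, 1:-1]) / 4.0
        u[mask] += w * (gs[mask] - u[mask])     # over-correct by the factor w
    return u

n = 32                                          # interior grid is n x n
w = 2.0 / (1.0 + np.sin(np.pi / (n + 1)))       # optimal w from the slide
u = np.zeros((n + 2, n + 2))
b = np.ones((n + 2, n + 2))
for _ in range(50):
    u = sor_sweep(u, b, w)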

Serial FFT
°Let i = sqrt(-1) and index matrices and vectors from 0.
°The Discrete Fourier Transform of an m-element vector v is F*v, where F is the m-by-m matrix defined as
  F[j,k] = ω^(j*k)
and ω is
  ω = e^(2πi/m) = cos(2π/m) + i*sin(2π/m)
°ω is a complex number whose m-th power is 1, and is therefore called an m-th root of unity
°E.g., for m = 4: ω = 0+1*i, ω^2 = -1+0*i, ω^3 = 0-1*i, ω^4 = 1+0*i
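
A quick NumPy check of these definitions (mine, not from the slides). One caveat worth a comment: np.fft.fft uses the opposite sign convention, ω = e^(-2πi/m), so F*v as defined here coincides with m * np.fft.ifft(v).

import numpy as np

m = 4
w = np.exp(2j * np.pi / m)                 # m-th root of unity, as on the slide
print([w**p for p in range(1, m + 1)])     # [i, -1, -i, 1] (up to rounding)

j, k = np.indices((m, m))
F = w ** (j * k)                           # F[j,k] = w^(j*k)

v = np.array([1.0, 2.0, 3.0, 4.0])
# NumPy's fft uses w = e^(-2*pi*i/m); with the slide's +i convention,
# F @ v matches m * ifft(v).
print(np.allclose(F @ v, m * np.fft.ifft(v)))   # True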

The FFT Algorithm
°Compute the FFT of an m-element vector v, i.e. F*v:
  (F*v)[j] = Σ_{k=0..m-1} F[j,k]*v(k) = Σ_{k=0..m-1} ω^(j*k) * v(k) = Σ_{k=0..m-1} (ω^j)^k * v(k) = V(ω^j)
°where V is defined as the polynomial
  V(x) = Σ_{k=0..m-1} x^k * v(k)
°So computing F*v amounts to evaluating the polynomial V at the m points ω^0, ω^1, …, ω^(m-1)

Divide and Conquer FFT
°V can be evaluated using divide-and-conquer:
  V(x) = Σ_{k=0..m-1} x^k * v(k)
       = (v[0] + x^2*v[2] + x^4*v[4] + …) + x*(v[1] + x^2*v[3] + x^4*v[5] + …)
       = V_even(x^2) + x*V_odd(x^2)
°V has degree m-1, so V_even and V_odd are polynomials of degree m/2 - 1
°We evaluate these at the points (ω^j)^2 for 0 <= j <= m-1
°But these are really just m/2 different points, since
  (ω^(j+m/2))^2 = (ω^j * ω^(m/2))^2 = ω^(2j) * ω^m = (ω^j)^2   (because ω^m = 1)

Divide-and-Conquer FFT
  FFT(v, ω, m)
    if m = 1
      return v[0]
    else
      v_even = FFT(v[0:2:m-2], ω^2, m/2)
      v_odd  = FFT(v[1:2:m-1], ω^2, m/2)
      ω_vec  = [ω^0, ω^1, …, ω^(m/2-1)]        (precomputed)
      return [v_even + (ω_vec .* v_odd), v_even - (ω_vec .* v_odd)]
°The .* above is component-wise multiplication.
°The […,…] constructs an m-element vector from two m/2-element vectors.
°This results in an O(m log m) algorithm.
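
A direct, runnable Python transcription of this pseudocode (a sketch with my own names, not from the slides); it keeps the slide's ω = e^(+2πi/m) convention, so the output matches m * np.fft.ifft(v) rather than np.fft.fft(v):

import numpy as np

def fft_dc(v, w):
    # Recursive divide-and-conquer FFT: returns [V(w^0), ..., V(w^(m-1))]
    # for the polynomial V(x) = sum_k v[k] * x^k.
    m = len(v)
    if m == 1:
        return v.copy()
    v_even = fft_dc(v[0::2], w * w)            # v[0:2:m-2] in the pseudocode
    v_odd  = fft_dc(v[1::2], w * w)            # v[1:2:m-1]
    w_vec  = w ** np.arange(m // 2)            # [w^0, w^1, ..., w^(m/2-1)]
    t = w_vec * v_odd                          # component-wise multiply (.*)
    return np.concatenate([v_even + t, v_even - t])

m = 8
v = np.random.rand(m) + 1j * np.random.rand(m)
w = np.exp(2j * np.pi / m)                     # slide's convention
print(np.allclose(fft_dc(v, w), m * np.fft.ifft(v)))   # True
# With w = exp(-2j*pi/m) the same code reproduces np.fft.fft(v).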

An Iterative Algorithm
°The call tree of the divide-and-conquer FFT algorithm is a complete binary tree with log m levels
°Practical algorithms are iterative, going across each level in the tree starting at the bottom
[Figure: the call tree for m = 16. The root is FFT(0,1,2,…,15) = FFT(xxxx); each level splits the indices on one more low-order bit, e.g. FFT(xxx0), FFT(xxx1), then FFT(xx00), FFT(xx10), FFT(xx01), FFT(xx11), …; the leaves are the single elements in bit-reversed order: FFT(0), FFT(8), FFT(4), FFT(12), FFT(2), FFT(10), FFT(6), FFT(14), FFT(1), FFT(9), FFT(5), FFT(13), FFT(3), FFT(11), FFT(7), FFT(15)]
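
For completeness, an iterative radix-2 sketch in the same spirit (my own code, not from the slides): permute the input into the bit-reversed leaf order, then sweep the log m levels of the tree from the bottom up.

import numpy as np

def fft_iterative(v, w):
    # Iterative radix-2 FFT: bit-reverse the input, then combine blocks
    # level by level from the bottom of the call tree upward.
    m = len(v)                      # m must be a power of 2
    levels = m.bit_length() - 1     # log2(m)
    # Permute into bit-reversed order (the leaf order of the call tree).
    rev = [int(format(j, f'0{levels}b')[::-1], 2) for j in range(m)]
    a = np.asarray(v, dtype=complex)[rev]
    size = 2
    while size <= m:                # one pass per level of the tree
        wn = w ** (m // size)       # root of unity for blocks of this size
        half = size // 2
        twiddle = wn ** np.arange(half)
        for start in range(0, m, size):
            even = a[start:start + half].copy()
            odd = twiddle * a[start + half:start + size]
            a[start:start + half] = even + odd
            a[start + half:start + size] = even - odd
        size *= 2
    return a

m = 16
v = np.random.rand(m)
w = np.exp(2j * np.pi / m)                     # slide's convention
print(np.allclose(fft_iterative(v, w), m * np.fft.ifft(v)))   # True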