CS267 L12 Sources of Parallelism(3).1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 12: Sources of Parallelism and Locality (Part 3)

Presentation transcript:

CS267 L12 Sources of Parallelism(3).1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 12: Sources of Parallelism and Locality (Part 3) Tricks with Trees James Demmel

CS267 L12 Sources of Parallelism(3).2 Demmel Sp 1999
Recap of last lecture
° ODEs
  - Sparse matrix-vector multiplication
  - Graph partitioning to balance load and minimize communication
° PDEs
  - Heat Equation and Poisson Equation
  - Solving a certain special linear system T
  - Many algorithms, ranging from
    - Dense Gaussian elimination, slow but very general, to
    - Multigrid, fast but only works on matrices like T

CS267 L12 Sources of Parallelism(3).3 Demmel Sp 1999
Outline
° Continuation of PDEs
  - What do realistic meshes look like?
° Tricks with Trees

CS267 L12 Sources of Parallelism(3).4 Demmel Sp 1999 Partial Differential Equations PDEs

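CS267 L12 Sources of Parallelism(3).5 Demmel Sp 1999
Poisson’s equation in 1D
° Solve Tx=b where

$$T = \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix}$$

° Graph and “stencil”: the graph is a chain of grid points, and each row of T couples a point to itself with weight 2 and to its left and right neighbors with weight -1, giving the stencil (-1, 2, -1)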
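To make the link between the matrix and the stencil concrete, here is a minimal sketch in plain Python. It assumes zero boundary values and the standard tridiagonal form of T (2 on the diagonal, -1 beside it); the function names `poisson_1d_matrix`, `apply_stencil_1d`, and `matvec` are illustrative and do not come from the lecture.

```python
# Sketch: the 1D Poisson matrix T and the equivalent matrix-free stencil.
# Assumes zero boundary values; all names here are illustrative.

def poisson_1d_matrix(n):
    """Return the n-by-n tridiagonal matrix T as a list of rows."""
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = 2
        if i > 0:
            T[i][i - 1] = -1
        if i < n - 1:
            T[i][i + 1] = -1
    return T

def apply_stencil_1d(x):
    """Compute T*x with the (-1, 2, -1) stencil, never forming T."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0       # value outside the grid is 0
        right = x[i + 1] if i < n - 1 else 0
        y.append(2 * x[i] - left - right)
    return y

def matvec(A, x):
    """Dense matrix-vector product, used only to check the stencil."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

x = [1, 2, 3, 4, 5]
T = poisson_1d_matrix(len(x))
y_matrix = matvec(T, x)
y_stencil = apply_stencil_1d(x)
print(y_matrix, y_stencil)
```

Both products agree, but the stencil version touches only three values per output entry, which is why the graph of T (a chain) describes the communication pattern directly.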

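CS267 L12 Sources of Parallelism(3).6 Demmel Sp 1999
Poisson’s equation in 2D
° Solve Tx=b where, for example on a 3-by-3 grid numbered row by row,

$$T = \begin{pmatrix} A & -I & \\ -I & A & -I \\ & -I & A \end{pmatrix}, \qquad A = \begin{pmatrix} 4 & -1 & \\ -1 & 4 & -1 \\ & -1 & 4 \end{pmatrix}$$

° Graph and “stencil”: the graph is the 2D grid, and each row of T couples a point to itself with weight 4 and to its four nearest neighbors (up, down, left, right) with weight -1
° 3D is analogous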
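The 2D case can be sketched the same way. The code below assumes an n-by-n grid numbered row by row with zero boundary values; `poisson_2d_matrix` and `apply_stencil_2d` are hypothetical helper names chosen for this illustration.

```python
# Sketch: 2D Poisson matrix for an n-by-n grid and its 5-point stencil.
# Assumes row-major numbering and zero boundary values; names are illustrative.

def poisson_2d_matrix(n):
    """Return the N-by-N matrix T (N = n*n) as a list of rows."""
    N = n * n
    T = [[0] * N for _ in range(N)]
    for i in range(n):
        for j in range(n):
            k = i * n + j                      # index of grid point (i, j)
            T[k][k] = 4
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    T[k][ii * n + jj] = -1     # edge to a grid neighbor
    return T

def apply_stencil_2d(u):
    """Apply the 5-point stencil to the grid u without forming T."""
    n = len(u)

    def at(i, j):
        return u[i][j] if 0 <= i < n and 0 <= j < n else 0

    return [[4 * u[i][j] - at(i - 1, j) - at(i + 1, j)
             - at(i, j - 1) - at(i, j + 1)
             for j in range(n)] for i in range(n)]

u = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
T2 = poisson_2d_matrix(3)
flat = [v for row in u for v in row]
y_matrix = [sum(a * b for a, b in zip(row, flat)) for row in T2]
y_stencil = [v for row in apply_stencil_2d(u) for v in row]
print(y_matrix == y_stencil)
```

Each row of T has at most five nonzeros regardless of grid size, so a matrix-vector product costs O(N) work for N unknowns.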

CS267 L12 Sources of Parallelism(3).7 Demmel Sp 1999
Algorithms for 2D Poisson Equation with N unknowns

Algorithm        Serial       PRAM               Memory       #Procs
° Dense LU       N^3          N                  N^2          N^2
° Band LU        N^2          N                  N^(3/2)      N
° Jacobi         N^2          N                  N            N
° Explicit Inv.  N^2          log N              N^2          N^2
° Conj. Grad.    N^(3/2)      N^(1/2) * log N    N            N
° RB SOR         N^(3/2)      N^(1/2)            N            N
° Sparse LU      N^(3/2)      N^(1/2)            N * log N    N
° FFT            N * log N    log N              N            N
° Multigrid      N            log^2 N            N            N
° Lower bound    N            log N              N

PRAM is an idealized parallel model with zero cost communication (see next slide for explanation)
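As one concrete row of the table, the following is a minimal sketch of Jacobi iteration for the 2D model problem, assuming zero boundary values and a right-hand side of all ones. The helper names are illustrative. Each sweep costs O(N) work and every grid point can be updated independently; the N^2 serial cost in the table reflects the fact that on the order of N sweeps are needed.

```python
# Sketch: Jacobi iteration for the 2D Poisson model problem T u = b.
# Assumes zero boundary values; names and the test problem are illustrative.

def neighbors_sum(u, i, j):
    """Sum of the four grid neighbors of (i, j), treating outside as 0."""
    n = len(u)
    total = 0.0
    for ii, jj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
        if 0 <= ii < n and 0 <= jj < n:
            total += u[ii][jj]
    return total

def jacobi_sweep(u, b):
    """One Jacobi sweep: solve each equation for its diagonal unknown."""
    n = len(u)
    return [[(b[i][j] + neighbors_sum(u, i, j)) / 4.0
             for j in range(n)] for i in range(n)]

def max_residual(u, b):
    """Largest entry of |b - T u| over the grid."""
    n = len(u)
    return max(abs(b[i][j] - (4.0 * u[i][j] - neighbors_sum(u, i, j)))
               for i in range(n) for j in range(n))

n = 4
b = [[1.0] * n for _ in range(n)]
u = [[0.0] * n for _ in range(n)]
initial = max_residual(u, b)
for _ in range(200):
    u = jacobi_sweep(u, b)
final = max_residual(u, b)
print(initial, final)
```

The new grid is built entirely from the old one, so all N updates in a sweep are independent, which is what makes Jacobi easy to parallelize even though it converges slowly.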

CS267 L12 Sources of Parallelism(3).8 Demmel Sp 1999
Relation of Poisson’s equation to Gravity, Electrostatics
° Force on particle at (x,y,z) due to particle at 0 is -(x,y,z)/r^3, where r = sqrt(x^2 + y^2 + z^2)
° Force is also gradient of potential V = -1/r:
  force = -(d/dx V, d/dy V, d/dz V) = -grad V
° V satisfies Poisson’s equation (try it!)
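The "try it!" can be checked numerically. The sketch below uses central finite differences at one sample point away from the origin; the point (1, 2, 2), the step sizes, and the function names are assumptions made for illustration. Away from the origin, where there is no mass, the Laplacian of V should come out as approximately zero.

```python
# Sketch: numerical check that -grad V equals -(x,y,z)/r^3 and that the
# Laplacian of V = -1/r vanishes away from the origin.
# Sample point and step sizes are illustrative choices.
import math

def V(x, y, z):
    return -1.0 / math.sqrt(x * x + y * y + z * z)

def force(x, y, z, h=1e-5):
    """-grad V by central differences."""
    return [-(V(x + h, y, z) - V(x - h, y, z)) / (2 * h),
            -(V(x, y + h, z) - V(x, y - h, z)) / (2 * h),
            -(V(x, y, z + h) - V(x, y, z - h)) / (2 * h)]

def laplacian_V(x, y, z, h=1e-3):
    """d2V/dx2 + d2V/dy2 + d2V/dz2 by the 7-point difference formula."""
    return (V(x + h, y, z) + V(x - h, y, z)
            + V(x, y + h, z) + V(x, y - h, z)
            + V(x, y, z + h) + V(x, y, z - h)
            - 6.0 * V(x, y, z)) / (h * h)

p = (1.0, 2.0, 2.0)                     # here r = 3
r = math.sqrt(sum(c * c for c in p))
expected_force = [-c / r**3 for c in p]
numeric_force = force(*p)
lap = laplacian_V(*p)
print(numeric_force, expected_force, lap)
```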

CS267 L12 Sources of Parallelism(3).9 Demmel Sp 1999
Comments on practical meshes
° Regular 1D, 2D, 3D meshes
  - Important as building blocks for more complicated meshes
° Practical meshes are often irregular
  - Composite meshes, consisting of multiple “bent” regular meshes joined at edges
  - Unstructured meshes, with arbitrary mesh points and connectivities
  - Adaptive meshes, which change resolution during solution process to put computational effort where needed

CS267 L12 Sources of Parallelism(3).10 Demmel Sp 1999 Composite mesh for a mechanical structure

CS267 L12 Sources of Parallelism(3).11 Demmel Sp 1999 Converting the mesh to a matrix

CS267 L12 Sources of Parallelism(3).12 Demmel Sp 1999 Effects of Ordering Rows and Columns on Gaussian Elimination
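One standard illustration of this effect (an assumption here, not necessarily the example shown on the slide) is the "arrow" matrix: a symmetric matrix with a full diagonal plus one dense row and column. The sketch below counts fill-in, meaning zeros that become nonzero during Gaussian elimination without pivoting, working only on the sparsity pattern. The names `arrow` and `fill_in` are illustrative.

```python
# Sketch: symbolic Gaussian elimination on a sparsity pattern, counting fill.
# The arrow-matrix example and all names are illustrative assumptions.

def arrow(n, hub):
    """Pattern with a nonzero diagonal plus a dense row and column `hub`."""
    rows = []
    for i in range(n):
        if i == hub:
            rows.append(set(range(n)))      # dense row
        else:
            rows.append({i, hub})           # diagonal entry plus hub column
    return rows

def fill_in(pattern):
    """Count entries that turn nonzero during elimination (no pivoting)."""
    nz = [set(row) for row in pattern]      # nz[i] = nonzero columns of row i
    n = len(nz)
    fill = 0
    for k in range(n):
        for i in range(k + 1, n):
            if k in nz[i]:                  # row i is updated by pivot row k
                for j in nz[k]:
                    if j > k and j not in nz[i]:
                        nz[i].add(j)
                        fill += 1
    return fill

n = 6
fill_hub_first = fill_in(arrow(n, 0))       # dense row and column ordered first
fill_hub_last = fill_in(arrow(n, n - 1))    # same matrix, dense row ordered last
print(fill_hub_first, fill_hub_last)
```

With the dense row and column first, the first elimination step fills the whole trailing block; with them last, no fill occurs. The matrix is the same up to a symmetric permutation, so only the ordering differs.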

CS267 L12 Sources of Parallelism(3).13 Demmel Sp 1999 Irregular mesh: NASA Airfoil in 2D (direct solution)

CS267 L12 Sources of Parallelism(3).14 Demmel Sp 1999 Irregular mesh: Tapered Tube (multigrid)

CS267 L12 Sources of Parallelism(3).15 Demmel Sp 1999
Adaptive Mesh Refinement (AMR)
° Adaptive mesh around an explosion
° John Bell and Phil Colella at LBL (see class web page for URL)
° Goal of Titanium is to make these algorithms easier to implement in parallel

CS267 L12 Sources of Parallelism(3).16 Demmel Sp 1999
Challenges of irregular meshes (and a few solutions)
° How to generate them in the first place
  - Triangle, a 2D mesh generator by Jonathan Shewchuk
° How to partition them
  - ParMetis, a parallel graph partitioner
° How to design iterative solvers
  - PETSc, a Portable Extensible Toolkit for Scientific Computing
  - Prometheus, a multigrid solver for finite element problems on irregular meshes
  - Titanium, a language to implement Adaptive Mesh Refinement
° How to design direct solvers
  - SuperLU, parallel sparse Gaussian elimination
° These are challenges to do sequentially, the more so in parallel