

All-to-All Pattern A pattern where all (slave) processes can communicate with each other Somewhat the worst case scenario! ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson. slides3b.ppt Oct 14, 2014

Iterative synchronous patterns. When a pattern is repeated until some termination condition occurs. Synchronization at each iteration, to establish the termination condition, often a global condition. Note that this is actually two patterns joined together sequentially if we call iteration a pattern. [Figure: the pattern, followed by a termination-condition check that either repeats the pattern or stops.] Note that these pattern names are our names.

Iterative synchronous all-to-all pattern. [Figure: the all-to-all pattern, followed by a termination-condition check that either repeats the pattern or stops.] Implemented in Seeds as the “CompleteSynchGraph” pattern.

Application examples
• N-body problem: finding positions and movements of bodies subject to forces from other bodies. Could be bodies in space (gravitational N-body problem), molecular bodies, fundamental particles, ... Each iteration models one time interval.
• Solving a system of linear equations by iteration: each unknown is given in terms of the other unknowns. An initial guess is made for the unknowns. Iterate to converge on the solution.

Gravitational N-Body Problem. Finding positions and movements of bodies in space subject to gravitational forces from other bodies. Use Newtonian laws of physics. The gravitational force between two bodies of masses $m_a$ and $m_b$ is $F = \frac{G m_a m_b}{r^2}$, where $G$ is the gravitational constant and $r$ is the distance between the bodies. Subject to forces, a body accelerates according to Newton's 2nd law, $F = ma$, where $m$ is the mass of the body, $F$ is the force it experiences, and $a$ is the resultant acceleration.
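The two laws above translate directly into code. The following is a minimal sketch in C, assuming SI units; the function names `grav_force` and `acceleration` and the macro `GRAV_CONST` are hypothetical names chosen for illustration and do not come from the slides.

```c
/* Newtonian gravitational constant in SI units (m^3 kg^-1 s^-2). */
#define GRAV_CONST 6.674e-11

/* Magnitude of the gravitational force between two bodies of masses
   m_a and m_b separated by distance r: F = G * m_a * m_b / r^2.
   The caller must ensure r is non-zero. */
double grav_force(double m_a, double m_b, double r)
{
    return GRAV_CONST * m_a * m_b / (r * r);
}

/* Acceleration of a body of mass m under force F, from F = m * a. */
double acceleration(double force, double m)
{
    return force / m;
}
```

Because the force falls off as 1/r^2, doubling the distance between two bodies quarters the force between them.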

Details. Force: first compute the force, $F = \frac{G m_a m_b}{r^2}$. Velocity (from $F = ma$): let the time interval be $\Delta t$. Then $F = \frac{m (v^{t+1} - v^t)}{\Delta t}$, where the mass of the body is $m$, $v^{t+1}$ is the velocity at time $t + 1$ and $v^t$ is the velocity at time $t$. The new velocity is $v^{t+1} = v^t + \frac{F \Delta t}{m}$. Position: over interval $\Delta t$, position changes by $x^{t+1} - x^t = v \Delta t$, where $x^t$ is its position at time $t$. Once bodies move to new positions, forces change. Computation has to be repeated to find movement over time.
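The velocity and position updates above can be sketched for a single axis as follows. This is an illustrative sketch, not the course's code: the struct `Body1D` and the function `advance` are hypothetical names, and using the newly computed velocity for the position update is an assumption (the slide does not say which velocity is meant).

```c
/* State of one body along a single axis (hypothetical layout). */
typedef struct {
    double mass;
    double pos;
    double vel;
} Body1D;

/* Advance one body by one time interval dt under a force that is
   treated as constant over the interval:
   v_new = v + F * dt / m, then x_new = x + v_new * dt. */
void advance(Body1D *b, double force, double dt)
{
    b->vel += force * dt / b->mass;   /* new velocity */
    b->pos += b->vel * dt;            /* new position, using the new velocity */
}
```

Treating the force as constant over the interval is what makes a small time interval necessary: the forces are only recomputed once per iteration.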

This then gives the velocity and positions in three directions.

This then gives the velocity and positions in two directions.

Two-dimensional space: a little easier to visualize. [Figure: a body in the x-y plane at distance r from another body, showing the force on the body and its resulting movement.] Add the force caused by each body in the x and y directions.

Code for 2-D Gravitational N-body problem. Suppose a table is used to hold initial and computed data over time steps:

Body   Mass   Position in     Position in     Velocity in     Velocity in
              x direction     y direction     x direction     y direction
1
2
.
N

On each iteration, positions and velocities are updated. The table can be used to display the movement of the bodies.

Sequential Code. The overall gravitational N-body computation can be described by the following steps:

    for (t = 0; t < tmax; t++) {            // for each time period
        for (i = 0; i < N; i++) {           // for body i, calculate force on body due to other bodies
            Fx[i] = 0; Fy[i] = 0;           // reset accumulated force for this time period
            for (j = 0; j < N; j++) {
                if (i != j) {               // for different bodies
                    x_diff = ... ;          // compute distance between body i and body j in x direction
                    y_diff = ... ;          // compute distance between body i and body j in y direction
                    r = ... ;               // compute distance r
                    F = ... ;               // compute force on bodies
                    Fx[i] += ... ;          // resolve and accumulate force in x direction
                    Fy[i] += ... ;          // resolve and accumulate force in y direction
                }
            }
        }
        for (i = 0; i < N; i++) {           // for each body, update positions and velocity
            A[i][x_velocity] = ... ;        // new velocity in x direction
            A[i][y_velocity] = ... ;        // new velocity in y direction
            A[i][x_position] = ... ;        // new position in x direction
            A[i][y_position] = ... ;        // new position in y direction
        }
    }                                       // end time period

Time complexity. The brute-force sequential algorithm is an O(N^2) algorithm for one iteration, as each of the N bodies is influenced by each of the other N - 1 bodies. For t iterations, O(N^2 t). Not feasible to use this direct algorithm for most interesting N-body problems, where N is very large.

Reducing time complexity. Time complexity can be reduced by approximating a cluster of distant bodies as a single distant body, with the total mass of the cluster sited at the center of mass of the cluster.
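The single body that stands in for a cluster has the cluster's total mass and sits at its center of mass. A minimal sketch in C, using the hypothetical names `PointMass` and `cluster_summary`:

```c
#include <stddef.h>

/* A mass at a position in 2-D space (hypothetical layout). */
typedef struct {
    double mass, x, y;
} PointMass;

/* Replace a cluster of n bodies by one equivalent body: total mass,
   located at the mass-weighted average position (center of mass). */
PointMass cluster_summary(const PointMass *c, size_t n)
{
    PointMass s = {0.0, 0.0, 0.0};
    for (size_t i = 0; i < n; i++) {
        s.mass += c[i].mass;
        s.x += c[i].mass * c[i].x;
        s.y += c[i].mass * c[i].y;
    }
    if (s.mass > 0.0) {      /* avoid dividing by zero for an empty cluster */
        s.x /= s.mass;
        s.y /= s.mass;
    }
    return s;
}
```

A distant body then interacts with this one summary instead of with each of the n bodies in the cluster, which is where the saving comes from.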

Barnes-Hut Algorithm. 3-D space: start with the whole space, in which one cube contains the bodies.
• First, this cube is divided into eight subcubes.
• If a subcube contains no bodies, the subcube is deleted from further consideration.
• If a subcube contains one body, the subcube is retained.
• If a subcube contains more than one body, it is recursively divided until every subcube contains one body.

Creates an octree: a tree with up to eight edges from each vertex (node). Leaves represent cells each containing one body. After the tree is constructed, the total mass and center of mass of each subcube are stored at its vertex (node).

Force on each body obtained by traversing tree starting at root, stopping at a node when the clustering approximation can be used, e.g. when r is greater than some specified distance D. Constructing tree requires a time of O(NlogN), and so does computing all the forces, so that overall time complexity of method is O(NlogN).

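Example for 2-dimensional space. Construction of tree: at each vertex, store the coordinates of the center of mass and the total mass of the bodies in the space below. [Figure: a 2-D area recursively divided into quadrants and the corresponding tree; each leaf holds one body.]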

Computing force on each body: for each body, traverse the tree starting at the root, stopping at a node when the clustering approximation can be used, i.e. when r is greater than some set distance D. [Figure: tree traversal; each node holds the mass and the coordinates of the center of mass of the bodies in its subspace.]

Orthogonal Recursive Bisection An alternative way of dividing space. (For 2-dimensional area) First, a vertical line found that divides area into two areas each with equal number of bodies. For each area, a horizontal line found that divides it into two areas each with equal number of bodies. Repeated as required.
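Orthogonal recursive bisection can be sketched with a sort at each level: sorting along one axis and cutting at the middle element gives two areas with (nearly) equal numbers of bodies. The names `Pt`, `cmp_x`, `cmp_y` and `orb` are hypothetical, and sorting with `qsort` is a simplification chosen for brevity (a median-selection routine would avoid the full sort).

```c
#include <stdlib.h>

/* A body position plus the id of the area it ends up in. */
typedef struct {
    double x, y;
    int part;
} Pt;

static int cmp_x(const void *a, const void *b)
{
    double d = ((const Pt *)a)->x - ((const Pt *)b)->x;
    return (d > 0) - (d < 0);
}

static int cmp_y(const void *a, const void *b)
{
    double d = ((const Pt *)a)->y - ((const Pt *)b)->y;
    return (d > 0) - (d < 0);
}

/* Bisect pts[0..n-1] `levels` times, alternating vertical cuts (split on x)
   and horizontal cuts (split on y). Areas are numbered from first_id.
   The array is reordered in place. */
void orb(Pt *pts, size_t n, int levels, int vertical, int first_id)
{
    if (levels == 0) {
        for (size_t i = 0; i < n; i++)
            pts[i].part = first_id;
        return;
    }
    qsort(pts, n, sizeof *pts, vertical ? cmp_x : cmp_y);
    size_t half = n / 2;    /* the dividing line goes after the first half */
    orb(pts, half, levels - 1, !vertical, first_id);
    orb(pts + half, n - half, levels - 1, !vertical,
        first_id + (1 << (levels - 1)));
}
```

Calling `orb(pts, n, 2, 1, 0)` produces four areas numbered 0 to 3, starting with a vertical dividing line.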

Solving General System of Linear Equations by iteration. Suppose the equations are of a general form with n equations and n unknowns: $a_{i,0} x_0 + a_{i,1} x_1 + a_{i,2} x_2 + \dots + a_{i,n-1} x_{n-1} = b_i$ for $0 \le i < n$, where the unknowns are $x_0, x_1, x_2, \dots, x_{n-1}$.

By rearranging the ith equation: $x_i = \frac{1}{a_{i,i}} \left[ b_i - \sum_{j \ne i} a_{i,j} x_j \right]$. This equation gives $x_i$ in terms of the other unknowns. It can be used as an iteration formula for each of the unknowns to obtain better approximations. Process i computes $x_i$.

Suppose each process computes one unknown: process $P_i$ computes $x_i$. Process $P_i$ needs the unknowns from all other processes ($P_0$ to $P_{n-1}$, excluding $P_i$) on each iteration. This needs the iterative synchronous all-to-all pattern.

Jacobi Iteration. Name given to a computation that uses the previous iteration values to compute the next values. All values of x are updated together. Other (non-parallel) methods use some of the present iteration values to compute the present values, see later.
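One Jacobi iteration can be sketched sequentially in C as follows; in the parallel version each process would compute one `x_new[i]` and then exchange it with all the others. The function name `jacobi_sweep` and the row-major storage of the coefficients are assumptions made for this illustration.

```c
#include <math.h>
#include <stddef.h>

/* One Jacobi iteration for an n x n system stored row-major in a[]:
   x_new[i] = (b[i] - sum over j != i of a[i][j] * x_old[j]) / a[i][i].
   Only values from the previous iteration (x_old) are read, so every
   x_new[i] can be computed independently. Returns the largest change
   in any unknown. Diagonal entries must be non-zero. */
double jacobi_sweep(size_t n, const double *a, const double *b,
                    const double *x_old, double *x_new)
{
    double max_change = 0.0;
    for (size_t i = 0; i < n; i++) {
        double sum = b[i];
        for (size_t j = 0; j < n; j++)
            if (j != i)
                sum -= a[i * n + j] * x_old[j];
        x_new[i] = sum / a[i * n + i];
        double change = fabs(x_new[i] - x_old[i]);
        if (change > max_change)
            max_change = change;
    }
    return max_change;
}
```

Writing into a separate `x_new` array is what makes this Jacobi iteration: overwriting `x_old` in place would let later unknowns see present-iteration values, which is the other family of methods mentioned above.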

Convergence: it can be proven that the Jacobi method will converge if the diagonal values of a have an absolute value greater than the sum of the absolute values of the other a's on the row, i.e. if $\sum_{j \ne i} |a_{i,j}| < |a_{i,i}|$ for every row i. This condition is a sufficient but not a necessary condition.
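The row condition above is cheap to check before iterating. A minimal sketch in C, with the hypothetical function name `is_diagonally_dominant` and row-major storage assumed:

```c
#include <math.h>
#include <stddef.h>

/* Returns 1 if, on every row of the n x n matrix a (row-major), the
   absolute value of the diagonal entry is strictly greater than the sum
   of the absolute values of the other entries on that row; 0 otherwise. */
int is_diagonally_dominant(size_t n, const double *a)
{
    for (size_t i = 0; i < n; i++) {
        double off_diagonal = 0.0;
        for (size_t j = 0; j < n; j++)
            if (j != i)
                off_diagonal += fabs(a[i * n + j]);
        if (fabs(a[i * n + i]) <= off_diagonal)
            return 0;
    }
    return 1;
}
```

Since the condition is sufficient but not necessary, a return value of 0 does not mean the Jacobi method will diverge, only that convergence is not guaranteed by this test.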

Termination. A simple, common approach is to compare the values computed in one iteration to the values obtained from the previous iteration. Terminate the computation when all values are within a given tolerance, i.e. when $|x_i^t - x_i^{t-1}| < \text{error tolerance}$ for all i, where $x_i^t$ is the value of $x_i$ at iteration t. However, this does not guarantee the solution to that accuracy. Why?
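The termination test can be sketched as a loop around the Jacobi update. A small change between iterations only shows that the values have stopped moving much, which also happens when convergence is slow, so the sketch includes a separate residual check. The names `jacobi_solve` and `residual_norm`, the iteration limit and the return codes are assumptions made for this illustration.

```c
#include <math.h>
#include <stdlib.h>
#include <string.h>

/* Iterate on x (initial guess in, result out) until every unknown changes
   by less than tol between iterations. Returns the number of iterations
   used, -1 if max_iter was reached first, or -2 if memory ran out. */
int jacobi_solve(size_t n, const double *a, const double *b,
                 double *x, double tol, int max_iter)
{
    double *next = malloc(n * sizeof *next);
    if (!next) return -2;
    int iter;
    for (iter = 1; iter <= max_iter; iter++) {
        double max_change = 0.0;
        for (size_t i = 0; i < n; i++) {
            double sum = b[i];
            for (size_t j = 0; j < n; j++)
                if (j != i)
                    sum -= a[i * n + j] * x[j];
            next[i] = sum / a[i * n + i];
            double change = fabs(next[i] - x[i]);
            if (change > max_change)
                max_change = change;
        }
        memcpy(x, next, n * sizeof *x);   /* all values updated together */
        if (max_change < tol)
            break;
    }
    free(next);
    return iter <= max_iter ? iter : -1;
}

/* Largest |b_i - sum_j a[i][j] * x[j]|: how well x satisfies the equations. */
double residual_norm(size_t n, const double *a, const double *b,
                     const double *x)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double r = b[i];
        for (size_t j = 0; j < n; j++)
            r -= a[i * n + j] * x[j];
        if (fabs(r) > worst)
            worst = fabs(r);
    }
    return worst;
}
```

Checking `residual_norm` after `jacobi_solve` returns gives an independent measure of how well the result satisfies the original equations.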

Convergence Rate

Questions

More Information

Seeds “CompleteSynchGraph” Pattern. All-to-all pattern that includes a synchronous iteration feature to pass the results of one iteration to all nodes before the next iteration. Workers get replicas of the initial data set. At each iteration, workers synchronize, update their replicas and proceed to new computations. The master node and framework will not get control of the data flow until all iterations are done. More information: “Seeds Framework – The CompleteSynchGraph Template Tutorial,” Jeremy Villalobos and Yawo K. Adibolo, June 18, 2012, at http://coitweb.uncc.edu/~abw/PatternProgGroup/index.html (to be moved). Gives details with code for the Jacobi iteration method of solving a system of linear equations.

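A system of linear equations can be described in matrix-vector form: $AX = B$, where $A$ is the matrix of coefficients, $X$ is the vector of unknowns and $B$ is the vector of constants. It is possible to create an iteration formula: $X^k = C X^{k-1} + D$, where $X^k$ is the solution vector at iteration k, $X^{k-1}$ is the solution vector at iteration k-1, $C$ is a matrix derived from the input matrix $A$, and $D$ is a vector derived from the input vector $B$.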
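For the Jacobi method, one way to obtain $C$ and $D$ is to split $A$ into its diagonal part and the rest. The symbols $A_{\mathrm{diag}}$ and $A_{\mathrm{off}}$ below are notation introduced here for illustration; they do not appear in the slides.

```latex
\begin{align*}
A &= A_{\mathrm{diag}} + A_{\mathrm{off}}
  && \text{(diagonal entries, and all other entries)} \\
A_{\mathrm{diag}} X &= B - A_{\mathrm{off}} X
  && \text{(rearranging } AX = B \text{)} \\
X^{k} &= -A_{\mathrm{diag}}^{-1} A_{\mathrm{off}} \, X^{k-1}
         + A_{\mathrm{diag}}^{-1} B
  && \text{(previous values on the right-hand side)} \\
C &= -A_{\mathrm{diag}}^{-1} A_{\mathrm{off}}, \qquad
D = A_{\mathrm{diag}}^{-1} B
\end{align*}
```

Row i of this matrix equation is the per-unknown formula $x_i = \frac{1}{a_{i,i}} \left[ b_i - \sum_{j \ne i} a_{i,j} x_j \right]$, since inverting a diagonal matrix just divides each row by $a_{i,i}$.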