MA/CS 471 Lecture 15, Fall 2002 Introduction to Graph Partitioning.


1 MA/CS 471 Lecture 15, Fall 2002 Introduction to Graph Partitioning

2 Graph (or Mesh) Partitioning
 We have so far implemented a finite element Poisson solver.
 The implementation is serial and not immediately suited to parallel computing.
 We have begun to make the algorithm more suitable by switching from the LU-factorization approach to solving the linear system to a conjugate gradient (iterative) algorithm, which does not have the same bottlenecks to parallel computation.

3 Next Step to Parallelism
 Now that we have made sure there are no intrinsically serial computation steps in the system solve, we are free to divide the work between processes.
 We will proceed by deciding which finite-element triangle goes to which processor.

4 Mesh Partitioning
 So far, I have supplied files that include information on which triangle goes to which processor.
 These files were generated using pmetis: http://www-users.cs.umn.edu/~karypis
 pmetis is a serial routine; however, Karypis has written a parallel version that can be used as a library.
 The library is called parmetis.

5–9 (figure slides; no text content)

10 Team Project Continued
 Now we are ready to progress towards making the serial Poisson solver work in parallel.
 This task divides into a number of steps:
 Conversion of umDriver, umMESH, umStartUp, umMatrix and umSolve
 Adding a routine to read in a partition file (or call parMetis to obtain a partition vector)

11 umDriver Modification
 This code should now initialize MPI.
 It should call the umPartition routine.
 It should be modified to find the number of processes and the local process ID (stored in your struct/class).
 It should finalize MPI.

12 umPartition
 This code should read in a partition from file.
 The input should be the name of the partition file, the current process ID (rank), and the number of processes (size).
 The output should be a list of elements belonging to this process.

13 umMESH Modifications
 This routine should now be fed a partition file determining which elements it should read in from the .neu input mesh file.
 You should replace the elmttoelmt part with a piece of code that goes through the .neu file, reads in which element/face lies on the boundary, and uses this to mark whether a node is known or unknown.
 Each process should send a list of its "known" vertices' global numbers to every other process, so all nodes can be correctly identified as lying on the boundary or not.

14 umStartUp Modification
 Remains largely unchanged (depending on how you read in umVertX, umVertY, elmttonode).

15 umMatrix Modification
 This routine should be modified so that, instead of creating the mat matrix, it is fed a vector vec and returns mat*vec.
 IT SHOULD NOT STORE THE GLOBAL MATRIX AT ALL!
 I strongly suggest creating a new routine (umMatrixOP) and, for debugging, comparing its output against using umMatrix to build the matrix and multiply some vector.

16 umSolve Modification
 The major change here is the replacement of umAinvB with a call to your own conjugate gradient solver.
 Note: the rhs vector is filled here with a global gather of the elemental contributions, so this will have to be modified to account for elements on other processes.

17 umCG Modification
 umCG is the routine that should take a rhs and return an approximate solution using CG.
 Each step of the CG algorithm needs to be analyzed to determine the inter-process data dependency.
 For the matrix*vector steps a certain amount of data exchange is required.
 For the dot products an allreduce is required.
 I strongly suggest creating the exchange sequence before the iterations start.

18 Work Partition
 There are approximately six unequal chunks of work to be done.
 I suggest the following code split-up:
 umDriver, umCG
 umPartition, umSolve
 umMESH, umStartUp
 umMatrixOP
 However, you are free to choose.
 Try to minimize the amount of data stored on multiple processes (but do not make the task too difficult by sharing nothing).

19 Discussion and Project Write-Up
 This is a little tricky, so now is the time to form a plan and ask any questions.
 This will be due on Tuesday 22nd October.
 As usual, I need a complete write-up.
 This should include parallel timings and speed-up tests (i.e., for a fixed grid, find the wall-clock time of umCG for Nprocs = 2, 4, 6, 8, 10, 12, 14, 16 and compare in a graph).
 Test the code to make sure it gives the same results (up to convergence tolerance) as the serial code.
 Profile your code using upshot.
 Include pictures showing the partition (use a different colour per partition) and the parallel solution.

