Slide 1: Parallel Computing and MPI
FLASH Tutorial, May 13, 2004
The Center for Astrophysical Thermonuclear Flashes
An Advanced Simulation & Computing (ASC) Academic Strategic Alliances Program (ASAP) Center at The University of Chicago

Slide 2: What Is Parallel Computing, and Why Is It Useful?
- Parallel computing is more than one CPU working together on one problem
- It is useful when:
  - The problem is large and would take very long on a single processor
  - The data are too big to fit in the memory of one processor
- When to parallelize:
  - The problem can be subdivided into relatively independent tasks
- How much to parallelize:
  - As long as the speedup relative to a single processor remains of the order of the number of processors

Slide 3: Parallel Paradigms
- SIMD (single instruction, multiple data)
  - Processors work in lock-step
- MIMD (multiple instruction, multiple data)
  - Processors proceed independently, with occasional synchronization
- Shared memory
  - One-way (one-sided) communication
- Distributed memory
  - Message passing
- Loosely coupled
  - The process on each CPU is fairly self-contained and relatively independent of processes on other CPUs
- Tightly coupled
  - CPUs need to communicate with each other frequently

Slide 4: How to Parallelize
- Divide the problem into a set of mostly independent tasks
  - Partition the problem
- Give each task its own data
  - Localize the task
- Tasks operate on their own data for the most part
  - Try to make each task self-contained
- Occasionally:
  - Data may be needed from other tasks (inter-process communication)
  - Synchronization may be required between tasks (global operations)
- Map tasks to different processors
  - One processor may get more than one task
  - Task distribution should be well balanced

Slide 5: New Code Components
- Initialization
- Query parallel state
  - Identify the process
  - Identify the number of processes
- Exchange data between processes
  - Local, global
- Synchronization
  - Barriers, blocking communication, locks
- Finalization
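As a minimal sketch (not taken from the slides), these components map onto an MPI program roughly as follows; the runtime parameter nsteps and the broadcast of it are illustrative assumptions.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);                  /* initialization */

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* query: how many processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* query: which one am I */

    int nsteps = 0;
    if (rank == 0) nsteps = 100;             /* e.g. a parameter known only to rank 0 */
    /* exchange data: broadcast the parameter from rank 0 to everyone */
    MPI_Bcast(&nsteps, 1, MPI_INT, 0, MPI_COMM_WORLD);

    MPI_Barrier(MPI_COMM_WORLD);             /* synchronization point */
    printf("rank %d of %d: nsteps = %d\n", rank, nprocs, nsteps);

    MPI_Finalize();                          /* finalization */
    return 0;
}
```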

Slide 6: MPI
- Message Passing Interface: the standard for the distributed-memory model of parallelism
- MPI-2 adds support for one-way (one-sided) communication, commonly associated with shared-memory operations
- Works with communicators, collections of processes
  - MPI_COMM_WORLD is the default
- Supports the lowest-level communication operations as well as composite operations
- Has blocking and non-blocking operations

Slide 7: Communicators
[Figure: processes grouped into two sub-communicators, COMM1 and COMM2]
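The slides do not show code for this, but as an illustrative sketch, sub-communicators like COMM1 and COMM2 can be created by splitting MPI_COMM_WORLD with MPI_Comm_split; splitting the ranks into two halves is an assumption made only for this example.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* split the world into two groups by rank: color 0 and color 1 */
    int color = (rank < nprocs / 2) ? 0 : 1;
    MPI_Comm subcomm;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm);

    int subrank, subsize;
    MPI_Comm_rank(subcomm, &subrank);
    MPI_Comm_size(subcomm, &subsize);
    printf("world rank %d -> group %d, rank %d of %d\n",
           rank, color, subrank, subsize);

    MPI_Comm_free(&subcomm);
    MPI_Finalize();
    return 0;
}
```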

Slide 8: Low-Level Operations in MPI
- MPI_Init
- MPI_Comm_size
  - Find the number of processes
- MPI_Comm_rank
  - Find my process number (rank)
- MPI_Send / MPI_Recv
  - Communicate with other processes one at a time
- MPI_Bcast
  - Global data transmission
- MPI_Barrier
  - Synchronization
- MPI_Finalize
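A small point-to-point example using the calls above: every rank sends one integer to rank 0, which receives them one at a time and accumulates a sum. In practice a single MPI_Reduce would do this; the explicit Send/Recv loop is only to illustrate the low-level operations.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = rank * rank;                 /* some per-process result */

    if (rank != 0) {
        /* every other rank sends its value to rank 0 */
        MPI_Send(&value, 1, MPI_INT, 0, /*tag=*/0, MPI_COMM_WORLD);
    } else {
        int sum = value, incoming;
        for (int src = 1; src < nprocs; src++) {
            MPI_Recv(&incoming, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            sum += incoming;
        }
        printf("rank 0 collected sum = %d\n", sum);
    }

    MPI_Barrier(MPI_COMM_WORLD);             /* make sure everyone is done */
    MPI_Finalize();
    return 0;
}
```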

Slide 9: Advanced Constructs in MPI
- Composite operations
  - Gather/Scatter
  - Allreduce
  - Alltoall
- Cartesian grid operations
  - Shift
- Communicators
  - Creating subgroups of processes to operate on
- User-defined datatypes
- I/O
  - Parallel file operations
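A hedged sketch of one composite operation from the list above: rank 0 scatters equal chunks of an array to all ranks, each rank works on its chunk, and the results are gathered back. The chunk size and the doubling operation are illustrative assumptions.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 4   /* elements handled by each rank (illustrative) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double *full = NULL;
    if (rank == 0) {                       /* only the root holds the full array */
        full = malloc(nprocs * CHUNK * sizeof(double));
        for (int i = 0; i < nprocs * CHUNK; i++) full[i] = i;
    }

    double local[CHUNK];
    /* distribute CHUNK elements to each rank */
    MPI_Scatter(full, CHUNK, MPI_DOUBLE, local, CHUNK, MPI_DOUBLE,
                0, MPI_COMM_WORLD);

    for (int i = 0; i < CHUNK; i++) local[i] *= 2.0;   /* local work */

    /* collect the processed chunks back on rank 0 */
    MPI_Gather(local, CHUNK, MPI_DOUBLE, full, CHUNK, MPI_DOUBLE,
               0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("full[last] = %g\n", full[nprocs * CHUNK - 1]);
        free(full);
    }
    MPI_Finalize();
    return 0;
}
```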

Slide 10: Communication Patterns
[Figure: diagrams of communication patterns among processes 0-3: point to point, shift, all to all, one-to-all broadcast, and collective]

Slide 11: Communication Overheads
- Latency vs. bandwidth
- Blocking vs. non-blocking
  - Overlap
  - Buffering and copying
- Scale of communication
  - Nearest neighbor
  - Short range
  - Long range
- Volume of data
  - Resource contention for links
- Efficiency
  - Hardware, software, communication method
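To make the blocking vs. non-blocking point concrete, here is a sketch (not from the slides) of overlapping communication with computation using MPI_Isend/MPI_Irecv in a ring; the buffer names and the dummy "interior" work are illustrative.

```c
#include <mpi.h>
#include <stdio.h>

#define N 1024

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int right = (rank + 1) % nprocs;            /* ring neighbors */
    int left  = (rank - 1 + nprocs) % nprocs;

    double sendbuf[N], recvbuf[N], interior = 0.0;
    for (int i = 0; i < N; i++) sendbuf[i] = rank;

    MPI_Request reqs[2];
    /* post non-blocking receive and send, then do useful work */
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* computation that does not need the incoming data overlaps
       with the message transfer */
    for (int i = 0; i < N; i++) interior += sendbuf[i] * 0.5;

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);  /* now the data are safe to use */
    printf("rank %d: interior=%g, first received=%g\n",
           rank, interior, recvbuf[0]);

    MPI_Finalize();
    return 0;
}
```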

Slide 12: Parallelism in FLASH
- Short-range communications
  - Nearest neighbor
- Long-range communications
  - Regridding
- Other global operations
  - All-reduce operations on physical quantities
  - Specific to solvers
    - Multipole method
    - FFT-based solvers

Slide 13: Domain Decomposition
[Figure: the computational domain divided into four patches assigned to processors P0, P1, P2, and P3]

Slide 14: Border Cells / Ghost Points
- When solnData is split across processors, each processor needs data from the others
- Each processor keeps a layer of cells copied from each neighboring processor
- These cells must be updated every time step

Slide 15: Border/Ghost Cells
[Figure: short-range communication filling ghost cells from neighboring processors]

Slide 16: Two MPI Methods for Filling Ghost Cells
Method 1: Cartesian topology
- MPI_Cart_create
  - Create the topology
- MPE_Decomp1d
  - Domain decomposition on the topology
- MPI_Cart_shift
  - Who is on the left/right?
- MPI_Sendrecv
  - Fill ghost cells on the left
- MPI_Sendrecv
  - Fill ghost cells on the right
Method 2: Manual decomposition
- MPI_Comm_rank
- MPI_Comm_size
- Manually decompose the grid over processors
- Calculate left/right neighbors
- MPI_Send / MPI_Recv
  - Ordered carefully to avoid deadlocks
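A minimal 1D sketch of the first method, assuming a fixed per-process block size in place of MPE_Decomp1d; the array name u and the size NLOCAL are illustrative, and the boundaries are non-periodic (MPI_Cart_shift then returns MPI_PROC_NULL at the ends, which MPI_Sendrecv handles as a no-op).

```c
#include <mpi.h>
#include <stdio.h>

#define NLOCAL 8   /* interior cells per process (illustrative) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 1D Cartesian topology, non-periodic */
    int dims[1] = { nprocs }, periods[1] = { 0 };
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, /*reorder=*/1, &cart);

    int left, right;
    MPI_Cart_shift(cart, /*dim=*/0, /*disp=*/1, &left, &right);

    /* local array: u[0] and u[NLOCAL+1] are the ghost cells */
    double u[NLOCAL + 2];
    for (int i = 0; i < NLOCAL + 2; i++) u[i] = rank;

    /* send rightmost interior cell to the right, receive my left ghost cell */
    MPI_Sendrecv(&u[NLOCAL], 1, MPI_DOUBLE, right, 0,
                 &u[0],      1, MPI_DOUBLE, left,  0,
                 cart, MPI_STATUS_IGNORE);
    /* send leftmost interior cell to the left, receive my right ghost cell */
    MPI_Sendrecv(&u[1],          1, MPI_DOUBLE, left,  1,
                 &u[NLOCAL + 1], 1, MPI_DOUBLE, right, 1,
                 cart, MPI_STATUS_IGNORE);

    printf("rank %d: left ghost=%g right ghost=%g\n",
           rank, u[0], u[NLOCAL + 1]);

    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
}
```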

Slide 17: Adaptive Grid Issues
- The discretization is not uniform
- Simple left/right guard-cell fills are inadequate
- Adjacent grid points may not be mapped to nearest neighbors in the processor topology
- Redistribution of work becomes necessary

Slide 18: Regridding
- The number of cells/blocks changes
- Some processors get more work than others
- Load imbalance
- Redistribute data to even out the work on all processors
- Long-range communications
- Large quantities of data moved

Slide 19: Regridding (figure)

Slide 20: Other Parallel Operations in FLASH
- Global max/sum etc. (Allreduce)
  - Physical quantities
  - In solvers
  - Performance monitoring
- Alltoall
  - FFT-based solver on the uniform grid (UG)
- User-defined datatypes and file operations
  - Parallel I/O
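As an illustrative sketch of the first item (not FLASH code), a global maximum of some locally computed physical quantity can be obtained on every rank with a single MPI_Allreduce; the random local values are placeholders.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* each rank holds some local quantity (placeholder values) */
    srand(rank + 1);
    double local_max = (double)(rand() % 1000);

    /* every rank receives the global maximum */
    double global_max;
    MPI_Allreduce(&local_max, &global_max, 1, MPI_DOUBLE, MPI_MAX,
                  MPI_COMM_WORLD);

    printf("rank %d: local=%g global=%g\n", rank, local_max, global_max);

    MPI_Finalize();
    return 0;
}
```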

