PIMA: Partition Improvement using Mesh Adjacencies

PIMA-motivation
PIMA: Partition Improvement using Mesh Adjacencies
- Parallel simulation requires that the mesh be distributed with equal workload and minimum inter-part communication.
- Graph/hypergraph partitioners are powerful for unstructured meshes. However, they use only one type of mesh entity as the graph nodes, so the balance of the other mesh entity types may not be optimal.
- Different applications have different load-balance requirements, and several types of mesh entities may have to be balanced at the same time.
- LIIPBMod* was developed to knock down the vertex peaks.
*M. Zhou, O. Sahni, K.D. Devine, M.S. Shephard, K.E. Jansen, "Controlling unstructured mesh partitions for massively parallel simulations", SIAM Journal on Scientific Computing, 32(6), 2010

PIMA-advantages
- Imbalance in partitions obtained by graph/hypergraph-based methods is limited to a small number of heavily loaded parts, referred to as spikes. These peaks limit the scalability of applications.
- Uses mesh adjacencies: richer information than graph/hypergraph-based methods use, and therefore a chance to provide better partitions.
- All adjacencies are obtainable in O(1) operations (not a function of mesh size), so the algorithm is efficient.
- Takes advantage of neighborhood communication: works well on massively parallel computations, since communication stays limited even at extreme scale.
PIMA is designed to migrate a small number of mesh entities on inter-part boundaries from heavily loaded parts to their lightly loaded neighbors to improve load balance.

PIMA-algorithm
- Input from users:
  - The types of mesh entities that need to be balanced (Rgn, Face, Edge, Vtx)
  - The relative importance (priority) between them (= or >), e.g., "Vtx=Edge>Rgn" or "Rgn>Face=Edge>Vtx". Entity types not specified in the input are not balanced.
- Steps of PIMA:
  - From high to low priority for entity types separated by ">" (different groups)
  - From low to high dimension, based on entity topology, for entity types separated by "=" (same group)
- Example, for the user input "Rgn>Face=Edge>Vtx":
  - Step 1: reduce the spikes for mesh regions
  - Step 2.1: reduce the spikes for mesh edges
  - Step 2.2: reduce the spikes for mesh faces
  - Step 3: reduce the spikes for mesh vertices
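The ordering rule above can be sketched in a few lines of Python. This is only an illustration of how a priority string might be turned into an ordered list of balancing steps; the function name `balance_order` and the `ENTITY_DIM` table are hypothetical and are not taken from the PIMA implementation.

```python
# Topological dimension of each mesh entity type named in the slides.
ENTITY_DIM = {"Vtx": 0, "Edge": 1, "Face": 2, "Rgn": 3}


def balance_order(priority):
    """Return priority groups, highest priority first.

    Groups are separated by ">"; within a group, entity types joined by "="
    are sorted from low to high dimension.
    """
    groups = []
    for group in priority.split(">"):
        names = [name.strip() for name in group.split("=")]
        unknown = [n for n in names if n not in ENTITY_DIM]
        if unknown:
            raise ValueError(f"unknown entity type(s): {unknown}")
        groups.append(sorted(names, key=ENTITY_DIM.__getitem__))
    return groups


if __name__ == "__main__":
    # Prints the step numbering used in the example above.
    for i, group in enumerate(balance_order("Rgn>Face=Edge>Vtx"), start=1):
        for j, etype in enumerate(group, start=1):
            label = f"{i}.{j}" if len(group) > 1 else f"{i}"
            print(f"Step {label}: reduce the spikes for {etype}")
```

Entity types missing from the string simply never appear in the result, which matches the rule that unspecified types are not balanced.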

PIMA-algorithm
- PIMA: migrate a small number of mesh entities on inter-part boundaries (candidate mesh entities) to their lightly loaded neighboring parts (candidate parts) to improve the partition.
- Candidate parts:
  - Absolutely and relatively lightly loaded
  - Lightly loaded for the mesh entities in the current group and for those with higher priority
  - The "relatively" lightly loaded criterion and the iterative nature of the algorithm allow diffusive improvement of the partition balance
- Candidate mesh entities:
  - Entities on the inter-part boundaries of heavily loaded parts are selected for migration such that the migration reduces the load imbalance while maintaining or improving the inter-part boundary
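A minimal sketch of the candidate-part test follows. The slides do not give the exact thresholds, so two assumptions are made here and should be read as such: "absolutely lightly loaded" is taken to mean below the average load over all parts, and "relatively lightly loaded" is taken to mean below the load of the heavy part. The function name, argument layout, and sample numbers are all hypothetical.

```python
def candidate_parts(heavy, neighbors, loads, types):
    """Return the neighbors of `heavy` that are light for every type in `types`.

    loads[part][etype] is the entity count of that type on that part.
    `types` should hold the entity types of the current group plus all
    higher-priority types.
    """
    n_parts = len(loads)
    avg = {t: sum(loads[p][t] for p in loads) / n_parts for t in types}
    result = []
    for q in neighbors[heavy]:
        # Assumption: "absolutely" light means below the global average.
        absolutely = all(loads[q][t] < avg[t] for t in types)
        # Assumption: "relatively" light means below the heavy part's load.
        relatively = all(loads[q][t] < loads[heavy][t] for t in types)
        if absolutely and relatively:
            result.append(q)
    return result


# Made-up loads for four parts; part 0 has the vertex spike.
loads = {
    0: {"Vtx": 120, "Rgn": 500},
    1: {"Vtx": 90, "Rgn": 480},
    2: {"Vtx": 95, "Rgn": 530},
    3: {"Vtx": 110, "Rgn": 490},
}
neighbors = {0: [1, 2, 3]}

if __name__ == "__main__":
    # Balancing Vtx while Rgn has higher priority.
    print(candidate_parts(0, neighbors, loads, ["Rgn", "Vtx"]))
```

With these numbers only part 1 qualifies: part 2 is above the average region count and part 3 is above the average vertex count. Repeating the test each iteration with updated loads is what lets load spread outward diffusively.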

PIMA-algorithm: Candidate mesh entities
1. Vertex balance improvement: the vertices on inter-part boundaries bounding a small number of regions
2. Edge balance improvement: the edges on inter-part boundaries bounding a small number of faces
3. Face/Region balance improvement: regions which have two or three faces on inter-part boundaries
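The vertex rule (item 1) can be sketched as a filter over boundary vertices using the vertex-to-region adjacency. The cutoff `max_regions` is an assumed parameter, since the slides only say "a small number", and the names and sample data are hypothetical. Preferring vertices with few adjacent regions presumably keeps the amount of migrated mesh small.

```python
def candidate_vertices(boundary_vertices, vertex_to_regions, max_regions=3):
    """Pick boundary vertices that bound at most `max_regions` regions.

    vertex_to_regions[v] lists the regions on the heavy part adjacent to v.
    The result is ordered with the fewest adjacent regions first.
    """
    picked = [
        v for v in boundary_vertices
        if len(vertex_to_regions[v]) <= max_regions
    ]
    return sorted(picked, key=lambda v: (len(vertex_to_regions[v]), v))


# Made-up adjacency data: vertex 13 is interior, so it is not in the boundary list.
vertex_to_regions = {
    10: ["r1", "r2"],
    11: ["r1", "r2", "r3", "r4", "r5"],
    12: ["r6"],
    13: ["r7", "r8"],
}
boundary = [10, 11, 12]

if __name__ == "__main__":
    print(candidate_vertices(boundary, vertex_to_regions))
```

The edge rule has the same shape, with an edge-to-face adjacency in place of the vertex-to-region one.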

PIMA-Tests
133M-region mesh on 16k parts (tests on the Jaguar Cray XT5 system)
- Table 1: User input
- Table 2: Balance of partitions
- Table 3: Time usage and iterations