Balanced Graph Edge Partition ACM KDD 2014 Florian Bourse ENS Marc Lelarge INRIA-ENS Milan Vojnovic Microsoft Research.

Presentation transcript:

Balanced Graph Partition

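Different Variants
Vertex partition, no aggregation: VP
Vertex partition, aggregation: VPA
Edge partition, no aggregation: EP
Edge partition, aggregation: EPA
Annotations on the slide: "traditional", "PowerGraph [OSDI 2012]", and three "?" marks.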

Questions
Performance benefits of using balanced edge partition as opposed to the more traditional balanced vertex partition?
Practical algorithms for balanced edge partition without aggregation, and their theoretical guarantees?
Streaming heuristics for balanced edge partition?

Costs: Cuts and Loads
Master vertex assignment

Expected Costs of Random Assignments
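
The slide's own formulas did not survive in the transcript. As a rough illustration of the kind of quantity involved, the sketch below places each edge uniformly at random in one of k clusters and counts how many clusters end up holding a copy of a vertex. For a vertex of degree d, the expected count under this policy is k(1 - (1 - 1/k)^d), which is a standard property of independent uniform placement. The function names, the star-shaped toy graph, and the choice of copies-per-vertex as the cost are assumptions made for illustration and are not taken from the slides.

```python
import random
from collections import defaultdict

def expected_copies(degree, k):
    """Expected number of distinct clusters hit by `degree` independent uniform placements."""
    return k * (1.0 - (1.0 - 1.0 / k) ** degree)

def simulate_copies(edges, k, rng):
    """Place each edge in a uniformly random cluster.

    Returns {vertex: number of clusters that hold at least one of its edges}.
    """
    clusters_of = defaultdict(set)
    for u, v in edges:
        c = rng.randrange(k)
        clusters_of[u].add(c)
        clusters_of[v].add(c)
    return {vertex: len(cs) for vertex, cs in clusters_of.items()}

# Toy input (assumption): a star with hub 0 and 50 leaves, split over 8 clusters.
edges = [(0, leaf) for leaf in range(1, 51)]
k = 8
rng = random.Random(0)
trials = 2000
avg_hub = sum(simulate_copies(edges, k, rng)[0] for _ in range(trials)) / trials

print("closed form for the hub:", round(expected_copies(50, k), 3))
print("simulated average      :", round(avg_hub, 3))
```

Each leaf has degree 1 and therefore exactly one copy, while the high-degree hub is replicated on almost every cluster.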

Random Assignment Comparison

Approximation Guarantees

Approximation Guarantees (cont’d)

Streaming Heuristics
Online assignment of vertices or edges as they are observed in an input stream
Irrevocable assignments: reassignments are expensive in web-scale systems (consistency of distributed state)
Use local graph knowledge (neighbourhood sets)
Scalable: one pass through the vertices or edges
Previously proposed streaming heuristic: PowerGraph [OSDI 2012]

PowerGraph Streaming Heuristic
Prioritizes assigning an edge to clusters that already contain its end vertices: prone to large load imbalance
Place e to [...]
Place e to a least loaded cluster
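
The behaviour described on this slide can be sketched as a one-pass loop. This is a simplified reading of the slide text only, not the exact PowerGraph rule (whose intermediate cases are not recoverable from the transcript): prefer clusters that already hold both end vertices, then clusters that hold either, and fall back to a least loaded cluster. The function name `powergraph_like_assign` and the tie-breaking by load and cluster index are assumptions.

```python
from collections import defaultdict

def powergraph_like_assign(edges, k):
    """One-pass edge placement that favours clusters already holding an end vertex.

    Simplified sketch: ties are broken by current load, then by cluster index.
    Returns (placement, load), where placement[i] is the cluster of the i-th edge.
    """
    load = [0] * k
    clusters_of = defaultdict(set)  # vertex -> clusters holding a copy of it
    placement = []
    for u, v in edges:
        both = clusters_of[u] & clusters_of[v]
        either = clusters_of[u] | clusters_of[v]
        # Empty sets are falsy, so this falls through to "any cluster".
        candidates = both or either or range(k)
        c = min(candidates, key=lambda i: (load[i], i))
        placement.append(c)
        load[c] += 1
        clusters_of[u].add(c)
        clusters_of[v].add(c)
    return placement, load

# A path streamed in order: every new edge shares a vertex with the previous one.
path = [(i, i + 1) for i in range(10)]
placement, load = powergraph_like_assign(path, k=4)
print(load)  # [10, 0, 0, 0]
```

On this input every edge follows its neighbour into cluster 0, which is a small instance of the load imbalance the slide warns about.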

Greedy: Least Incremental Cost
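
Only the title of this slide survives, so the following is a hedged sketch of what a least-incremental-cost rule could look like, not the authors' algorithm. It assumes the cost of a partition is the total number of vertex copies across clusters, and it assumes a hard per-cluster edge limit passed in as the hypothetical parameter `capacity`. Each arriving edge goes to the non-full cluster where it creates the fewest new vertex copies, with ties broken by load.

```python
def greedy_least_incremental_cost(edges, k, capacity):
    """Streaming edge placement minimising the immediate increase in vertex copies.

    Assumptions (not from the slides): cost = total vertex copies over all
    clusters; no cluster may hold more than `capacity` edges.
    Returns (placement, load, cost).
    """
    load = [0] * k
    members = [set() for _ in range(k)]  # vertices with a copy in each cluster
    placement = []
    for u, v in edges:
        open_clusters = [i for i in range(k) if load[i] < capacity]
        if not open_clusters:
            raise ValueError("capacity too small for the edge stream")

        def increment(i):
            # Number of new vertex copies cluster i would need for this edge.
            return len({u, v} - members[i])

        c = min(open_clusters, key=lambda i: (increment(i), load[i], i))
        placement.append(c)
        load[c] += 1
        members[c].update((u, v))
    cost = sum(len(m) for m in members)
    return placement, load, cost

path = [(i, i + 1) for i in range(10)]
placement, load, cost = greedy_least_incremental_cost(path, k=4, capacity=3)
print(load, cost)  # [3, 3, 3, 1] 14
```

On the same path input used to illustrate imbalance, the capacity limit forces the stream to spill into a new cluster every three edges, at the price of one extra vertex copy per spill.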

Experimental Evaluation

Performance of Random Assignment
Graph: Amazon

Streaming Heuristics
Graph: Amazon

Performance of Random Assignment (cont’d)
Graph: YouTube

Concluding Remarks

Streaming Heuristics