PODC 2007 © 2007 IBM Corporation Constructing Scalable Overlays for Pub/Sub With Many Topics Problems, Algorithms, and Evaluation G. Chockler, R. Melamed,

Presentation transcript:

PODC 2007 © 2007 IBM Corporation
Constructing Scalable Overlays for Pub/Sub With Many Topics: Problems, Algorithms, and Evaluation
G. Chockler, R. Melamed, Y. Tock (IBM Haifa Research Lab)
R. Vitenberg (University of Oslo)

Publish/Subscribe (Pub/Sub)
[Figure: five nodes attached to a Message Bus, with Subscription(N1) = {B,C,D}, N2 {A,B,C,E}, N3 {A,D}, N4 {A,B,X}, N5 {A,X}; labels: Publish(M1, A), M1]

Scalability of Pub/Sub
- Most traditional pub/sub systems are geared towards small-scale deployment
  - E.g., Isis MDS, TIB, MQSeries, Gryphon
- New generation of applications...
  - Large data centers: Amazon, Google, Yahoo, EBay, ...
  - RSS, feed/news readers, on-line stock trading and banking
  - Web 2.0, Second Life
- ...drive dramatic growth in scale
  - 10,000s of nodes, 1000s of topics, Internet-wide distribution
- Emerging systems address this trend using P2P techniques

Overlay-Based Pub/Sub
[Figure: overlay over N1 {B,C,D}, N2 {A,B,C,E}, N3 {A,D}, N4 {A,B,X}, N5 {A,X}; labels: (M1, A), Relay]
Example systems: SCRIBE, Corona, Feedtree, Sub-2-Sub, TERA, ...

Overlay Topologies for Pub/Sub
- “Good” overlay will allow for efficient and simple publication routing
  - Small routing tables, low load on relays, low latency
- Ideally, overlay is topic-connected: i.e., one connected component for each topic-induced sub-graph
  - Most existing implementations construct topic-connected overlays

Topic-Connectivity
- Topics B, C, X, E are connected
- Topics A and D are disconnected
[Figure: overlay over N1 {B,C,D}, N2 {A,B,C,E}, N3 {A,D}, N4 {A,B,X}, N5 {A,X}]
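The edges of the overlay in this figure are not recoverable from the transcript, so the following is only a minimal sketch of how topic-connectivity could be checked. The edge list `EDGES` is an assumption chosen to reproduce the slide's outcome (B, C, X, E connected; A and D disconnected), and `topic_connected` is a hypothetical helper name.

```python
from collections import deque

def topic_connected(interest, edges):
    """Map each topic to True if its topic-induced sub-graph is connected."""
    adjacency = {node: set() for node in interest}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    result = {}
    for topic in set().union(*interest.values()):
        members = {n for n, subs in interest.items() if topic in subs}
        start = next(iter(members))
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            # Only follow links whose other endpoint also subscribes to the topic.
            for nxt in adjacency[node] & members:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        result[topic] = seen == members
    return result

# Subscriptions taken from the slides.
INTEREST = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
# Assumed edges (not from the slides), picked to match the stated outcome.
EDGES = [("N1", "N2"), ("N2", "N4"), ("N4", "N5")]

status = topic_connected(INTEREST, EDGES)
print("connected:", sorted(t for t, ok in status.items() if ok))
print("disconnected:", sorted(t for t, ok in status.items() if not ok))
```

With these assumed edges, N3 has no link to any other subscriber of A or D, which is why both topics come out disconnected.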

Topic-Connectivity: Simple Solution
[Figure: overlay over N1 {B,C,D}, N2 {A,B,C,E}, N3 {A,D}, N4 {A,B,X}, N5 {A,X}]
- Node degree grows linearly with the subscription size
- Roughly twice as big as the average subscription size for rings/trees
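As a rough illustration, the sketch below assumes the simple solution connects the subscribers of each topic in their own ring, which is one reading of the "rings/trees" remark. The helper `ring_per_topic` and the alphabetical ring ordering are assumptions, and links shared by several topics are counted once.

```python
def ring_per_topic(interest):
    """Union of one ring per topic over that topic's subscribers."""
    edges = set()
    for topic in sorted(set().union(*interest.values())):
        members = sorted(n for n, subs in interest.items() if topic in subs)
        if len(members) < 2:
            continue  # a single subscriber needs no links
        # Consecutive pairs plus the wrap-around pair close the ring.
        for u, v in zip(members, members[1:] + members[:1]):
            edges.add(tuple(sorted((u, v))))
    return edges

# Subscriptions taken from the slides.
INTEREST = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}

ring_edges = ring_per_topic(INTEREST)
degree = {n: sum(n in e for e in ring_edges) for n in INTEREST}
average_degree = 2 * len(ring_edges) / len(INTEREST)
print(len(ring_edges), degree, average_degree)
```

With this alphabetical ordering the sketch yields 8 distinct links (average degree 3.2). Other ring orderings can share more links between topics and give slightly lower figures, so the exact number depends on how each ring is laid out.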

Scalability of the Simple Solution
- Negative impact on performance due to
  - CPU load: neighbor monitoring, message processing
  - Connection maintenance and header overhead
  - Memory overhead: per-link state associated with routing and/or compression schemes being used, etc.
=> Scalability barrier for large systems offering a wide range of subscription choices
Can we do better?

The Min-TCO Problem
- Minimum Topic-Connected Overlay (Min-TCO) problem:
  - For a set of nodes V, set of topics T, and Interest: V × T → {true, false}
  - Construct a topic-connected overlay G with the minimum possible number of edges (or average degree)
- TCO (decision version):
  - Decide whether there is a topic-connected overlay consisting of k edges (for a given k)
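To make the two formulations concrete, here is a brute-force sketch that enumerates edge subsets on the five-node example from the slides. It is exponential in the number of node pairs and only meant for tiny instances; the function names `is_topic_connected`, `tco`, and `min_tco` are hypothetical.

```python
from itertools import combinations

def is_topic_connected(interest, edges):
    """True if every topic-induced sub-graph of the overlay is connected."""
    for topic in set().union(*interest.values()):
        members = {n for n, subs in interest.items() if topic in subs}
        reached = {next(iter(members))}
        grew = True
        while grew:
            grew = False
            for u, v in edges:
                # Extend the reached set across links between two subscribers.
                if u in members and v in members and (u in reached) != (v in reached):
                    reached |= {u, v}
                    grew = True
        if reached != members:
            return False
    return True

def tco(interest, k):
    """Decision version: is there a topic-connected overlay with k edges?"""
    candidate_edges = list(combinations(sorted(interest), 2))
    return any(is_topic_connected(interest, subset)
               for subset in combinations(candidate_edges, k))

def min_tco(interest):
    """Optimization version: smallest k for which tco(interest, k) holds."""
    max_edges = len(interest) * (len(interest) - 1) // 2
    for k in range(max_edges + 1):
        if tco(interest, k):
            return k

# Subscriptions taken from the slides.
INTEREST = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
print(min_tco(INTEREST))  # 5
```

Five edges are needed here because topics C and D each have exactly two subscribers (forcing N1-N2 and N1-N3), and topic A needs three further links among N2, N3, N4, N5.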

Complexity of TCO
Lemma: TCO(V, T, Interest, k) ∈ NP
Proof: Topic connectivity is verifiable in polynomial time
Lemma: TCO(V, T, Interest, k) is NP-hard
Proof:
1. Define an auxiliary problem, Single Node TCO (SN-TCO): decide whether there is a topic-connected overlay in which the degree of a single given node is ≤ d
2. Set Cover is polynomially reducible to SN-TCO
3. SN-TCO is polynomially reducible to TCO
Theorem: TCO is NP-complete
[Figure: example instance with N5 {B,C,D}, N2 {A,B}, N3 {A,D}; N4 and N1 hold {A,C} and {A,B,C,D}]

Approximating Min-TCO
- The idea: exploiting subscription overlaps
  - Connecting the nodes with overlapping interests improves connectivity of several topics at once
- Greedy Merge (GM) algorithm:
  - Start from a singleton connected component for each (v, t) ∈ V × T
  - At each iteration: add an edge that reduces the number of connected components for the biggest number of topics
  - Stop once there is a single connected component for each topic
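The steps above can be sketched as follows. This is a straightforward, unoptimized reading of the slide rather than the authors' implementation: component labels are kept per (node, topic) pair, ties between equally good edges are broken by alphabetical order (an assumption), and `greedy_merge` is a hypothetical name.

```python
from itertools import combinations

def greedy_merge(interest):
    """Greedy Merge sketch: repeatedly add the edge that merges the most
    per-topic connected components, until no edge merges any."""
    # Singleton component for each (node, topic) pair the node subscribes to.
    comp = {(n, t): n for n, subs in interest.items() for t in subs}
    candidates = list(combinations(sorted(interest), 2))
    overlay = []

    def gain(u, v):
        # Number of topics whose component count drops if (u, v) is added.
        return sum(1 for t in interest[u] & interest[v]
                   if comp[(u, t)] != comp[(v, t)])

    while True:
        best = max(candidates, key=lambda e: gain(*e))
        if gain(*best) == 0:
            break  # every topic already has a single component
        u, v = best
        for t in interest[u] & interest[v]:
            old, new = comp[(v, t)], comp[(u, t)]
            if old != new:
                # Relabel v's component for topic t with u's label.
                for n in interest:
                    if t in interest[n] and comp[(n, t)] == old:
                        comp[(n, t)] = new
        overlay.append(best)
    return overlay

# Subscriptions taken from the slides.
INTEREST = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
overlay = greedy_merge(INTEREST)
print(overlay, 2 * len(overlay) / len(INTEREST))
```

On the five-node example this sketch adds five links, for an average degree of 2. Which edge is picked among equally good candidates depends on the tie-breaking rule, so other valid runs may produce a different overlay.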

Greedy Merge
[Figure: overlay over N1 {B,C,D}, N2 {A,B,C,E}, N3 {A,D}, N4 {A,B,X}, N5 {A,X}, shown at six successive stages]
Number of connected components per topic at each stage:

Stage   A  B  C  D  X  E
1       4  3  2  2  2  1
2       4  2  1  2  2  1
3       3  1  1  2  2  1
4       2  1  1  2  1  1
5       2  1  1  1  1  1
6       1  1  1  1  1  1

=> Average degree of 2 vs. almost 3 for ring-per-topic!

GM Running Time
- O(|V|^4 · |T|)
  - At most |V|^2 iterations
  - At most |V|^2 edges inspected at each iteration
  - At most |T| steps to inspect an edge
- Can be optimized to run in O(|V|^2 · |T|)
  - For each e ∈ V × V, weight(e) = the number of connected components merged by e
  - At each iteration, output the heaviest edge and adjust the other edge weights accordingly
  - Stop once there are no more edges with weight > 0
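The unoptimized bound is simply the product of the three factors listed on the slide. Written out, with the grouping labels added here for illustration:

```latex
\[
\underbrace{|V|^2}_{\text{iterations}}
\cdot
\underbrace{|V|^2}_{\text{edges inspected per iteration}}
\cdot
\underbrace{|T|}_{\text{steps per edge}}
\;=\; |V|^4 \cdot |T|
\]
```

The first factor holds because each iteration adds one new edge and an overlay on $|V|$ nodes has at most $|V|(|V|-1)/2 \le |V|^2$ edges.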

Approximability Results
Lemma:
1. The number of edges in the overlay constructed by GM ≤ log(|V| · |T|) · OPT
   Proof: Similar to that of the approximation ratio of the greedy algorithm for Set Cover
2. There exists an input on which GM’s output meets this ratio
Theorem: No polynomial-time algorithm can approximate Min-TCO within a constant factor (unless P=NP)
Proof: Existence of such an algorithm would imply existence of a constant-factor approximation for Set Cover, which is known to be impossible (unless P=NP)

Practical Benefits

More Overlay Design Problems
- Filtering: Given an upper bound d on the node degree, minimize the number of relays used to connect each topic
  - Captures the cases when full topic-connectivity is infeasible because of resource constraints
- Diameter: Given an upper bound d on the node degree, minimize the diameter of each topic in the overlay
  - Latency-optimal routing under resource constraints
- ...
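The quantities these problems trade off can be measured on a given overlay. The sketch below assumes that "the diameter of a topic" means the diameter of the topic-induced sub-graph (no relays), which is one possible reading of the slide. The overlay `EDGES` is an illustrative assumption and `topic_diameters` is a hypothetical helper.

```python
from collections import deque

def topic_diameters(interest, edges):
    """Diameter of each topic-induced sub-graph, or None if disconnected."""
    adjacency = {n: set() for n in interest}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    diameters = {}
    for topic in sorted(set().union(*interest.values())):
        members = {n for n, subs in interest.items() if topic in subs}
        worst = 0
        for source in members:
            # Breadth-first search restricted to subscribers of this topic.
            dist = {source: 0}
            queue = deque([source])
            while queue:
                node = queue.popleft()
                for nxt in adjacency[node] & members:
                    if nxt not in dist:
                        dist[nxt] = dist[node] + 1
                        queue.append(nxt)
            if len(dist) < len(members):
                worst = None  # some subscriber is unreachable
                break
            worst = max(worst, max(dist.values()))
        diameters[topic] = worst
    return diameters

# Subscriptions taken from the slides.
INTEREST = {
    "N1": {"B", "C", "D"},
    "N2": {"A", "B", "C", "E"},
    "N3": {"A", "D"},
    "N4": {"A", "B", "X"},
    "N5": {"A", "X"},
}
# Assumed five-edge topic-connected overlay, used only for illustration.
EDGES = [("N1", "N2"), ("N2", "N4"), ("N4", "N5"), ("N1", "N3"), ("N2", "N3")]

diameters = topic_diameters(INTEREST, EDGES)
max_degree = max(sum(n in e for e in EDGES) for n in INTEREST)
print(diameters, max_degree)
```

For this assumed overlay, topic A is a four-node path (diameter 3) and the maximum node degree is 3, which shows the tension between a degree bound d and per-topic latency.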

Conclusions
- Initiated formal study of the problem of designing efficient and scalable overlay topologies for pub/sub
- Defined a representative problem (Min-TCO) capturing the cost of constructing topic-connected overlays
  - NP-Completeness, polynomial approximation, inapproximability results
- Empirical evaluation showed effectiveness of our approximation algorithm on practical inputs

Future Directions
- Study dynamic case
- Investigate other overlay design problems
- Study distributed case
  - Partial knowledge of other node interest
  - Dynamically changing interest assignments

Thank You!