
MIDDLEWARE SYSTEMS RESEARCH GROUP
Scaling Construction of Low Fan-out Overlays for Topic-based Publish/Subscribe Systems
Chen Chen (1), joint work with Roman Vitenberg (3) and Hans-Arno Jacobsen (1,2)
(1) Department of Electrical and Computer Engineering, University of Toronto
(2) Department of Computer Science, University of Toronto
(3) Department of Informatics, University of Oslo
ICDCS 2011

Example: pub/sub
[Figure: subscribers labeled "Interests: IBM" and "Interests: Microsoft"]

Pub/Sub
- A communication paradigm
  - Subscribers express their interests
  - Publishers disseminate messages
- Many applications and industry standards
  - Application integration, financial data dissemination, RSS feed distribution, business process management
  - WS Notifications, WS Eventing, OMG's Real-time Data Dissemination Service
- Topic-based pub/sub
  - TIBCO RV
  - Google's GooPS

Two directions for pub/sub
- Design of routing protocols: designing protocols so that publications and subscriptions are sent as efficiently as possible across the overlay network (G. Li et al., ICDCS'08; M. Castro et al., JSAC'02)
- Construction of the overlay: constructing the overlay topology such that network traffic is minimized (Chockler et al., PODC'07; Onus et al., INFOCOM'09)

Desirable properties for overlays
- Low average node degree
- Low maximum node degree
- Low diameter
- Topic-connectivity
- Efficiency to construct
- Adaptability to churn
- Ease of distributed implementation
[Figure: example overlay over nodes v1 to v5 with subscriptions drawn from topics {a, b, c, d}]

Our contributions
Previous greedy algorithm:
- High runtime cost
- Full knowledge requirement
- Centralized operation (difficult to decentralize)
Our divide-and-conquer algorithm:
- Low runtime cost
- Partial knowledge requirement
- Centralized operation (easy to decentralize)

Topic-connected overlay (TCO)
[Figure: an overlay G over nodes v1 to v5, shown with two of its per-topic suboverlays]
- Suboverlay G_a is topic-connected
- Suboverlay G_b is NOT topic-connected
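The property shown on this slide can be checked mechanically. The sketch below assumes the usual definition of topic-connectivity (for every topic, the nodes subscribed to it induce a connected subgraph of the overlay). The subscriptions follow the five-node example as far as it can be read from the transcript; the edge list and all function names are hypothetical and serve only as illustration.

```python
from collections import defaultdict

def is_topic_connected(interests, edges, topic):
    """True if the subscribers of `topic` induce a connected subgraph."""
    subscribers = {v for v, topics in interests.items() if topic in topics}
    if len(subscribers) <= 1:
        return True
    # Adjacency restricted to edges whose endpoints both subscribe to `topic`.
    adj = defaultdict(set)
    for u, v in edges:
        if u in subscribers and v in subscribers:
            adj[u].add(v)
            adj[v].add(u)
    # Depth-first search from an arbitrary subscriber.
    start = next(iter(subscribers))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == subscribers

def is_tco(interests, edges):
    """True if the overlay is topic-connected for every topic."""
    topics = set().union(*interests.values())
    return all(is_topic_connected(interests, edges, t) for t in topics)

def max_degree(interests, edges):
    """Maximum node degree, the quantity that MinMax-TCO minimizes."""
    degree = {v: 0 for v in interests}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree.values())

# Subscriptions as read from the slide's example.
INTERESTS = {
    "v1": {"b", "c", "d"},
    "v2": {"a"},
    "v3": {"b", "d"},
    "v4": {"a", "b"},
    "v5": {"a", "c"},
}
# Hypothetical edge list (the slide's actual edges are not recoverable).
EDGES = [("v1", "v3"), ("v1", "v5"), ("v2", "v5"), ("v4", "v5"), ("v3", "v4")]
```

With this hypothetical edge list the overlay is topic-connected. Dropping the edge (v3, v4) isolates v4 among the subscribers of topic b, so the suboverlay for b is no longer connected while the one for a still is.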

MinMax-TCO
[Figure: two overlays over the same nodes v1 to v5; in one, v5 has 3 edges; in the other, v1 has 4 edges]

MinMax-TCO problem and GM-M algorithm [Onus, 2009]
- Minimum Maximum Degree Topic-Connected Overlay (MinMax-TCO) problem
  - Given a set of nodes V, a set of topics T, and Interest: V × T → {true, false}, construct a topic-connected overlay G with minimum maximum degree.
- Theorem: MinMax-TCO is NP-complete
- GM-M algorithm (MinMax-ODA)
  - always greedily adds an edge which 1) has the largest edge contribution, and 2) increases the maximum node degree minimally
  - logarithmic approximation ratio
  - time complexity [expression missing in the transcript]
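The greedy rule on this slide can be illustrated with a naive sketch. It is not the authors' implementation. Two assumptions are made: the "contribution" of an edge is taken to be the number of topics for which the edge merges two previously disconnected groups of subscribers, and the two criteria are ordered so that the smallest resulting maximum degree is preferred first, with the largest contribution breaking ties. The function names and the brute-force scan over all node pairs are illustrative choices.

```python
from itertools import combinations

def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def greedy_minmax_tco(interests):
    """Return a list of edges forming a topic-connected overlay (naive sketch)."""
    nodes = sorted(interests)
    topics = sorted(set().union(*interests.values()))
    # One union-find forest per topic, over that topic's subscribers.
    parent = {t: {v: v for v in nodes if t in interests[v]} for t in topics}
    # Total number of component merges still needed across all topics.
    remaining = sum(len(p) - 1 for p in parent.values() if p)
    degree = {v: 0 for v in nodes}
    edges = []
    while remaining > 0:
        current_max = max(degree.values())
        best, best_key = None, None
        for u, v in combinations(nodes, 2):
            shared = interests[u] & interests[v]
            contrib = sum(
                1 for t in shared
                if _find(parent[t], u) != _find(parent[t], v)
            )
            if contrib == 0:
                continue  # the edge would not connect anything new
            new_max = max(current_max, degree[u] + 1, degree[v] + 1)
            key = (new_max, -contrib)  # assumed ordering of the two criteria
            if best_key is None or key < best_key:
                best, best_key = (u, v), key
        u, v = best
        for t in interests[u] & interests[v]:
            ru, rv = _find(parent[t], u), _find(parent[t], v)
            if ru != rv:
                parent[t][ru] = rv
                remaining -= 1
        degree[u] += 1
        degree[v] += 1
        edges.append((u, v))
    return edges

# Subscriptions as read from the slides' five-node example.
INTERESTS = {
    "v1": {"b", "c", "d"},
    "v2": {"a"},
    "v3": {"b", "d"},
    "v4": {"a", "b"},
    "v5": {"a", "c"},
}
```

Every iteration rescans all node pairs, which keeps the sketch short but makes it slow on large inputs; it is meant only to show the selection rule.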

Why divide-and-conquer
- GM-M's runtime cost is expensive
  - time complexity [expression missing in the transcript]
  - 487 minutes for |V| = 1000, |T| = 100, uniform distribution*
- The number of nodes is the dominant factor
- To improve running time
  - Reduce the size of the node set
  - Divide-and-conquer based on the node set V
* under the uniform distribution, each topic has an equal probability for all nodes that may be interested in that topic

Divide-and-conquer (DC)
- Divide the overlay based on V
- Conquer each sub-TCO with GM-M
- Combine via cross-TCO links
[Figure: example with 15 nodes, v0 to v14, with subscriptions drawn from topics {a, b, c, d}]

Challenges for divide
Divide the MinMax-TCO problem into several sub-overlay construction problems.
- Node clustering: nodes with similar interests are placed together
  - High runtime cost
  - Not trivial to decentralize
  - Outputs with varying sizes
- Random partitioning: each node flips a coin and gets assigned to one of the partitions
  - Fast
  - Easy to tune
  - Straightforward to decentralize
- However,
  - Random partitioning may lose the correlation among nodes due to randomness
  - Maximum node degree is very sensitive to random partitioning
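Random partitioning as described here can be sketched in a few lines. The function name, the uniform choice among p partitions, and the optional seed are assumptions made for illustration; the slides do not specify how the coin flip is implemented.

```python
import random

def random_partition(nodes, p, seed=None):
    """Assign each node independently and uniformly to one of p partitions."""
    rng = random.Random(seed)  # seeded generator so runs can be reproduced
    partitions = [[] for _ in range(p)]
    for v in nodes:
        partitions[rng.randrange(p)].append(v)
    return partitions
```

Because every node decides independently, the step needs no coordination, which is why it is straightforward to decentralize. The price is that partition sizes vary and some partitions may even be empty.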

Bad case for random partitioning
[Figure: a 15-node example (v_all, v_a1, v_a2, v_b1 to v_b4, v_1 to v_8) with nested subscriptions: {t1, ..., t8}; {t1, ..., t4} and {t5, ..., t8}; {t1, t2}, {t3, t4}, {t5, t6}, {t7, t8}; and the singletons {t1} to {t8}]
- Random partitioning may increase the degrees of individual nodes by a factor of [value missing in the transcript]

Poor performance of DC for MinMax-TCO
[Figure not preserved in the transcript]

Pub/sub workloads
- The number of nodes |V|: from 1000 to 8000
- The number of topics |T|: from 100 to 1000
- The subscription size: from 50 to 150 on average
- Topic popularity
  - Uniform: [Chockler, 2007]
  - Zipf: feed popularity distribution in RSS [Liu, 2005]
  - Exponential: stock popularity in NYSE [Tock, 2005]

Learn from workloads
- Observations
  - Increased maximum node degree occurs when a node subscribes to a large number of topics
  - "Pareto 80-20" rule:
    - most nodes subscribe to a relatively small number of topics
    - only a relatively small number of nodes might be interested in a large number of topics
- Basic idea: special treatment for those nodes interested in many topics

Bulk nodes
- Given (V, T, Int), the bulk node set B is a subset of V such that [condition missing in the transcript], where T_v is the set of topics subscribed to by node v and η is the bulk subscriber threshold
- The lightweight node set is L = V - B
- The bulk subscriber threshold η can be determined based on historical results
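The split into bulk and lightweight nodes can be sketched as follows. The defining condition for B is not preserved in the transcript, so the comparison used here, |T_v| >= η, is an assumption (the original may use a strict inequality or a differently normalized threshold). The function name and the sample data are hypothetical.

```python
def split_bulk_lightweight(interests, eta):
    """Split nodes into bulk set B and lightweight set L = V - B."""
    # ASSUMPTION: a node counts as bulk when it subscribes to at least eta topics.
    bulk = {v for v, topics in interests.items() if len(topics) >= eta}
    lightweight = set(interests) - bulk
    return bulk, lightweight

# Hypothetical workload: one heavy subscriber among light ones.
sample = {
    "v0": {"a", "b", "c", "d", "e", "f"},
    "v1": {"a", "c"},
    "v2": {"b"},
    "v3": {"d", "e", "f"},
}
bulk, lightweight = split_bulk_lightweight(sample, eta=5)
```

With the hypothetical threshold of 5, only v0 lands in the bulk set and the other three nodes are lightweight.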

Challenges for combine
- Combine multiple sub-TCOs into one by adding cross-TCO links as bridges
- Not all nodes need to participate
- How to select node subsets for cross-TCO links?
  - small: increasing node degrees
  - large: degrading time efficiency

Representative set
- Given a TCO (V, T, Int, E), a representative set (rep set) is a subset of V that covers all of V's topics λ times.
[Figure: a topic-connected overlay over nodes v1 to v5 with subscriptions drawn from topics {a, b, c, d}]
- {v3, v5} is a 1-rep set, which covers all topics {a, b, c, d}
- {v1, v2, v3, v5} is a 2-rep set; {a, b, c, d} is covered twice

Representative nodes
- Representative nodes (rep-nodes)
  - Represent the interests of all the nodes
  - Can function as bridges to determine cross-TCO links
  - Coverage factor λ: tunes the size of the rep set
- Observation: for typical pub/sub workloads and sufficiently large partitions, minimal rep sets tend to be several times smaller than the total number of nodes.
- How to find a minimal rep set R_λ for (V, T, Int)?
  - Linearly reducible to the classic set cover problem: NP-complete
  - Greedy algorithm: always add a node with the largest number of topics that are not yet λ-covered
    - logarithmic approximation ratio
    - can be implemented efficiently
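The greedy rule for rep sets can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, ties are broken by node name, and a topic with fewer than λ subscribers is treated as covered once all of its subscribers are chosen. The subscriptions are those of the five-node example as read from the slides.

```python
def greedy_rep_set(interests, lam=1):
    """Greedily pick nodes until every topic is covered lam times."""
    # Count subscribers per topic so the requirement can be capped.
    subscribers = {}
    for topics in interests.values():
        for t in topics:
            subscribers[t] = subscribers.get(t, 0) + 1
    # ASSUMPTION: a topic with fewer than lam subscribers needs all of them.
    need = {t: min(lam, n) for t, n in subscribers.items()}
    candidates = sorted(interests)  # deterministic tie-breaking by name
    rep = []
    while any(need.values()):
        # Pick the node covering the most topics that are not yet lam-covered.
        best = max(
            candidates,
            key=lambda v: sum(1 for t in interests[v] if need[t] > 0),
        )
        rep.append(best)
        candidates.remove(best)
        for t in interests[best]:
            if need[t] > 0:
                need[t] -= 1
    return rep

INTERESTS = {
    "v1": {"b", "c", "d"},
    "v2": {"a"},
    "v3": {"b", "d"},
    "v4": {"a", "b"},
    "v5": {"a", "c"},
}
rep1 = greedy_rep_set(INTERESTS, lam=1)
rep2 = greedy_rep_set(INTERESTS, lam=2)
```

For λ = 2 the sketch selects v1, v2, v3 and v5, the same 2-rep set named on the representative set slide. For λ = 1 it returns a different but equally small 1-rep set than the slide's {v3, v5}, since several two-node covers exist.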

Divide-and-Conquer with Bulk and Lightweight Rep-nodes (DCBR-M)
[Figure: a 21-node example, v0 to v20, with subscriptions drawn from topics {a, ..., h}; the mapping of nodes to subscriptions is not recoverable from the transcript]

Design of DCBR-M algorithm
- Different parameters for tuning the algorithm:
  - The bulk subscriber threshold η (divide, combine): bulk nodes vs. lightweight nodes
  - The coverage factor λ (combine): time efficiency vs. the quality of the TCO
  - The number of lightweight partitions p (divide, conquer)
    - p = |L| (one node per partition): combine only
    - p = 1 (all nodes in one partition): conquer only
- How to decentralize DCBR-M
  - Nodes autonomously organize themselves into random partitions
  - Different partitions construct inner edges in parallel
  - Different partitions compute rep sets in parallel
  - Bulk nodes and rep-nodes communicate and compute outer edges

Theoretical analysis of DCBR-M
- Under a realistic assumption for typical pub/sub workloads, DCBR-M generates a TCO whose maximum node degree is asymptotically the same as that of the TCO output by GM-M.
- The running time of DCBR-M is [expression missing in the transcript]
- Considerable speedup when |B| and |R| are small

Evaluation for DCBR-M (1) to (3)
[Plots not preserved in the transcript]

Conclusion

Algorithm   Running time     Max degree   Avg degree   Required information   Potential to decentralize
RingPT      good             poor: 168    poor: 92     full knowledge         good
GM-M        poor: 487 min    good: 5      good: 3.88   full knowledge         poor
DCBR-M      good: 13.6 sec   good: 6      good: 4.29   partial knowledge      good

BACKUP

Related work
- Construction of the overlay
  - MinAvg-TCO, Chockler et al., PODC'2007
  - MinMax-TCO, Onus et al., INFOCOM'2009
  - Low-TCO, Onus et al., ICDCS'2010
  - DC for MinAvg-TCO, Chen et al., ICDCS'2010
- Design of routing protocols
  - G. Li et al., ICDCS'2008
  - M. Castro et al., JSAC'2002

Minimal Number of Links
- A typical pub/sub system combines a number of protocols, many of which maintain per-link state
  - A node must constantly monitor the availability of each of its neighbors (heartbeats and keep-alive state)
  - If the links are maintained using TCP, there is the cost of connection state for each link
  - The more links there are, the fewer topics can be routed over each individual link, thereby diminishing cross-topic aggregation benefits
  - If a sequential-diff-based compression scheme is used, there is an extra cost associated with a history table

DCBR-M vs DC: MinMax-TCO vs MinAvg-TCO
- Fundamentally different problems
  - Average node degree is a "global" property; maximum node degree possesses both "global" and "local" properties.
  - DC for MinAvg-TCO does not directly apply to MinMax-TCO.
  - MinMax-TCO is more sensitive to divide, conquer and combine.
  - Different algorithm design, theoretical analysis, and experiments.