PhD Candidate: Alex K. Y. Cheung Supervisor: Hans-Arno Jacobsen PhD Thesis Presentation University of Toronto March 28, 2011 MIDDLEWARE SYSTEMS RESEARCH.

Presentation transcript:

Resource Allocation Algorithms for Event-Based Enterprise Systems
PhD Thesis Presentation, University of Toronto, March 28, 2011
PhD Candidate: Alex K. Y. Cheung
Supervisor: Hans-Arno Jacobsen
Middleware Systems Research Group

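PhD Thesis Presentation, Alex Cheung © 2011
Introduction to Distributed Content-based Publish/Subscribe
[Figure: broker overlay connecting one publisher and two subscribers, showing the advertisement path, subscription path, and publication path; brokers multicast publications to matching subscribers.]
Advertisement: brand = ‘Honda’, cashback >= $0
Publication (publisher): brand = ‘Honda’, cashback = $6000
Subscription (subscriber): brand = ‘Honda’, cashback > $2000
Subscription (subscriber): brand = ‘Honda’, cashback > $4000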
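The matching step in the example above can be sketched in a few lines. This is a minimal illustration, assuming a subscription is a list of (attribute, operator, value) predicates and a publication is a dictionary of attribute values; the names `matches`, `OPS`, `sub_a`, and `sub_b` are hypothetical and are not taken from PADRES.

```python
import operator

# Comparison operators assumed for this sketch (not an exhaustive language).
OPS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(subscription, publication):
    """Return True if every predicate of the subscription holds for the publication."""
    for attribute, op, value in subscription:
        if attribute not in publication:
            return False
        if not OPS[op](publication[attribute], value):
            return False
    return True


# The Honda example from the slide.
publication = {"brand": "Honda", "cashback": 6000}
sub_a = [("brand", "=", "Honda"), ("cashback", ">", 2000)]
sub_b = [("brand", "=", "Honda"), ("cashback", ">", 4000)]

# A broker would forward the publication only toward matching subscriptions.
matching = [name for name, sub in (("sub_a", sub_a), ("sub_b", sub_b))
            if matches(sub, publication)]
```

Here both subscriptions match, so the publication is delivered to both subscribers; a publication with cashback = $3000 would match only the first.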

Desirable Properties of Distributed Content-based Publish/Subscribe
Decoupling of data sources and sinks
    - Ease of component addition and removal
Flexible routing based on message content
    - Efficient use of network resources
Distributed broker overlay network
    - Scalable
    - Fault tolerant

Applications of Publish/Subscribe
Network and systems monitoring [Mukherjee 1994]
Business activity monitoring [Fawcett et al. 1999]
Business process execution [Schuler et al. 2001]
Workflow management [Cugola et al. 2001]
Multiplayer online games [Bharambe et al. 2002]
RSS filtering [Petrovic et al. 2005; Rose et al. 2007]
Automated service composition [Hu et al. 2008]
Resource discovery [Yan et al. 2009]

Real Deployments of Distributed Publish/Subscribe
GooPS
  ▫ Google’s pub/sub messaging middleware to integrate web applications (such as Gmail, Google Docs, Google Calendar) on a world-wide scale supporting millions of users
  ▫ Hundreds of brokers with tens of thousands of pub/sub clients
Yahoo Message Broker
  ▫ Yahoo’s pub/sub middleware to integrate applications with their database system, PNUTS
SuperMontage
  ▫ Tibco’s pub/sub distribution network for Nasdaq’s quote and order-processing system
GDSN (Global Data Synchronization Network)
  ▫ A global pub/sub network that allows retailers and suppliers (e.g., Walmart, Target, Metro) to exchange timely and accurate supply chain data

Contributions
Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)

Problem
Brokers located in different geographical areas may suffer from uneven load distribution due to
  ▫ Heterogeneous servers
  ▫ Network congestion
  ▫ Different densities and interests of end-users
Consequences
  ▫ Overloaded brokers introduce high delivery delays and may ultimately crash from running out of memory
  ▫ The system does not scale with the added resources

Visualizing the Problem
[Figure: broker overlay with subscribers (S) and publishers (P).]

Overview of Load Balancing Approach
[Figure: broker overlay with publishers (P) and subscribers (S), illustrating local load balancing and global load balancing between an offloading broker and a load-accepting broker.]

Evaluation
Implemented on a real open source pub/sub system called PADRES
PlanetLab and a cluster testbed
Local and global load balancing
Homogeneous and heterogeneous servers
Compared against a naive approach
[Figure: Global LB Setup, with brokers B10 to B12, B20 to B22, B30 to B32, B40 to B42, B50 to B52, and B60 to B62, plus publishers (P) and subscribers (S).]

Summary
Load balancing enables the pub/sub system to scale with the number of resources
Load balancing solutions that are unaware of subscription load and relationships are ineffective
  ▫ Long response time
  ▫ Unstable system

Contributions
Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)

Problem
Publishers can join anywhere in the overlay, or at the closest broker
Consequences
  ▫ High delivery delay
    - Sluggish system
  ▫ High resource usage in terms of matching, network bandwidth, and subscription storage
    - High IT costs
[Figure: overlay with publishers (P) and subscribers (S).]

Approach
Adaptively move publisher to area of matching subscribers
Two unique solutions
  ▫ POP (Publisher Optimistic Placement)
    - Decision is based on the average number of downstream publication deliveries
  ▫ GRAPE (Greedy Relocation Algorithm for Publishers of Events)
    - Decision is based on the end-to-end delivery delay, total broker message rate, and user-specified inputs including the minimization metric (load/delivery delay) and weight
[Figure: overlay with subscribers (S) and publishers (P).]
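To make the role of the minimization metric and weight concrete, here is a simplified sketch of a weighted broker choice. It is not the published GRAPE algorithm: the scoring formula, the normalization by the maximum value, and the names `pick_broker` and `candidates` are assumptions made purely for illustration.

```python
def pick_broker(candidates, metric="delay", weight=1.0):
    """Pick the candidate broker with the lowest weighted score.

    candidates: dict mapping broker name -> (delivery_delay, message_rate)
    metric:     "delay" or "load", the quantity the user wants to minimize
    weight:     value in [0, 1]; 1.0 considers only the chosen metric
    The scoring rule below is an assumption for illustration only.
    """
    if metric not in ("delay", "load"):
        raise ValueError("metric must be 'delay' or 'load'")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")

    # Normalize each quantity by its maximum so the two are comparable.
    max_delay = max(delay for delay, _ in candidates.values()) or 1.0
    max_rate = max(rate for _, rate in candidates.values()) or 1.0

    def score(item):
        delay, rate = item[1]
        norm_delay, norm_rate = delay / max_delay, rate / max_rate
        if metric == "delay":
            primary, secondary = norm_delay, norm_rate
        else:
            primary, secondary = norm_rate, norm_delay
        return weight * primary + (1.0 - weight) * secondary

    return min(candidates.items(), key=score)[0]


# Hypothetical measurements: (end-to-end delay in ms, total message rate in msg/s).
candidates = {"B1": (120.0, 300.0), "B2": (40.0, 900.0), "B3": (80.0, 500.0)}
best_for_delay = pick_broker(candidates, metric="delay", weight=1.0)
best_for_load = pick_broker(candidates, metric="load", weight=1.0)
```

With a weight of 1.0 the choice follows the selected metric alone; lowering the weight lets the other metric influence the decision.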

Evaluation
Implemented on the open source pub/sub system called PADRES
PlanetLab and a cluster testbed
Enterprise and random workloads
Results
  ▫ Reduced delivery delay by up to 68%
  ▫ Reduced message rate by up to 85%

Summary
POP is suitable for pub/sub systems that strive for simplicity, such as GooPS
GRAPE is suitable for systems that strive to minimize in the extremes, such as system load in sensor networks or delivery delay in SuperMontage

Contributions
Load Balancing in Content-based Publish/Subscribe Systems (ACM TOCS’10)
Publisher Placement Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’10)
Green Resource Allocation Algorithms in Content-based Publish/Subscribe (IEEE ICDCS’11)

Problem
What is the deployment strategy for the broker overlay, publisher assignment, and subscriber assignment to minimize the broker message rate and number of allocated brokers?
Proven to be an NP-complete problem
Benefits
  ▫ Increase capacity of the system
  ▫ More efficient energy usage of the allocated servers
  ▫ Fewer servers mean lower investment and maintenance costs
  ▫ In line with Green IT, which enterprises such as Google and Yahoo are also currently engaged in

Approach
3-phase design
  Phase 1: Record the publications delivered to each subscription into bit vectors
  Phase 2: Use information from the bit vectors to allocate subscriptions to brokers using one of 10 algorithms
  Phase 3: Construct the broker overlay with 3 optimization techniques and deploy the new configuration
Most compelling properties
  ▫ Language independent
    - Content-based (XPath, regex, ranged, SQL, composite subscriptions, etc.) and topic-based, such as GooPS
  ▫ Works effectively under any workload (defined or undefined)

Phase 1: Subscription Profiling
[Figure: a subscription profile consisting of the message ID of the first index and a bit vector; a 1 marks a publication delivered to the subscription. Message IDs shown: B34-M213 B34-M215 B34-M216 B34-M217 B34-M220 B34-M222 B34-M225 B34-M226 B34-M]
Profile of each subscriber per advertisement maintained at the subscriber’s first broker
Cardinality of bit vector corresponds to bandwidth requirement of the subscription
Used to compute “closeness” between any two subscriptions in the clustering algorithm: closeness = |s_i ∩ s_j|
Fixed size, so shift left if next publication is out of bit vector range
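The profiling idea can be sketched as a fixed-size sliding bit vector. This is a minimal sketch under stated assumptions: message IDs are plain integer sequence numbers (the slide uses IDs such as B34-M213), a Python integer stands in for the bit vector, and the names `SubscriptionProfile` and `closeness` are hypothetical rather than taken from the thesis.

```python
class SubscriptionProfile:
    """Fixed-size record of which publications reached one subscription."""

    def __init__(self, size=64):
        self.size = size        # number of message IDs the window covers
        self.first_id = None    # message ID represented by index 0
        self.bits = 0           # bit i set => message first_id + i was delivered

    def record(self, msg_id):
        """Mark msg_id as delivered, sliding the window forward if needed."""
        if self.first_id is None:
            self.first_id = msg_id
        offset = msg_id - self.first_id
        if offset < 0:
            return  # older than the window start; ignored in this sketch
        if offset >= self.size:
            # Slide the window so msg_id lands on the last index. The slide's
            # "shift left" is ">>" here because index 0 is the lowest bit.
            shift = offset - self.size + 1
            self.bits >>= shift
            self.first_id += shift
            offset = self.size - 1
        self.bits |= 1 << offset

    def cardinality(self):
        """Number of recorded deliveries, a proxy for bandwidth requirement."""
        return bin(self.bits).count("1")


def closeness(a, b):
    """closeness = |s_i ∩ s_j|, counted over the overlapping part of both windows."""
    if a.first_id is None or b.first_id is None:
        return 0
    base = max(a.first_id, b.first_id)
    a_bits = a.bits >> (base - a.first_id)
    b_bits = b.bits >> (base - b.first_id)
    return bin(a_bits & b_bits).count("1")


s_i = SubscriptionProfile(size=8)
for msg in (213, 215, 216, 217, 220, 222):
    s_i.record(msg)

s_j = SubscriptionProfile(size=8)
for msg in (215, 216, 220, 221):
    s_j.record(msg)
```

Because each profile stores only delivered message IDs, the comparison never inspects the subscription language itself, which is what makes the approach language independent.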

Phase 2: Subscription Allocation Algorithms
MANUAL/(AUTOMATIC)
  ▫ Tree with fanout of 2, manual (random) placement of clients
Fastest Broker First (FBF)
  ▫ Assign subscriptions randomly to the next most powerful broker
Bin Packing
  ▫ Like FBF, but assigns the next highest traffic subscription
PAIRWISE-N, PAIRWISE-K (related approaches in ICDCS’02)
  ▫ Subscription clustering where the number of clusters is given
CRAM (Clustering with Resource Awareness and Minimization)
  ▫ Dynamically determines the number of clusters
  ▫ Utilizes a new clustering algorithm that is more effective
  ▫ Evaluated with 4 different subscription closeness metrics, with one derived from Banavar et al. in ICDCS ’99
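One plausible reading of the Bin Packing description is sketched below: brokers are filled in order of decreasing capacity, and subscriptions are considered in order of decreasing traffic. The capacity model (a single number per broker), the skip-if-it-does-not-fit rule, and the name `bin_packing` are assumptions for illustration, not the thesis implementation.

```python
def bin_packing(subscriptions, broker_capacities):
    """Allocate subscriptions to as few brokers as this greedy pass manages.

    subscriptions:     dict mapping subscription name -> traffic requirement
    broker_capacities: dict mapping broker name -> capacity in the same unit
    Returns a dict mapping broker name -> list of subscription names.
    """
    # Most powerful broker first, highest-traffic subscription first.
    brokers = sorted(broker_capacities.items(), key=lambda kv: kv[1], reverse=True)
    pending = sorted(subscriptions.items(), key=lambda kv: kv[1], reverse=True)

    allocation = {}
    for broker, capacity in brokers:
        if not pending:
            break  # remaining brokers stay unallocated
        remaining = capacity
        leftover = []
        for sub, traffic in pending:
            if traffic <= remaining:
                allocation.setdefault(broker, []).append(sub)
                remaining -= traffic
            else:
                leftover.append((sub, traffic))
        pending = leftover

    if pending:
        raise ValueError("insufficient broker capacity for all subscriptions")
    return allocation


# Hypothetical traffic requirements and capacities in msg/s.
subs = {"s1": 50, "s2": 30, "s3": 20, "s4": 10}
capacities = {"B1": 60, "B2": 100, "B3": 40}
result = bin_packing(subs, capacities)
```

In this example the largest broker absorbs three subscriptions, the second takes the last one, and the third broker is never allocated, which is the effect the allocation phase aims for.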

Bin Packing
[Figure: subscribers (S) before allocation.]

Bin Packing’s Allocation Result
[Figure: subscribers (S) allocated to brokers.]

Phase 3: Broker Overlay Construction
[Figure: broker overlay under construction, with subscribers (S).]

Bin Packing’s Final Overlay
[Figure: final overlay with subscribers (S) and publishers (P), annotated with GRAPE.]

Evaluation
Implemented on the PADRES open source content-based pub/sub project
Evaluated on a cluster testbed with 80 brokers
Evaluated on SciNet, an HPC system, with 1000 brokers
Comparison against two related works (Riabov et al. ICDCS’02, Banavar et al. ICDCS’99)
Homogeneous and heterogeneous scenarios
Workload saturates the initial deployment (MANUAL)

Evaluation Results on SciNet
  ▫ Reduced message rate by up to 92%
  ▫ Reduced number of allocated brokers by up to 91%

Summary
CRAM combines the benefits of
  ▫ Subscription clustering
  ▫ Resource awareness from Bin Packing
by simultaneously reducing both
  ▫ Broker message rates
  ▫ Number of allocated brokers
Bit vectors are powerful
  ▫ Language independent (XPath, regex, topics)
  ▫ Effective with any workload distribution

Conclusions
Load balancing increases
  ▫ Availability by circumventing overloads
  ▫ Scalability of the system
Publisher placement algorithms reduce
  ▫ Broker input load by up to 68%
  ▫ Broker message rate by up to 85%
  ▫ Delivery delay by up to 68%
Resource allocation algorithms reduce
  ▫ Average broker message rate by up to 92%
  ▫ Number of allocated brokers by up to 91%

Future Work
Self-tuning of load balancing parameters
React dynamically by growing and shrinking the network in incremental steps
Improve runtime of the CRAM algorithm by parallelization or reducing its computational complexity
Model workload with more sophisticated methods, such as stochastic processes, to improve accuracy of load estimation
Address fault resiliency in each approach

Q & A