Peter R Pietzuch and Sumeer Bhola IBM T J Watson Research Center Congestion Control in a Reliable Scalable.

Congestion Control in a Reliable Scalable Message-Oriented Middleware
Peter R Pietzuch and Sumeer Bhola
Peter.Pietzuch@cl.cam.ac.uk, sbhola@us.ibm.com
IBM T J Watson Research Center
Middleware’03, Rio de Janeiro, Brazil, June 2003

1 Message-Oriented Middleware
Scalability
–Asynchronous communication and loose synchronisation
–Publish/Subscribe communication with filtering
–Overlay network of message brokers
Reliability
–Guaranteed delivery semantics for messages
–Resend messages lost due to failure
Congestion
–Publication rate may be too high → not enough capacity
–Must guarantee stable behaviour of the system
–Usually done with over-provisioning of the system
[Figure: overlay network of message brokers (B)]
→ Congestion Control for Overlay Networks

2 The Congestion Control Problem
Characteristics of a MOM
–Large message buffers at brokers
–Burstiness due to application-level routing
–TCP CC only deals with inter-broker connections
[Figure: message brokers (B) with app-level queues, connected by the network]
Causes of Congestion
–Under-provisioned system
  –Network bandwidth (congestion at output queues)
  –Broker processing capacity (congestion at input queues)
–Additional resource requirement due to recovery

3 Outline
Message-Oriented Middleware
The Congestion Control Problem
Gryphon
–Congestion in Gryphon
Congestion Control Protocols
–Publisher-Driven Congestion Control
–Subscriber-Driven Congestion Control
Evaluation
–Experimental Results
Conclusion

4 The Gryphon MOM
IBM’s MOM with publish/subscribe
–Supports guaranteed in-order, exactly-once delivery
[Figure: brokers (PHB, IB, SHB) with attached publishers (P) and subscribers (S)]
Brokers can be
–Publisher-Hosting (PHB)
–Subscriber-Hosting (SHB)
–Intermediate (IB)
Clients connect to brokers
Publishers are aggregated to publishing endpoints (pubends)
–Ordered stream of messages; maintained in persistent storage
–NACKs for lost messages
–IBs cache stream data and satisfy NACKs

5 Congestion in Gryphon
Congestion due to recovery after link failure
–System never recovers from unstable state
[Figure: topology of PHB, IB, SHB1 and SHB2; plot of msgs (kb/s) with the link failure marked]
Requirements of CC in MOM
–Independent from particular MOM implementation
–No/little involvement of intermediate brokers
–Detect congestion before queue overflow occurs
–Ensure that recovering SHBs will eventually catch up

6 Congestion Control Protocols
PHB-Driven CC Protocol (PDCC)
–Feedback loop between pubends and downstream SHBs to monitor congestion
–Limit publication rate of new messages to prevent congestion
SHB-Driven CC Protocol (SDCC)
–Monitor rate of progress at a recovering SHB
–Limit rate of NACKs during recovery
[Figure: PHB and SHB]
1. Detect congestion in the system
–Change in throughput used as a congestion metric
–Reduction in throughput → queue build-up
2. Limit message rates to obtain stable behaviour

7 PHB-Driven Congestion Control
Downstream Congestion Query Msgs (DCQ)
–Trigger the congestion control mechanism
–Periodically sent down the dissemination tree by pubend
Upstream Congestion Alert Msgs (UCA)
–Indicate congestion in the system
–SHBs observe their message throughput and respond with a UCA msg when congested
–Cause pubend to reduce its publication rate
Properties
–DCQ/UCA msgs treated as high-priority by brokers
–Frequency of DCQ msg controls responsiveness of PDCC
–No UCA msgs flow in an uncongested system
–Similar to ATM ABR flow control
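The DCQ/UCA exchange at an SHB can be sketched roughly as follows. This is a minimal illustration, not Gryphon's implementation: the message fields, the `ShbMonitor` class, and the 0.9 threshold are all assumptions made for the example. It assumes each DCQ carries the pubend's stream position, so that the SHB can compare its own progress between two DCQs with the pubend's progress over the same interval.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DCQ:
    """Downstream Congestion Query (field names are hypothetical)."""
    pubend_id: str
    position: int  # pubend's stream position when the DCQ was sent


@dataclass
class UCA:
    """Upstream Congestion Alert (field names are hypothetical)."""
    pubend_id: str
    throughput_ratio: float


class ShbMonitor:
    """Decides, per DCQ, whether this SHB should answer with a UCA."""

    def __init__(self, threshold: float = 0.9):
        # Assumed threshold: below this ratio the SHB counts as congested.
        self.threshold = threshold
        self._last: Optional[Tuple[int, int]] = None  # (pubend pos, local pos)

    def on_dcq(self, dcq: DCQ, local_position: int) -> Optional[UCA]:
        last, self._last = self._last, (dcq.position, local_position)
        if last is None:
            return None  # two DCQs are needed to measure progress
        pubend_progress = dcq.position - last[0]
        if pubend_progress <= 0:
            return None  # nothing was published in this interval
        ratio = (local_position - last[1]) / pubend_progress
        if ratio < self.threshold:
            return UCA(dcq.pubend_id, ratio)  # congested: alert upstream
        return None  # uncongested: no UCA flows


monitor = ShbMonitor(threshold=0.9)
monitor.on_dcq(DCQ("pubend-1", 100), local_position=50)
print(monitor.on_dcq(DCQ("pubend-1", 200), local_position=150))  # keeps up
print(monitor.on_dcq(DCQ("pubend-1", 300), local_position=200))  # falls behind
```

Returning `None` in the uncongested case mirrors the property that no UCA msgs flow in an uncongested system.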

8 Processing of DCQ/UCA Msgs
Publisher-Hosting Brokers
–Hybrid additive/multiplicative increase/decrease scheme to change publication rate
–Attempt to find optimal operating point
Subscriber-Hosting Brokers
–Non-recovering brokers should receive msgs at the publication rate
–Recovering brokers should receive msgs at a higher rate
Intermediate Brokers
–Aggregate UCA msgs to prevent feedback explosion
  –Pass up UCA msg from worst-congested SHB
–Short-circuit first UCA msg for fast congestion notification
[Figure: PHB, IB and SHB]
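The slide does not spell out the hybrid scheme, so the sketch below shows only one plausible reading: additive increase while no UCA arrives, and multiplicative decrease on a UCA. The class name, the method names, and every constant are hypothetical and chosen for illustration.

```python
class PubendRateController:
    """Adjusts a pubend's publication rate from UCA feedback (illustrative)."""

    def __init__(self, rate: float, min_rate: float = 1.0,
                 max_rate: float = 1000.0, add_step: float = 10.0,
                 decrease_factor: float = 0.5):
        self.rate = rate
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.add_step = add_step                # assumed additive increment
        self.decrease_factor = decrease_factor  # assumed multiplicative cut

    def on_quiet_interval(self) -> float:
        """A DCQ interval passed without any UCA: probe for more capacity."""
        self.rate = min(self.max_rate, self.rate + self.add_step)
        return self.rate

    def on_uca(self) -> float:
        """A UCA arrived: back off quickly to let queues drain."""
        self.rate = max(self.min_rate, self.rate * self.decrease_factor)
        return self.rate


controller = PubendRateController(rate=100.0, max_rate=120.0)
for _ in range(3):
    controller.on_quiet_interval()  # climbs by add_step up to max_rate
controller.on_uca()                 # multiplicative cut on congestion
print(controller.rate)
```

Additive probing combined with a sharp cut is the usual way to search for an operating point without knowing the bottleneck capacity in advance; how the actual scheme mixes additive and multiplicative terms is not recoverable from the slide.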

9 SHB-Driven Congestion Control
Important to restrict NACK rate
–Small NACK msg can trigger many large data msgs
–Mechanism to control degree of resources spent on resent messages during recovery (recovery time)
No support from other brokers necessary
SHBs maintain NACK window
–Decide which parts of the message stream to NACK
–Observe recovery rate
–Open/close NACK window additively depending on rate change
–Similar to CC in TCP Vegas
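A NACK window that opens and closes additively might look like the sketch below. It is an assumption-laden illustration: the `NackWindow` class, the tolerance band, and the default sizes are invented for the example, and the slide gives no concrete update rule beyond "additively depending on rate change".

```python
from typing import Optional


class NackWindow:
    """Bounds how many ticks a recovering SHB may NACK at once (illustrative)."""

    def __init__(self, size: int = 64, step: int = 8, min_size: int = 8,
                 max_size: int = 1024, tolerance: float = 5.0):
        self.size = size
        self.step = step            # assumed additive step, in ticks
        self.min_size = min_size
        self.max_size = max_size
        self.tolerance = tolerance  # assumed dead band for rate changes
        self._prev_rate: Optional[float] = None

    def update(self, recovery_rate: float) -> int:
        """Feed the latest observed recovery rate; returns the new window size."""
        prev, self._prev_rate = self._prev_rate, recovery_rate
        if prev is None:
            return self.size  # first sample: nothing to compare against
        change = recovery_rate - prev
        if change > self.tolerance:
            # More NACKs bought more progress: open the window.
            self.size = min(self.max_size, self.size + self.step)
        elif change < -self.tolerance:
            # Progress dropped: resent data is congesting the path, so close.
            self.size = max(self.min_size, self.size - self.step)
        return self.size


window = NackWindow(size=64, step=8)
for rate in (100.0, 150.0, 152.0, 90.0):
    print(rate, window.update(rate))
```

Because the window is driven only by the SHB's own recovery rate, no support from other brokers is needed, which matches the slide.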

10 Implementation in Gryphon
Gryphon’s message stream is subdivided into ticks
–Discrete time interval that can hold a single message
–4 states:
  (D)ata      Msg published
  (S)ilence   No msg published
  (F)inal     Tick was garbage collected
  (Q)uestion  Unknown (send NACK)
–Doubt Horizon: position in stream of first Q tick
[Figure: tick stream over time with the doubt horizon marked]
Rate of progress of the DH as a congestion metric
–Independent from filtering and actual publication rate
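The tick states and the doubt horizon can be illustrated with a small sketch. The in-memory list representation, the `Tick` enum, and the helper names are assumptions for the example, and the convention of returning the stream length when no Q tick exists is also an assumption.

```python
from enum import Enum
from typing import Sequence


class Tick(Enum):
    DATA = "D"      # msg published
    SILENCE = "S"   # no msg published
    FINAL = "F"     # tick was garbage collected
    QUESTION = "Q"  # unknown (send NACK)


def doubt_horizon(stream: Sequence[Tick]) -> int:
    """Index of the first Q tick; len(stream) if every tick is known."""
    for position, tick in enumerate(stream):
        if tick is Tick.QUESTION:
            return position
    return len(stream)


def dh_rate(dh_before: int, dh_after: int, elapsed_seconds: float) -> float:
    """Rate of progress of the doubt horizon, in ticks per second."""
    return (dh_after - dh_before) / elapsed_seconds


before = [Tick(c) for c in "FFDSQDQ"]
after = [Tick(c) for c in "FFDSDDQ"]  # a NACK reply filled tick 4
print(doubt_horizon(before), doubt_horizon(after))
print(dh_rate(doubt_horizon(before), doubt_horizon(after), elapsed_seconds=2.0))
```

Silence ticks advance the doubt horizon just as data ticks do, which is why this metric is independent of filtering and of the actual publication rate.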

11 Experimental Evaluation
Network of dedicated broker machines
–Simple topology (4 brokers)
–Complex topology (9 brokers; asymmetric paths)
–Hundreds of publishing and subscribing clients
–Large queue sizes to maximize throughput (5-25 Mb)
Congestion was created by
–restricting bandwidth on inter-broker links
–failing inter-broker links
[Figure: simple topology of PHB, IB, SHB1 and SHB2]

12 Experiments I
Congestion due to recovery after link failure
–PDCC reduces publication rate
–SDCC keeps recovery rate steady
[Figure: msgs (kb/s) at PHB, SHB1 and SHB2, with the link failure and the recovery marked]

13 Experiments II
Congestion due to dynamic b/w limits of IB-SHB1 link
–Publication rate follows link bottleneck
–UCA msgs are received at pubend
[Figure: msgs (kb/s) at PHB, SHB1 and SHB2 and throughput ratio (0.4 to 1.2) across low, medium and low b/w phases, with UCA msgs marked]

14 Conclusions
Reliable, content-based pub/sub needs congestion control
–Characteristics different from traditional network CC
Publisher-driven and subscriber-driven congestion control
–Distinguish between recovering and non-recovering brokers
–Hybrid additive and multiplicative adjustment
–Normalised rate regardless of publication rate
–NACK window for controlled recovery
Future work
–Fairness between many pubends in the same system
–Dynamic adjustment of the DCQ rate

15 Thank you
Any Questions?

16 Related Work
TCP Congestion Control
–Point-to-point congestion control only
–Throughput-based congestion metric
Reliable Multicast
–Scalable feedback processing
–Sender-based and receiver-based schemes
–Feedback loops
Multicast ABR ATM
–Forward and Backward Resource Management Cells
–BRM cell consolidation at ATM switches
Overlay Networks
–Little work done so far
