Optimizing Buffer Management for Reliable Multicast Zhen Xiao AT&T Labs – Research Joint work with Ken Birman and Robbert van Renesse
Why important? Many applications desire reliable or semi-reliable delivery. IP multicast is best-effort. Buffering is necessary for retransmission. Buffer space is limited! How can the limited buffer space be used most efficiently?
Previous Work RMTP: Buffer all messages on repair servers. –Impractical for long-lived sessions. SRM: Regenerate messages at the application. –Buffer management at the application level remains a challenge. Stability Detection: Buffer messages until they are stable (i.e. received by all members in the group). –It takes a long time to achieve stability in a large multicast group. Bimodal Multicast: Buffer messages for a fixed amount of time. –Optimization: buffer messages on a sub-group of members.
RRMP: Randomized Reliable Multicast Protocol Key idea: combine previous work on randomized error recovery with the Bimodal Multicast protocol and hierarchical error recovery similar to that employed by tree-based protocols. Group receivers into a hierarchy. Do not use any repair server. parent region: the least upstream region of a receiver in the hierarchy. Each receiver maintains group membership information about receivers in its region and receivers in its parent region.
Two-phase Error Recovery Assume a receiver p detects a message loss. Local loss: the loss affects a fraction of the receivers in p’s region. Regional loss: the loss affects all receivers in p’s region. Local recovery: a receiver tries to recover the loss from randomly selected neighbors. Remote recovery: some receivers in the region request retransmissions from the parent region.
Overview of Buffering Scheme [Diagram labels: Error Recovery (local recovery, remote recovery); Buffering (short-term buffering, long-term buffering).] Short-term buffering: when a message is first introduced into the system. Long-term buffering: when almost all receivers in a region have received the message.
Feedback-based Short-term Buffering Idea: a member uses the retransmission requests it has received as feedback to estimate how many members in the region still miss the message. Idle message: no request for this message has been received for a time interval T (the idle threshold). Short-term buffering: buffer a received message until it becomes idle. Result: the messages most needed in the system stay in the buffer longer, with no extra traffic overhead. n: the size of a region. p: the fraction of members in the region missing a message. The probability that a member will not receive any request, assuming each member missing the message sends one request to a uniformly chosen other member of the region: $\left(1 - \frac{1}{n-1}\right)^{np}$. As $n \to \infty$, this probability can be approximated by $e^{-p}$.
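The idle-threshold rule can be sketched in a few lines of Python. This is a minimal illustration, not the RRMP implementation: the class name `ShortTermBuffer`, its method names, and the use of integer millisecond timestamps are all assumptions made for the example. The helper `prob_no_request` assumes that each of the n·p members missing a message sends one request per round to a uniformly chosen other member of the region, which gives a no-request probability of (1 - 1/(n-1))^(np), close to e^(-p) for large n.

```python
import math


def prob_no_request(n: int, p: float) -> float:
    """Probability that one member receives no request in a round.

    Assumption: round(n * p) members miss the message and each sends one
    request to a uniformly random member among the other n - 1.
    """
    missing = round(n * p)
    return (1.0 - 1.0 / (n - 1)) ** missing


class ShortTermBuffer:
    """Hypothetical short-term buffer: keep a message until it is idle."""

    def __init__(self, idle_threshold_ms: int):
        self.idle_threshold_ms = idle_threshold_ms
        # message id -> time (ms) of the initial receipt or the latest request
        self._last_activity = {}

    def on_receive(self, msg_id, now_ms: int) -> None:
        # Start the idle timer when the message first arrives.
        self._last_activity.setdefault(msg_id, now_ms)

    def on_request(self, msg_id, now_ms: int) -> bool:
        # A retransmission request is feedback that someone still misses
        # the message, so the idle timer restarts.
        if msg_id in self._last_activity:
            self._last_activity[msg_id] = now_ms
            return True
        return False

    def expire_idle(self, now_ms: int) -> list:
        # Drop every message that has seen no request for the threshold T
        # and return the dropped ids (candidates for long-term buffering).
        idle = [m for m, t in self._last_activity.items()
                if now_ms - t >= self.idle_threshold_ms]
        for m in idle:
            del self._last_activity[m]
        return idle
```

With n = 100 and p = 0.5, `prob_no_request(100, 0.5)` is within 0.01 of `math.exp(-0.5)`, so a member that sees no request for several rounds can reasonably infer that few members still miss the message.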
Simulation Results Short-term buffering in a local region. –100 members in the region, fully connected. –RTT between any two members: 10ms. –idle threshold: 40ms. Outcome of IP multicast: select a random subset of members to hold a message initially. –Measure how long these members buffer the message.
[Figure: sender (s), routers, and receivers (p, q); a request arrives for a message that is already idle: "Sorry, you are out of luck!"]
Randomized Long-term Buffering Idea: provide long-term buffering for an idle message at a small subset of receivers in each region. Load balancing: spread the load of buffering across all receivers in a region. Randomized algorithm: each member independently tosses a coin to decide whether to become a long-term bufferer. C: the expected number of long-term bufferers. Saving in buffer space: a factor of n / C. Network dynamics: message transfer
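The coin toss can be illustrated with a short Python sketch. It assumes every member uses the same bias C/n, which is the simplest choice that yields C expected bufferers among n members; the slides do not state the bias explicitly. The function name `choose_bufferers` and the `seed` parameter are hypothetical, and the loop only simulates in one place what each member would decide locally and independently.

```python
import random


def choose_bufferers(n: int, c: float, seed=None) -> list:
    """Return the indices of members that become long-term bufferers.

    Assumption: each of the n members independently keeps the idle
    message with probability c / n, so c members keep it on average.
    """
    rng = random.Random(seed)
    keep_probability = c / n
    # Each iteration stands for one member's private coin toss.
    return [member for member in range(n)
            if rng.random() < keep_probability]
```

Because no member needs to know what the others decided, the scheme requires no coordination traffic; the cost is that the number of bufferers for any given message is random rather than exactly C.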
The probability that k members buffer an idle message for different values of C, the expected number of long-term bufferers.
The probability that no member buffers an idle message decreases exponentially with C
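Both curves can be reproduced from the binomial distribution, under the assumption that each of the n members independently becomes a long-term bufferer with probability C/n. The function names below are illustrative only. The chance that nobody keeps the message is (1 - C/n)^n, which is bounded above by e^(-C) and therefore falls off exponentially as C grows.

```python
import math


def prob_k_bufferers(n: int, c: float, k: int) -> float:
    """Binomial probability that exactly k of n members buffer the message.

    Assumption: independent coin tosses with bias c / n per member.
    """
    q = c / n
    return math.comb(n, k) * q ** k * (1.0 - q) ** (n - k)


def prob_no_bufferer(n: int, c: float) -> float:
    """Probability that no member buffers the message: (1 - c/n)^n."""
    return (1.0 - c / n) ** n
```

For a region of 100 members, `prob_no_bufferer(100, 6)` is already below `math.exp(-6)`, i.e. under 0.25%.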
Search Overhead Evaluate the penalty in recovery time due to the search for a bufferer in a region with 100 members. –RTT between any two members: 10ms. –Assume a remote request arrives at a random member. –Simulation repeated 100 times with different random seeds. Question I: how does the search time change with the number of bufferers? Question II: how does the search time change with the region size?
Search time as the number of bufferers increases.
Search time as the size of the region increases
Summary Efficient buffer management is essential for reliable multicast in a large group. Two-phase buffering addresses the variance in delivery latency in a large group. Retransmission requests can be used as feedback to allocate buffer space adaptively. Randomization spreads the load of buffering among all members in a group.