CS 268: Multicast Transport
Kevin Lai
April 24, 2001


The Goal
- Transport protocol for multicast
  - Reliability: apps such as file distribution, non-interactive streaming
  - Low delay: apps such as conferencing, distributed gaming
  - Congestion control for multicast flows: critical for all applications

Reliability: The Problems
- Assume reliability through retransmission
  - even with FEC, may still have to deal with retransmission (why?)
- Sender cannot keep state about each receiver
  - e.g., what receivers have received, RTT
  - number of receivers unknown and possibly very large
- Sender cannot retransmit every lost packet
  - even if only one receiver misses a packet, the sender must retransmit, lowering throughput
- Estimating path properties is difficult
  - must estimate RTT to set retransmit timers
  - unicast algorithms (e.g., TCP) don't generalize to trees
- (N)ACK implosion
  - described next

(N)ACK Implosion
- (Positive) acknowledgements
  - ack every n received packets
  - what happens for multicast?
- Negative acknowledgements
  - only ack when data is lost
  - assume packet 2 is lost
- [Figure: sender S multicasting packets 1-3 to receivers R1, R2, R3]

NACK Implosion
- When a packet is lost, all receivers in the sub-tree rooted at the link where the packet was lost send NACKs
- [Figure: sender S and receivers R1-R3; every receiver below the lossy link sends a NACK]

Scalable Reliable Multicast (SRM) [Floyd et al '95]
- Receivers use timers to send NACKs and retransmissions
  - randomized to prevent implosion
  - uses latency estimates
    - short timer: causes duplicates when there is reordering
    - long timer: causes excess delay
- Any node retransmits
  - sender can use its bandwidth more efficiently
  - overall group throughput is higher
- Duplicate NACK/retransmission suppression

Inter-node Latency Estimation
- Every node estimates latency to every other node
  - uses session reports (< 5% of bandwidth)
  - assumes symmetric latency
  - what happens when the group becomes very large?
- [Figure: A sends a session report at time t_1; B replies after a local delay d; A receives the reply at time t_2]
- d_{A,B} = (t_2 - t_1 - d) / 2
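A minimal Python sketch of this estimate (function and variable names are illustrative, not from SRM; the formula is the one on the slide above):

```python
def estimate_one_way_latency(t1, t2, d):
    """SRM-style one-way latency estimate from a session-report exchange.

    t1: time A sent its session report (A's clock)
    t2: time A received B's report echoing t1 (A's clock)
    d:  delay at B between receiving A's report and replying

    Assumes symmetric paths, so the one-way latency is half the round
    trip minus B's turnaround time.
    """
    return (t2 - t1 - d) / 2.0

# Example: A sends at t=10.0s, B holds the timestamp for 0.5s,
# A hears the echo at t=12.5s -> d_{A,B} = (12.5 - 10.0 - 0.5)/2 = 1.0s
print(estimate_one_way_latency(10.0, 12.5, 0.5))  # 1.0
```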

Repair Request Timer Randomization
- Chosen from the uniform distribution on [2^i * C_1 * d_{S,A}, 2^i * (C_1 + C_2) * d_{S,A}], where
  - A: node that lost the packet
  - S: source
  - C_1, C_2: algorithm parameters
  - d_{S,A}: latency between S and A
  - i: iteration of repair request tries seen
- Algorithm
  - Detect loss: set timer
  - Receive request for same data: cancel timer, set new timer, possibly with new iteration
  - Timer expires: send repair request
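A small Python sketch of drawing this request timer (the function name and the scheduling comments are assumptions; the interval follows the slide above):

```python
import random

def request_timer(d_sa, i, c1, c2):
    """Draw an SRM-style repair-request timer.

    d_sa:   estimated latency from the source S to this node A
    i:      backoff iteration (number of request rounds already seen)
    c1, c2: algorithm parameters
    Returns a delay drawn uniformly from 2^i * [c1*d_sa, (c1+c2)*d_sa].
    """
    lo = (2 ** i) * c1 * d_sa
    hi = (2 ** i) * (c1 + c2) * d_sa
    return random.uniform(lo, hi)

# On detecting a loss: schedule send_request() after request_timer(d_sa, 0, C1, C2).
# On hearing someone else's request for the same data: cancel the pending timer
# and reschedule with iteration i+1 (suppression plus exponential backoff).
```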

Timer Randomization
- Repair timer is similar
  - every node that receives a repair request sets a repair timer
  - latency estimate is between the node and the node requesting the repair
- Timer properties
  - Minimize probability of duplicate packets
    - reduces likelihood of implosion (duplicates still possible: poor timer randomization granularity, high latency between nodes)
  - Reduce delay to repair
    - nodes with low latency to the sender will send the repair request more quickly
    - nodes with low latency to the requester will send the repair more quickly
    - when is this sub-optimal?

Chain Topology
- C_1 = D_1 = 1, C_2 = D_2 = 0
- All link distances are 1
- [Figure: chain of source, links L1 and L2, and receivers R1-R3, showing data, repair requests, and repairs driven by deterministic timeouts]

Star Topology
- C_1 = D_1 = 0
- Tradeoff between (1) number of requests and (2) time to receive the repair
- C_2 <= 1
  - E(# of requests) = g - 1
- C_2 > 1
  - E(# of requests) = 1 + (g - 2)/C_2
  - E(time until first timer expires) = 2*C_2/g
- [Figure: star topology with nodes N1, N2, N3, N4, ..., Ng around the source]
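A rough Monte Carlo check of these expectations under one plausible reading of the model (assumptions: g-1 receivers lose the packet, each timer is drawn uniformly from [0, 2*C2] since C1 = 0 and all link distances are 1, and a request takes 2 time units to travel between receivers via the source):

```python
import random

def simulate_star(g, c2, trials=100_000):
    """Monte Carlo check of the star-topology expectations above.

    A receiver's request is suppressed only if some earlier request
    arrives before its own timer fires.
    """
    total_requests = 0.0
    total_first = 0.0
    for _ in range(trials):
        timers = [random.uniform(0, 2 * c2) for _ in range(g - 1)]
        first = min(timers)
        # requests that fire before the earliest request can reach them
        total_requests += sum(1 for t in timers if t < first + 2)
        total_first += first
    return total_requests / trials, total_first / trials

# For g = 100, C2 = 10: predicted E[# of requests] = 1 + 98/10 = 10.8,
# predicted E[time until first timer expires] = 2*10/100 = 0.2
print(simulate_star(100, 10))
```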

Bounded Degree Tree
- Use both
  - Deterministic suppression (chain topology)
  - Probabilistic suppression (star topology)
- Large C_2/C_1: fewer duplicate requests, but larger repair time
- Large C_1: fewer duplicate requests
- Small C_1: smaller repair time

Adaptive Timers
- C and D parameters depend on topology and congestion, so choose them adaptively
- After sending a request:
  - Decrease start of request timer interval
- Before each new request timer is set:
  - If requests were sent in previous rounds, and any duplicate requests were from further away: decrease request timer interval
  - Else if average duplicate requests is high: increase request timer interval
  - Else if average duplicate requests is low and average request delay is too high: decrease request timer interval
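A sketch of how these adjustment rules might look in code (the data structure, thresholds, and the 10% step size are assumptions; only the branch conditions come from the slide):

```python
def adapt_request_timer(interval, state):
    """Adjust the request-timer interval parameters before setting a new timer.

    interval: current scaling of the request timer, e.g. {'c1': ..., 'c2': ...}
    state:    per-node statistics gathered from prior request rounds
    """
    if state['sent_request_before'] and state['dup_requests_from_farther']:
        interval['c1'] *= 0.9          # decrease start of the timer interval
    elif state['avg_dup_requests'] > state['dup_high_threshold']:
        interval['c1'] *= 1.1          # too many duplicates: back off
        interval['c2'] *= 1.1
    elif (state['avg_dup_requests'] < state['dup_low_threshold']
          and state['avg_request_delay'] > state['delay_threshold']):
        interval['c1'] *= 0.9          # few duplicates but slow repair: speed up
    return interval
```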

Local Recovery
- Some groups are very large with low loss correlation between nodes
  - Multicasting requests and repairs to the entire group wastes bandwidth
- Separate recovery multicast groups
  - e.g., hash sequence number to a multicast group address
  - only nodes experiencing loss join the group
  - recovery delay is sensitive to join latency
- TTL-based scoping
  - send request/repair with a limited TTL
  - how to set the TTL to reach a host that can retransmit?
  - how to make sure the retransmission reaches every host that heard the request?
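A tiny illustration of the hash-to-recovery-group idea (the base address, group count, and hash function are illustrative assumptions; 239.0.0.0/8 is the administratively scoped IPv4 multicast range):

```python
import ipaddress

def recovery_group(seq, base="239.1.0.0", num_groups=256):
    """Map a packet sequence number to a recovery multicast group address.

    Receivers that miss the same packet hash to the same group, so only
    nodes experiencing that loss join it.
    """
    index = hash(seq) % num_groups
    return str(ipaddress.IPv4Address(base) + index)

# Receivers that lose packet 1042 would all join the same recovery group:
print(recovery_group(1042))
```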

Application Layer Framing (ALF) [Clark and Tennenhouse 90]
- Application should define the Application Data Unit (ADU) to lower layers
  - ADU is the unit of error recovery
    - app can recover from whole-ADU loss
    - app treats partial ADU loss/corruption as whole loss
  - App names ADUs
  - App can process ADUs out of order
  - Small ADUs (e.g., a packet): low delay, keep the app busy
  - Large ADUs (e.g., a file): more efficient use of bandwidth and cycles
  - Lower layers can minimize delay by passing ADUs to apps out of order

Multicast Congestion Control Problem
- Unicast congestion control:
  - send at a rate not exceeding the smallest fair share of all links along the path
- Multicast congestion control:
  - send at the minimum of the unicast fair shares across all receivers
    - problem: what if receivers have very different bandwidths?
  - segregate receivers into multicast groups according to current available bandwidth

Issues
- What rate for each group?
- How many groups?
- How to join and leave groups?

Assumptions
- A video application
  - can easily make a size/quality tradeoff in encoding application data (i.e., a 10 Kb video frame has lower quality than a 20 Kb frame)
  - separate encodings can be combined to provide better quality
    - e.g., combine 5 Kb + 10 Kb + 20 Kb frames to provide greater quality than just 20 Kb frames
- 6 layers
- 32 x 2^i kb/s for the i-th layer
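A quick computation of the per-layer and cumulative rates these assumptions imply (the cumulative sum reflects a receiver subscribed to layers 0 through k, which is how layered multicast composes layers):

```python
# Layer rates under the stated assumption of 6 layers at 32 * 2^i kb/s.
layer_rates = [32 * 2 ** i for i in range(6)]         # [32, 64, 128, 256, 512, 1024] kb/s
cumulative = [sum(layer_rates[: k + 1]) for k in range(6)]
print(layer_rates)   # per-layer rates in kb/s
print(cumulative)    # [32, 96, 224, 480, 992, 2016] kb/s when subscribed up to layer k
```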

Example of Size/Quality Tradeoff
- [Figure: the same image encoded at several sizes, e.g., 784, 1208, 1900, 3457, and 5372 bytes]

Basic Algorithm
- Join a new layer when there is no congestion
  - joining may cause congestion
  - join infrequently when the join is likely to fail
- Drop the largest layer when there is congestion
  - congestion detected through drops
  - could use explicit feedback, delay
- How frequently to attempt a join?
- How to scale up to large groups?

Join Timer
- Set 2 timers for each layer
  - use randomization to prevent synchronization
  - join timer expires: join the next larger layer
  - detect congestion: drop layer, increase join timer, update detection timer with time since last layer add
  - detection timer expires: decrease join timer for this layer
- Layers have exponentially increasing size, so this is multiplicative increase/decrease (?)
- All parameters adapt to network conditions
- [Figure: example join timer values j_2, j_3, j_4 over time]
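A condensed sketch of an RLM-style receiver implementing the join/detection timer rules above (the class name, timer constants, randomization range, and backoff factors are assumptions; the rules follow the two slides above):

```python
import random

class RLMReceiver:
    """Minimal sketch of a receiver-driven layered multicast receiver."""

    def __init__(self, num_layers=6, base_join_timer=5.0):
        self.layer = 0                            # highest layer currently joined
        self.num_layers = num_layers
        self.join_timer = [base_join_timer] * num_layers

    def next_join_delay(self):
        # Randomize to avoid synchronized join experiments across receivers.
        return self.join_timer[self.layer] * random.uniform(0.5, 1.5)

    def on_join_timer_expired(self):
        if self.layer + 1 < self.num_layers:
            self.layer += 1                       # try the next larger layer

    def on_congestion(self):
        # Drop the largest layer and back off its join timer (multiplicative).
        self.join_timer[self.layer] *= 2
        self.layer = max(0, self.layer - 1)

    def on_detection_timer_expired(self):
        # No congestion for a while: be more willing to try this layer again.
        self.join_timer[self.layer] = max(1.0, self.join_timer[self.layer] / 2)
```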

Scaling Problems
- Independent joins do not scale
  - frequency of joins increases with group size: congestion collapse (why?)
  - joins interfere with each other: unfairness
- Could reduce the join rate
  - convergence for large groups will be slow

Scaling Solution
- Multicast join announcement (shared learning)
- Node initiates a join iff the current join is for a higher layer
- Congestion: backs off its own timer to join that layer
  - shares bottleneck with the joiner
- No congestion: joins new layer iff it was the original joiner
  - does not share bottleneck with the joiner
- Convergence could still be slow (why?)
- [Figure: source S, links L1 and L2, receivers R0-R5 illustrating shared join experiments ("join 4", "join 2") with congestion backoff]
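A sketch of the shared-learning reaction to another receiver's join announcement, building on the hypothetical RLMReceiver sketch above (argument names, and how congestion and bottleneck sharing are detected, are assumptions; the decision rules follow the slide):

```python
def on_join_announcement(receiver, announced_layer, saw_congestion,
                         shares_bottleneck, is_original_joiner):
    """React to a multicast join announcement made by some receiver."""
    if saw_congestion and shares_bottleneck:
        # The experiment's congestion reached us too: learn from the failure
        # and back off the join timer for that layer as if it were our own.
        layer = min(announced_layer, len(receiver.join_timer) - 1)
        receiver.join_timer[layer] *= 2
    elif not saw_congestion and is_original_joiner:
        # Only the original joiner actually commits to the higher layer.
        receiver.on_join_timer_expired()
```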

Simulation Results
- Higher network latency: less stability
  - congestion control is a control problem
  - control theory predicts that higher latency causes less stability
- No cross traffic
- Scales up to 100 nodes

Priority-drop and Uniform-drop [McCanne, Jacobson 1996]
- Uniform drop
  - drop packets randomly from all layers
- Priority drop
  - drop packets from higher layers first
- Sending rate <= bottleneck
  - no loss, no difference in performance
- Sending rate > bottleneck
  - important low-layer packets may be dropped: uniform-drop performance decreases
- Convex utility curve: users encouraged to remain at maximum

Later Work Contradicts [McCanne, Jacobson 1996][Bajaj, Breslau, Shenker 1998]
- Burstiness of traffic results in better performance for priority drop
  - measured in throughput, not delay
- Neither has good incentive properties
  - n flows, P(drop own packet) = 1/n, P(drop other packet) = (n-1)/n
  - need Fair Queueing for good incentive properties

Discussion
- Could this lead to congestion collapse?
- Do SRM/RLM actually scale to millions of nodes?
  - Session announcements of SRM
- Does RLM generalize to reliable data transfer?
  - What if layers are independent?
  - What about sending the file multiple times?
- Is end-to-end reliability the way to go?
  - What about hop-by-hop reliability?

Summary
- Multicast transport is a difficult problem
- One can significantly improve performance by targeting a specific application
  - e.g., bulk data transfer or video
- Depends on multicast routing working

Resilient Multicast: STORM [Rex et al '97]
- Targeted applications: continuous-media applications
  - e.g., video and audio distribution
- Resilience
  - Receivers don't need 100% of the data
  - Packets must arrive in time for repairs
  - Data is continuous, large volume
  - Old data is discarded

Design Implications
- Recovery must be fast
  - SRM not appropriate (why?)
- Protocol overhead should be small
- No ACK collection or group management

Solution
- Build an application-level recovery structure
  - Directed acyclic graph that spans the set of receivers
    - Does not include routers!
  - Typically, a receiver has multiple parents
  - Structure is built and maintained in a distributed fashion
- Properties
  - Responsive to changing conditions
  - Achieves faster recovery
  - Reduced overhead

Details
- Use multicast (expanding ring search) to find parents
- When there is a gap in sequence numbers, send a NACK
  - Note: unlike SRM, in which requests and repairs are multicast, STORM NACKs and repairs are unicast
- Each node maintains:
  - A list of parent nodes
  - A quality estimator for each parent node
  - A delay histogram for all packets received
  - A list of timers for NACKs sent to parents
  - A list of NACKs not yet served
  - Note: except for the list of NACKs shared between parent and child, all other info is local
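A possible shape for this per-node state (field names and types are assumptions; the slide only lists the pieces of state):

```python
from dataclasses import dataclass, field

@dataclass
class StormNodeState:
    """Sketch of the per-node state listed above."""
    parents: list = field(default_factory=list)          # candidate parent node ids
    parent_quality: dict = field(default_factory=dict)   # parent id -> quality estimate
    delay_histogram: dict = field(default_factory=dict)  # adjusted arrival time bucket -> count
    nack_timers: dict = field(default_factory=dict)      # (parent id, seq) -> timer deadline
    pending_nacks: list = field(default_factory=list)    # NACKs from children, not yet served

# Everything except pending_nacks (shared with children) is purely local state.
state = StormNodeState()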

Choosing a Parent
- What is a good parent?
  - Can send repairs in time
  - Has a low loss correlation with the receiver
- [Figure: cumulative distribution of received packets (%) vs. time, marking expected packet arrival, NACK, repair, packet playback, and the replay buffer B]

Choosing a Parent
- Source stamps each packet with its local time
- t_a: adjusted arrival time, where
  - t_a = packet stamp - packet arrival time
- Each node computes the loss rate as a function of t_a
- Choose the parent that maximizes the number of packets received by time t_a + B
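A sketch of parent selection under this rule, reusing the per-parent delay-histogram idea from the state sketch above (the histogram format and function names are assumptions):

```python
def parent_score(delay_histogram, t_a, B):
    """Fraction of a parent's packets that would arrive in time, i.e., by
    adjusted time t_a + B (before the replay buffer runs out).

    delay_histogram: {adjusted arrival time bucket: packet count}
    """
    total = sum(delay_histogram.values())
    if total == 0:
        return 0.0
    on_time = sum(count for t, count in delay_histogram.items() if t <= t_a + B)
    return on_time / total

def choose_parent(candidates, t_a, B):
    """Pick the candidate parent maximizing packets received by t_a + B.

    candidates: {parent id: delay_histogram}
    """
    return max(candidates, key=lambda p: parent_score(candidates[p], t_a, B))
```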

Loop Prevention
- Each receiver is assigned a level
- Parent's level < child's level
- Level proportional to the distance from the source
  - Use RTT + a random number to avoid too many nodes on the same level
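A small sketch of level assignment and the parent-acceptance check (the jitter magnitude and function names are assumptions; the rule that a parent's level must be strictly smaller comes from the slide):

```python
import random

def assign_level(rtt_to_source_ms, jitter_ms=5.0):
    """Assign a loop-prevention level proportional to distance from the source.

    Adds a small random term so nodes with similar RTTs do not all land on
    exactly the same level.
    """
    return rtt_to_source_ms + random.uniform(0.0, jitter_ms)

def acceptable_parent(parent_level, my_level):
    # A parent is only acceptable if it is strictly closer to the source,
    # which makes a repair loop impossible.
    return parent_level < my_level
```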

Adaptation
- Receivers evaluate their parents continually
- Choose a new parent when one of the current parents doesn't perform well
- Observations:
  - Changing parents is easy, as parents don't keep track of children
  - Preventing loops is easy, because of the way levels are assigned
  - Thus, no need to maintain consistent state such as the child-parent relationship