4 Latency
  1 ms of latency = 1% reduction in sales.
  100 ms of latency = 0.2% fewer searches.
  Pages 2.2 seconds faster = 60 million more downloads per year.
5 Low latency
  Why low-latency datacenters?
  Finish flows earlier: the last flow to finish determines the final result.
  Meet flow deadlines: user-facing applications have a latency goal.
6 Partition/aggregate model
  Associated component deadlines are shown in parentheses.
7 Today's transport protocols
  TCP / RCP / ICTCP / DCTCP: fair sharing
  They divide link bandwidth equally.
  They fail to reduce flow completion time.
8 What is TCP
  TCP slow start
  TCP fast recovery
  Additive increase, multiplicative decrease
  [Figure: slow start between Host A and Host B over time; one segment, then two, then four segments per RTT]
9 What is RCP
  Rate Control Protocol (RCP) is an adaptive algorithm that emulates processor sharing: a router divides outgoing link bandwidth equally.
  The rate is picked by the routers based on queue size and aggregate traffic.
  A router assigns a single rate to all flows.
  Requires no per-flow state or per-packet calculation.
10 Fairness damages completion time
  Flows Fa, Fb, Fc arrive at the same time, with sizes 1, 2, 3 and deadlines 1, 4, 6.
  Fair share: mean flow completion time = (3+5+6)/3 = 4.67
  D3 with arrival order B, A, C: mean flow completion time = (2+4+6)/3 = 4
  Shortest Job First / Earliest Deadline First: mean flow completion time = (1+3+6)/3 = 3.33
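The fair-share and SJF/EDF figures above can be checked with a short calculation. This is a minimal sketch assuming a single link of capacity 1 and flows that all start at time 0; the function names `fair_share_fct` and `serial_fct` are illustrative, not part of PDQ.

```python
from statistics import mean


def fair_share_fct(sizes, capacity=1.0):
    """Completion times when all active flows split the link equally."""
    remaining = dict(enumerate(sizes))
    finish = [0.0] * len(sizes)
    now = 0.0
    while remaining:
        share = capacity / len(remaining)
        # The flow with the least data left is the next one to finish.
        nxt = min(remaining, key=remaining.get)
        dt = remaining[nxt] / share
        now += dt
        for i in remaining:
            remaining[i] -= share * dt
        finish[nxt] = now
        del remaining[nxt]
    return finish


def serial_fct(sizes, order, capacity=1.0):
    """Completion times when flows run one at a time in the given order."""
    finish = [0.0] * len(sizes)
    now = 0.0
    for i in order:
        now += sizes[i] / capacity
        finish[i] = now
    return finish


sizes = [1, 2, 3]  # Fa, Fb, Fc
sjf_order = sorted(range(len(sizes)), key=lambda i: sizes[i])

print("fair share mean FCT:", round(mean(fair_share_fct(sizes)), 2))
print("SJF mean FCT:", round(mean(serial_fct(sizes, sjf_order)), 2))
```

In this example the deadlines 1, 4, 6 sort the flows in the same order as their sizes, so EDF and SJF produce the same schedule.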
11 Satisfy requests by order
  D3 depends on flow order.
  D3 satisfies as many flows as possible in the order of their arrival.
  Request rate = flow size / time until deadline.
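To see why arrival order matters, here is a simplified sketch of one allocation round in the style described above: each flow requests size / time-until-deadline and requests are granted first come, first served. It is an assumption-laden illustration (single link of capacity 1, one round only, no redistribution of leftover bandwidth), not the full D3 algorithm; `greedy_allocate` is a hypothetical name.

```python
def greedy_allocate(flows, capacity=1.0):
    """Grant rate requests in arrival order.

    flows: list of (name, size, time_until_deadline), in arrival order.
    Returns {name: (requested_rate, granted_rate)}.
    """
    granted = {}
    left = capacity
    for name, size, time_until_deadline in flows:
        request = size / time_until_deadline
        rate = min(request, left)  # later arrivals only get what is left
        granted[name] = (request, rate)
        left -= rate
    return granted


# Arrival order B, A, C with sizes 2, 1, 3 and deadlines 4, 1, 6.
for name, (req, got) in greedy_allocate([("B", 2, 4), ("A", 1, 1), ("C", 3, 6)]).items():
    print(name, "requested", req, "granted", got)
```

With B arriving first, A is granted less than it requests and so cannot meet its deadline in this round, even though A has the earliest deadline.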
12 The PDQ solution
  PDQ: Preemptive Distributed Quick
  Schedule by flow criticality.
  Criticality: the relative priority of flows, set by the scheduling discipline.
  Preemptive (dictionary sense): relating to the purchase of goods or shares by one person or party before the opportunity is offered to others.
13 PDQ's scheduling disciplines
  EDF (earliest deadline first): optimal for flow deadlines.
  SJF (shortest job first): optimal for mean flow finish time.
  EDF+SJF: give preference to deadline flows.
  Policy based: manually allocate flow priority.
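One way to express the EDF+SJF discipline is as a sort key. The sketch below is an assumption about how the comparison could be written, not PDQ's actual code: flows with a deadline sort before flows without one, earlier deadlines first, then shorter expected transmission time, with the flow name as a final tie-breaker. `Flow` and `criticality_key` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flow:
    name: str
    expected_tx_time: float
    deadline: Optional[float] = None  # None means deadline-unconstrained


def criticality_key(flow):
    """Smaller key means more critical."""
    deadline = flow.deadline if flow.deadline is not None else math.inf
    return (deadline, flow.expected_tx_time, flow.name)


flows = [
    Flow("bulk", 5.0),
    Flow("late", 1.0, deadline=20.0),
    Flow("urgent", 3.0, deadline=10.0),
    Flow("mouse", 0.5),
]
print([f.name for f in sorted(flows, key=criticality_key)])
```

Treating a missing deadline as infinity keeps the comparison in one tuple, so deadline flows always rank ahead of deadline-unconstrained ones.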
14 Challenges
  Decentralizing the scheduling discipline
    More mice than elephants.
  Switching between flows seamlessly
    Hard to fully utilize bandwidth.
  Prioritizing flows using FIFO tail-drop queues
    FIFO queue length is limited.
15 Outline
  Motivation
  PDQ solution to flow scheduling
  Evaluation
  Discussion
18 PDQ protocol: sender (1)
  SYN / TERM packets for flow initialization and termination; resent after a timeout.
  The sender maintains state for in-flight packets:
    Current sending rate (Rs)
    ID of the switch that paused it (Ps)
    Deadline (Ds)
    Expected flow transmission time (Ts)
    Inter-probing time (Is)
    Measured RTT (RTTs)
19 PDQ protocol: sender (2)
  The sender sends packets at rate Rs.
  If Rs = 0, it sends a probe packet periodically (a scheduling header without data).
  When an ACK arrives, it updates Rs (ACK info: accept/pause).
20 PDQ protocol: sender early termination
  The sender terminates a flow when it cannot meet its deadline, that is, whenever:
    The deadline is past.
    Current time + remaining flow transmission time > deadline.
    The flow is paused and current time + RTT > deadline.
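The three early-termination conditions translate directly into a predicate. A minimal sketch, assuming all times are in the same unit and that a paused flow is represented by a sending rate of 0; `should_terminate` and its parameter names are illustrative.

```python
def should_terminate(now, deadline, remaining_tx_time, rate, rtt):
    """Return True if the flow can no longer meet its deadline."""
    if now > deadline:
        return True  # the deadline is already past
    if now + remaining_tx_time > deadline:
        return True  # not enough time left to send the rest
    if rate == 0 and now + rtt > deadline:
        return True  # paused, and less than one RTT remains
    return False


print(should_terminate(now=5, deadline=20, remaining_tx_time=10, rate=1.0, rtt=1))
print(should_terminate(now=5, deadline=20, remaining_tx_time=16, rate=1.0, rtt=1))
```

The first call describes a flow that can still finish in time; the second describes one whose remaining transmission time no longer fits before the deadline.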
21 PDQ protocol: switch
  Let the most critical flow complete as soon as possible.
  Critical flows preempt others to achieve the highest possible sending rate.
  1) Maintain state about each flow.
  2) Compute rate feedback:
     a) a flow controller decides which flows to send
     b) a rate controller determines the rate
22 PDQ protocol: switch state
  The switch maintains flow state on each link: <Rate, P, Deadline, Expected time, RTT>
  Pi: flow i is paused by switch Pi.
  It stores only the 2K most critical flows, where K is the number of currently sending flows.
23 PDQ protocol: flow control
  Whenever a switch receives a data or ACK packet, it ACCEPTs or PAUSEs the flow.
  Pause: inform the other switches that flow f is paused.
    A switch that receives an ACK with pause for flow i removes i from its own state.
  Accept: calculate the available bandwidth.
    Other switches that receive an ACK with accept for flow i update their state for i.
24 Algorithm
  On receiving a data or ACK packet of flow f:
  1) If f is paused by another switch, remove it from my list.
  2) If f is not in my list, try to add f to my list; if that is not possible, pause f.
  3) Let W = min(availableBW, Rf). If W > 0, accept f; otherwise pause f.
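The three steps can be sketched as a small per-link class. This is a simplified illustration under stated assumptions, not the PDQ implementation: criticality is a number where smaller means more critical and values are unique, the list holds at most `max_flows` entries (at least 1), and the bandwidth available to f is the link capacity minus the rates already granted to more critical flows. `SwitchLink` and all of its fields are hypothetical names.

```python
class SwitchLink:
    def __init__(self, switch_id, capacity, max_flows):
        self.switch_id = switch_id
        self.capacity = capacity
        self.max_flows = max_flows  # assumed >= 1
        self.flows = {}  # flow_id -> {"crit": number, "rate": float}

    def on_packet(self, flow_id, crit, requested_rate, paused_by=None):
        # Step 1: the flow is paused by another switch, so forget it here.
        if paused_by is not None and paused_by != self.switch_id:
            self.flows.pop(flow_id, None)
            return ("pause", 0.0)

        # Step 2: try to add an unknown flow; pause it if there is no room.
        if flow_id not in self.flows:
            if len(self.flows) >= self.max_flows:
                worst = max(self.flows, key=lambda f: self.flows[f]["crit"])
                if self.flows[worst]["crit"] <= crit:
                    return ("pause", 0.0)
                del self.flows[worst]  # evict the least critical flow
            self.flows[flow_id] = {"crit": crit, "rate": 0.0}

        # Step 3: W = min(availableBW, Rf); accept if W > 0, else pause.
        used = sum(s["rate"] for s in self.flows.values() if s["crit"] < crit)
        w = min(self.capacity - used, requested_rate)
        if w > 0:
            self.flows[flow_id]["rate"] = w
            return ("accept", w)
        self.flows[flow_id]["rate"] = 0.0
        return ("pause", 0.0)


link = SwitchLink("s1", capacity=10.0, max_flows=2)
print(link.on_packet("a", crit=2, requested_rate=10.0))
print(link.on_packet("b", crit=1, requested_rate=10.0))
print(link.on_packet("a", crit=2, requested_rate=10.0))
```

In this sketch a less critical flow is preempted lazily: flow "a" keeps its old rate until its next packet arrives, at which point the more critical flow "b" has claimed the bandwidth and "a" is paused.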
25 Flow control: 3 optimizations
  Dampening: after a switch accepts a flow, it cannot accept other new flows for a short period of time.
  Early starting
  Suppressed probing
27 Suppressed probing
  Problem: senders may send probe packets too often.
  The flow field If tells the sender of flow f to send a probe every If * RTT.
  If is maintained by the switches, computed from the average finish time of all flows and the rank of f.
28 PDQ protocol: rate control
  Controls the total sending rate of the switch's accepted flows.
  Maintains a variable C to compute the range of rates.
  Reserves bandwidth for early-started flows.
  C = Full_BW - Queue_size / (K * RTT)
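The formula for C can be evaluated as below. A minimal sketch, assuming bandwidth in bits per second, queue size in bits, and RTT in seconds; clamping the result at zero is an added assumption so that a long queue never yields a negative budget. `rate_budget` is an illustrative name.

```python
def rate_budget(full_bw, queue_size, k, rtt):
    """C = Full_BW - Queue_size / (K * RTT), clamped at 0 (assumption)."""
    drain_rate = queue_size / (k * rtt)  # rate needed to drain the queue in K RTTs
    return max(0.0, full_bw - drain_rate)


# Hypothetical numbers: 1 Gb/s link, 100,000-bit queue, K = 2, RTT = 0.5 ms.
print(rate_budget(full_bw=1e9, queue_size=100_000, k=2, rtt=0.0005))
```

The larger the standing queue, the more of the link is held back from accepted flows so the queue can drain.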
30 Evaluation setting: traffic
  Deadline-constrained flows:
    Time sensitive: ~20 ms
    Short messages: 2 KB to 200 KB
    Goal: application throughput = percentage of flows that meet their deadlines
  Deadline-unconstrained flows:
    100 KB to 1000 KB
    Goal: average flow completion time
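Both goals are simple to compute from per-flow results. A minimal sketch, assuming each deadline-constrained flow is recorded as a (finish_time, deadline) pair and that finishing exactly at the deadline counts as meeting it; the function names are illustrative.

```python
def application_throughput(results):
    """Percentage of flows whose finish time is at or before the deadline."""
    if not results:
        return 0.0
    met = sum(1 for finish, deadline in results if finish <= deadline)
    return 100.0 * met / len(results)


def average_completion_time(finish_times):
    """Mean flow completion time for deadline-unconstrained flows."""
    return sum(finish_times) / len(finish_times)


# Hypothetical results, times in ms.
print(application_throughput([(10, 20), (25, 20), (5, 5), (30, 20)]))
print(average_completion_time([4.0, 6.0, 8.0]))
```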
32 Query aggregation
  All senders start at the same time and send to the same receiver.
  Optimal: one scheduler controls all transmissions with no delay.
  To maximize application throughput: sort by EDF, then use dynamic programming.
39 Other concerns
  Does it require rewriting applications?
    A paused PDQ flow looks like a slow TCP connection; the transport connection stays open.
  Deployment?
    Hosts: between the IP and transport layers
    Switches: modify hardware/software, O(k)