1 Improving Datacenter Performance and Robustness with Multipath TCP. Costin Raiciu, Department of Computer Science, University Politehnica of Bucharest. Thanks to: Sebastien Barre (UCL-BE), Christopher Pluntke (UCL), Adam Greenhalgh (UCL), Damon Wischik (UCL) and Mark Handley (UCL).
2 Motivation. Datacenter apps are distributed across thousands of machines, and we want any machine to be able to play any role. To achieve this: use dense parallel datacenter topologies, and map each flow to a path. Problem: naïve random allocation gives poor performance, and improving performance adds complexity. This is the wrong place to start.
3 Contributions. Multipath topologies need multipath transport. Multipath transport enables better topologies.
4 To satisfy demand, modern datacenters provide many parallel paths. Traditional topologies are tree-based: poor performance, not fault tolerant. There is a shift towards multipath topologies: FatTree, BCube, VL2, Cisco, EC2.
5 Fat Tree Topology [Fares et al., 2008; Clos, 1953]. [Figure: a K=4 FatTree; K pods with K switches each, aggregation switches above the racks of servers, all links 1Gbps. There are multiple paths between any two servers, and the network is rearrangeably non-blocking (Clos).]
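To make the topology's arithmetic concrete, here is a minimal sketch (mine, not from the talk) of the sizes of a k-ary FatTree, following the standard construction in Fares et al., 2008:

```python
# Minimal FatTree arithmetic, assuming the standard k-ary construction
# (Fares et al., 2008): k pods, each with k/2 edge (ToR) and k/2
# aggregation switches, plus (k/2)^2 core switches.

def fat_tree_sizes(k: int) -> dict:
    assert k % 2 == 0, "k must be even"
    edge_per_pod = k // 2
    hosts = k * edge_per_pod * (k // 2)  # k^3 / 4 hosts in total
    core = (k // 2) ** 2                 # one shortest path per core switch
    return {
        "pods": k,
        "core_switches": core,
        "hosts": hosts,
        # Hosts in different pods are joined by (k/2)^2 distinct shortest
        # paths, which is what multipath transport can exploit.
        "paths_between_pods": core,
    }

print(fat_tree_sizes(4))   # the K=4 example on this slide: 16 hosts, 4 paths
```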
6 Fat Tree Topology [Fares et al., 2008; Clos, 1953]. [Figure: the same K=4 FatTree, highlighting the multiple paths between a pair of servers.]
7 Collisions. [Figure: flows randomly mapped onto the FatTree's equal-cost paths collide on shared links and split their capacity, while other paths go idle.]
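To see why random allocation performs poorly, a toy Monte Carlo sketch (my illustration, not the talk's experiment): n flows each hash onto one of p equal-cost paths, and flows that collide split the path's capacity.

```python
# Toy collision model: flows pick paths uniformly at random, as ECMP-style
# hashing effectively does; each path has unit capacity shared equally.
import random
from collections import Counter

def mean_flow_throughput(n_flows: int, n_paths: int, trials: int = 5000) -> float:
    total = 0.0
    for _ in range(trials):
        picks = Counter(random.randrange(n_paths) for _ in range(n_flows))
        # A flow sharing its path with c flows gets 1/c of the capacity.
        per_flow = sum(c * (1.0 / c) for c in picks.values()) / n_flows
        total += per_flow
    return total / trials

# 8 flows over 8 equal paths: random hashing averages only about two thirds
# of the ideal per-flow throughput.
print(mean_flow_throughput(8, 8))
```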
25 MPTCP better utilizes the FatTree network. Fairness matters for apps!
26 MPTCP on EC2. Amazon EC2 is infrastructure as a service: we can borrow virtual machines by the hour, these run in Amazon data centers worldwide, and we can boot our own kernel. A few availability zones have multipath topologies: 2-8 paths are available between hosts not on the same machine or in the same rack, reachable via ECMP.
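A simplified sketch (mine; real routers use their own hash functions) of why MPTCP subflows spread over these ECMP paths: each subflow differs only in source port, so it may hash onto a different path.

```python
# ECMP-style path selection: hash the 5-tuple, index into the path set.
import hashlib

def ecmp_path(src_ip: str, dst_ip: str, sport: int, dport: int,
              proto: str, n_paths: int) -> int:
    key = f"{src_ip}|{dst_ip}|{sport}|{dport}|{proto}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big") % n_paths

# A single TCP flow is pinned to one path; four MPTCP subflows with
# different source ports can land on up to four of the 8 available paths.
for sport in (40001, 40002, 40003, 40004):
    print(sport, "->", ecmp_path("10.0.0.1", "10.0.1.1", sport, 5001, "tcp", 8))
```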
27 Amazon EC2 Experiment. 40 medium CPU instances running MPTCP. For 12 hours, we sequentially ran all-to-all iperf, cycling through TCP and MPTCP (2 and 4 subflows).
30 How many subflows are needed? How does the topology affect results? How does the traffic matrix affect results?
31 At most 8 subflows are needed. [Figure: total throughput vs. number of MPTCP subflows, with single-path TCP as the baseline.]
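A crude model (my simplification, not the paper's analysis) reproduces this diminishing-returns curve: let each of n flows open m subflows on random core paths, and take the fraction of distinct paths touched as a proxy for usable core capacity.

```python
# Proxy model: usable core capacity ~ fraction of paths touched by at least
# one subflow, assuming MPTCP's congestion control shifts load off shared paths.
import random

def core_coverage(n_flows: int, m_subflows: int, n_paths: int,
                  trials: int = 2000) -> float:
    total = 0
    for _ in range(trials):
        used = {random.randrange(n_paths)
                for _ in range(n_flows * m_subflows)}
        total += len(used)
    return total / trials / n_paths

for m in (1, 2, 4, 8, 16):
    print(f"{m} subflows: ~{core_coverage(128, m, 128):.0%} of core usable")
```

In this toy model the curve is essentially flat beyond 8 subflows, consistent with the slide's claim.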
32 MPTCP improves fairness in VL2 topologies. Fairness is important: jobs finish when the slowest worker finishes.
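A small worked example (mine) of why fairness, not just aggregate throughput, drives job completion; Jain's index is a standard fairness measure.

```python
# Two allocations with identical aggregate throughput but different fairness;
# the job finishes only when its slowest worker flow finishes.

def jain_index(rates: list) -> float:
    n = len(rates)
    return sum(rates) ** 2 / (n * sum(r * r for r in rates))

fair = [100, 100, 100, 100]      # Mbps per worker flow
skewed = [190, 150, 50, 10]      # same 400 Mbps aggregate, badly skewed

for rates in (fair, skewed):
    finish = max(1000 / r for r in rates)   # seconds to move 1000 Mb each
    print(f"Jain index {jain_index(rates):.2f}: job finishes in {finish:.0f}s")
```

Same aggregate throughput, but the skewed allocation finishes the job 10x later, which is why the slides keep stressing fairness.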
33 MPTCP improves throughput and fairness in BCube
34 Oversubscribed Topologies. To saturate full bisection bandwidth: there must be no traffic locality, all hosts must send at the same time, and host links must not be bottlenecks. So it makes sense to under-provision the network core, and this is what happens in practice. Does MPTCP still provide benefits?
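For concreteness, a back-of-the-envelope sketch with hypothetical numbers of what under-provisioning the core means, via the ToR oversubscription ratio:

```python
# Oversubscription ratio at a ToR switch: host-facing capacity / uplink
# capacity. 1:1 means full bisection bandwidth; higher means a cheaper core.
hosts_per_rack = 40   # illustrative numbers, not from the talk
host_link_gbps = 1
uplinks = 2
uplink_gbps = 10

ratio = (hosts_per_rack * host_link_gbps) / (uplinks * uplink_gbps)
print(f"{ratio:.0f}:1 oversubscribed")   # 2:1 with these numbers
```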
35 Performance improvements depend on traffic matrix. [Figure: improvement as load increases, with regions labeled Underloaded, Sweet Spot, and Overloaded.]
36 What is an optimal datacenter topology for multipath transport?
37 In single-homed topologies: host links are often bottlenecks, and ToR switch failures wipe out tens of hosts for days. Multi-homing servers is the obvious way forward.
38 Fat Tree Topology. [Figure: the single-homed FatTree, repeated for comparison with the dual-homed variant on the next slides.]
39 Fat Tree Topology. [Figure: upper pod switch, ToR switch, and servers; each server attaches to a single ToR switch.]
40 Dual Homed Fat Tree Topology. [Figure: upper pod switch, ToR switches, and servers; each server attaches to two ToR switches.]
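A hypothetical back-of-the-envelope comparison (my numbers) of what dual-homing buys once the transport can drive both interfaces:

```python
# Dual-homed server: two NICs on two different ToR switches. Single-path TCP
# is pinned to one interface per connection; MPTCP can stripe across both,
# and the server also survives the failure of either ToR switch.
nic_gbps, nics = 1, 2

tcp_ceiling = nic_gbps            # one flow, one path, one NIC
mptcp_ceiling = nics * nic_gbps   # subflows on both NICs, core permitting

print(f"TCP: {tcp_ceiling} Gbps; MPTCP: up to {mptcp_ceiling} Gbps per server")
```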
41 Is DHFT any better than Fat Tree? Not for traffic matrices that fully utilize the core. Let's examine random traffic patterns; other TMs are in the paper.
42 DHFT provides significant improvements when the core is not overloaded. [Figure: results split into core-underloaded and core-overloaded regimes.]
43 Summary. "One flow, one path" thinking has constrained datacenter design: collisions, unfairness, limited utilization. Multipath transport enables resource pooling in datacenter networks: it improves throughput, improves fairness, and improves robustness. "One flow, many paths" frees designers to consider topologies that offer improved performance for similar cost.
45 Effect of MPTCP on short flows. Flow sizes from the VL2 dataset; MPTCP enabled for long flows only (timer); oversubscribed FatTree topology. Results: completion time 79ms (TCP/ECMP) vs. 97ms (MPTCP); core utilization % (TCP/ECMP) vs. 65% (MPTCP).
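One plausible reading of "MPTCP enabled for long flows only (timer)" (my sketch, not the actual kernel mechanism): every connection starts as a single subflow, and extra subflows are added only once it outlives a short timer, so short flows avoid subflow setup and reordering costs.

```python
# Hypothetical timer-gated subflow policy; the 100ms threshold is invented
# for illustration.
import time

SUBFLOW_TIMER_S = 0.1

class Connection:
    def __init__(self) -> None:
        self.started = time.monotonic()
        self.subflows = 1            # behave like plain TCP at first

    def maybe_add_subflows(self, target: int = 8) -> None:
        # Called periodically; only long-lived flows get striped across paths.
        if time.monotonic() - self.started >= SUBFLOW_TIMER_S:
            self.subflows = target
```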
46 MPTCP vs Centralized Dynamic Scheduling. [Figure: throughput of centralized scheduling vs. MPTCP as the scheduling interval grows toward infinite.]
47 Centralized Scheduling: Setting the Threshold. [Figure: achieved throughput with a 100Mbps threshold, below which flows count as app-limited; 17% worse than Multipath TCP, short of the hoped-for 1Gbps.]
48 Centralized Scheduling: Setting the Threshold. [Figure: same setup; 21% worse than Multipath TCP.]
49 Centralized Scheduling: Setting the Threshold. [Figure: summary. With a 100Mbps threshold, centralized scheduling is 17% and 21% worse than Multipath TCP; with a 500Mbps threshold, 45% and 51% worse.]
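For context, a Hedera-style sketch (my reconstruction of the general idea, not the system evaluated here) of threshold-based centralized scheduling; the threshold is exactly the knob slides 47-49 vary.

```python
# Threshold-based central scheduler: every interval, flows sending above the
# threshold are re-placed on the least-loaded core path; flows below it are
# treated as app-limited and left wherever ECMP hashed them.

def reschedule(flows: dict, path_load: dict, threshold_mbps: float) -> None:
    """flows: {flow_id: rate_mbps}; path_load: {path_id: mbps}, updated in place."""
    for flow, rate in flows.items():
        if rate < threshold_mbps:
            continue                          # invisible to the scheduler
        best = min(path_load, key=path_load.get)
        path_load[best] += rate
        print(f"flow {flow} ({rate} Mbps) -> path {best}")

# Set the threshold high (500 Mbps) and medium flows stay stuck on colliding
# paths; set it low (100 Mbps) and the scheduler still trails MPTCP.
reschedule({"a": 800, "b": 300, "c": 60}, {0: 0.0, 1: 0.0, 2: 0.0}, 100)
```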
50 Effect of Locality in the Dual Homed Fat Tree.
51 Overloaded Fat Tree: better fairness with Multipath TCP