Presentation on theme: "Towards Predictable Datacenter Networks"— Presentation transcript:
1 Towards Predictable Datacenter Networks
Hitesh Ballani, Paolo Costa, Thomas Karagiannis and Ant Rowstron
Microsoft Research, Cambridge
2 This talk is about ...
Guaranteeing network performance for tenants in multi-tenant datacenters.
Multi-tenant datacenters: datacenters with multiple (possibly competing) tenants.
Private datacenters: run by organizations like Facebook, Intel, etc. Tenants: product groups and applications.
Cloud datacenters: Amazon EC2, Microsoft Azure, Rackspace, etc. Tenants: users renting virtual machines.
3 Cloud datacenters 101
Simple interface: tenants ask for a set of VMs.
Charging is per-VM, per-hour. Amazon EC2 small instances: $0.085/hour. No (intra-cloud) network cost.
[Diagram labels: Tenant; Request; VMs; Amazon EC2 interface]
Network performance is not guaranteed: bandwidth between a tenant's VMs depends on their placement, network load, protocols used, etc.
4 Performance variability in the wild
Up to 5x variability.
Study                 Provider                   Duration
A [Giurgui’10]        Amazon EC2                 n/a
B [Schad’10]                                     31 days
C/D/E [Li’10]         (Azure, EC2, Rackspace)    1 day
F/G [Yu’10]
H [Mangot’09]
Which instances??
5 Network performance can vary ... so what?
Data analytics on an isolated cluster (enterprise): the tenant submits a MapReduce job and gets results. Completion time: 4 hours.
Data analytics in a multi-tenant datacenter: the tenant submits a MapReduce job and gets results. Completion time: 10-16 hours.
Variable network performance can inflate the job completion time.
Variable tenant costs: expected cost (based on 4 hour completion time) = $100. Actual cost = $
Unpredictability of application performance and tenant costs is a key hindrance to cloud adoption. Key contributor: network performance variation.
6 Predictable datacenter networks
Extend the tenant-provider interface to account for the network.
Key idea: tenants are offered a virtual network with bandwidth guarantees. This decouples tenant performance from provider infrastructure.
[Diagram labels: Tenant; Request: # of VMs and network demands; Virtual network: VM 1, VM 2, ..., VM N]
Contributions:
Virtual network abstractions: to capture tenant network demands.
Oktopus: proof-of-concept system. Implements virtual networks in multi-tenant datacenters. Can be incrementally deployed today!
7 Key takeaway
Exposing tenant network demands to providers enables a symbiotic tenant-provider relationship.
Tenants get predictable performance (and lower costs).
Provider revenue increases.
9 Virtual Network Abstractions: Design Goals
Easier transition for tenants: tenants should be able to predict the performance of applications running atop the virtual network.
Provider flexibility: providers should be able to multiplex many virtual networks on the physical network.
These are competing design goals. Our abstractions strive to strike a balance between them.
[Diagram labels: Tenant; Request; Virtual network: VM 1, VM 2, ..., VM N; Virtual to physical]
10 Abstraction 1: Virtual Cluster (VC)
Motivation: in enterprises, tenants run applications on dedicated Ethernet clusters.
Request <N, B>: N VMs. Each VM can send and receive at B Mbps.
[Diagram: VM 1, VM 2, ..., VM N, each connected at B Mbps to a virtual switch.]
Total bandwidth = N * B
Tenants get a network with no oversubscription.
Suitable for data-intensive apps (MapReduce, BLAST).
Moderate provider flexibility.
11 Abstraction 2: Virtual Oversubscribed Cluster (VOC)
Motivation: many applications moving to the cloud have localized communication patterns. Applications are composed of groups with more traffic within groups than across groups.
Request <N, B, S, O>: N VMs in groups of size S, with oversubscription factor O.
[Diagram: Group 1 to Group N/S, each with VM 1 ... VM S connected at B Mbps to a group virtual switch; each group virtual switch connects to the root virtual switch at B * S / O Mbps.]
VMs can send traffic to group members at B Mbps.
Total bandwidth at VMs = N * B
Total bandwidth at root = N * B / O
No oversubscription for intra-group communication. Intra-group communication is the common case!
Oversubscription factor O for inter-group communication (captures the sparseness of inter-group communication).
VOC capitalizes on tenant communication patterns:
Suitable for typical applications (though not all).
Improved provider flexibility.
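The bandwidth figures on this slide follow directly from the request parameters. The sketch below is only an illustration of that arithmetic; the function name `voc_bandwidths` and its dictionary keys are hypothetical and do not come from the talk, and it assumes N is a multiple of S.

```python
def voc_bandwidths(n_vms, b_mbps, group_size, oversub):
    """Bandwidths implied by a VOC request <N, B, S, O> (illustrative only).

    Assumes N is a multiple of S so that there are exactly N/S full groups.
    """
    if n_vms % group_size != 0:
        raise ValueError("this sketch assumes N is a multiple of S")
    if oversub <= 0:
        raise ValueError("oversubscription factor O must be positive")
    return {
        "groups": n_vms // group_size,
        # Each VM connects to its group virtual switch at B Mbps.
        "per_vm_mbps": b_mbps,
        # Each group virtual switch connects to the root at B * S / O Mbps.
        "group_uplink_mbps": b_mbps * group_size / oversub,
        # Total bandwidth at the VMs is N * B.
        "total_at_vms_mbps": n_vms * b_mbps,
        # Total bandwidth at the root is N * B / O.
        "total_at_root_mbps": n_vms * b_mbps / oversub,
    }


if __name__ == "__main__":
    # Hypothetical request: 100 VMs, 100 Mbps each, groups of 10, O = 10.
    for key, value in voc_bandwidths(100, 100, 10, 10).items():
        print(key, value)
```

With O = 1 the root bandwidth equals N * B, which matches the total bandwidth stated for the Virtual Cluster abstraction.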
12 Oktopus
Offers virtual networks to tenants in datacenters.
13 Oktopus
Offers virtual networks to tenants in datacenters.
Two main components:
Management plane: allocation of tenant requests. Allocates tenant requests to physical infrastructure. Accounts for tenant network bandwidth requirements.
Data plane: enforcement of virtual networks. Enforces tenant bandwidth requirements. Achieved through rate limiting at end hosts.
15 Allocating Virtual Clusters
Datacenter physical topology: 4 physical machines, 2 VM slots per machine. Some slots hold a VM for an existing tenant.
Tenant request: allocate a tenant asking for 3 VMs arranged in a virtual cluster with 100 Mbps each, i.e. <3 VMs, 100 Mbps>.
Consider an allocation of tenant VMs to physical machines. Tenant traffic traverses the highlighted links.
What bandwidth needs to be reserved for the tenant on this link? The link divides the virtual tree into two parts. Consider all traffic from the left part to the right part:
Max sending rate = 2 * 100 = 200 Mbps
Max receive rate = 1 * 100 = 100 Mbps
B/W needed on link = min(200, 100) = 100 Mbps
For a virtual cluster <N, B>, the bandwidth needed on a link that connects m VMs to the remaining (N - m) VMs is min(m, N - m) * B.
For a valid allocation: bandwidth needed <= link's residual bandwidth.
How to find a valid allocation?
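The per-link reservation rule above can be written down in a few lines. This is a minimal sketch of the formula min(m, N - m) * B and the validity check from the slide; the function names are hypothetical and are not taken from Oktopus.

```python
def vc_link_bandwidth(m, n_vms, b_mbps):
    """Bandwidth to reserve on a link that separates m of the tenant's VMs
    from the remaining (N - m) VMs of a virtual cluster <N, B>."""
    if not 0 <= m <= n_vms:
        raise ValueError("m must be between 0 and N")
    # The m VMs can send at most m * B; the other side can receive at most
    # (N - m) * B. The smaller of the two bounds the traffic on the link.
    return min(m, n_vms - m) * b_mbps


def link_is_valid(m, n_vms, b_mbps, residual_mbps):
    """Validity condition: bandwidth needed <= link's residual bandwidth."""
    return vc_link_bandwidth(m, n_vms, b_mbps) <= residual_mbps


if __name__ == "__main__":
    # Slide example: <3 VMs, 100 Mbps>, with 2 VMs on one side of the link.
    print(vc_link_bandwidth(2, 3, 100))  # min(200, 100)
```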
16 Greedy allocation algorithm
Request: <3 VMs, 100 Mbps>
How many VMs can be allocated to this machine?
Constraints on the number of VMs (m) that can be allocated to the machine:
VMs can only be allocated to empty slots: m <= 1
3 VMs are requested: m <= 3
Enough b/w on outbound link: min(m, 3 - m) * 100 <= 200
Solution: at most 1 VM for this tenant can be allocated here.
Key intuition: validity conditions can be used to determine the number of VMs that can be allocated to any level of the datacenter: machines, racks and so on.
Greedy allocation algorithm: traverse up the hierarchy and determine the lowest level at which all 3 VMs can be allocated.
[Diagram labels: 100 Mbps, 1000, 200; 2 VMs, 3 VMs, 2 VMs, 1 VM]
Allocation is fast and efficient. Packing VMs together is motivated by the fact that datacenter networks are typically oversubscribed. Allocation can be extended for goals like failure resiliency, etc.
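The three constraints for a single machine can be checked mechanically. The sketch below covers only that per-machine step, not the full traversal of the hierarchy, and the name `max_vms_on_machine` is hypothetical rather than taken from Oktopus.

```python
def max_vms_on_machine(empty_slots, n_vms, b_mbps, residual_mbps):
    """Largest m that satisfies the slide's three constraints for one machine:
    m <= empty slots, m <= N, and min(m, N - m) * B <= residual bandwidth
    on the machine's outbound link."""
    upper = min(empty_slots, n_vms)
    # Search downward: the bandwidth condition is not monotonic in m
    # (it also holds when m is close to N), so test every candidate.
    for m in range(upper, -1, -1):
        if min(m, n_vms - m) * b_mbps <= residual_mbps:
            return m
    return 0


if __name__ == "__main__":
    # Slide example: 1 empty slot, request <3 VMs, 100 Mbps>, 200 Mbps residual.
    print(max_vms_on_machine(1, 3, 100, 200))
```

Applying the same check to the links above racks and higher levels is what lets the greedy algorithm find the lowest level that can hold all requested VMs.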
18 Enforcing Virtual Networks
Allocation algorithms assume that no VM exceeds its bandwidth guarantees.
Enforcement of virtual networks, to satisfy the above assumption:
Limit tenant VMs to the bandwidth specified by their virtual network.
Irrespective of the type of tenant traffic (UDP/TCP/...).
Irrespective of the number of flows between the VMs.
19 Enforcement in Oktopus: Key highlights
Oktopus enforces virtual networks at end hosts:
Use egress rate limiters at end hosts.
Implement on hypervisor/VMM.
Oktopus can be deployed today:
No changes to tenant applications.
No network support.
Tenants without virtual networks can be supported. Good for incremental rollout.
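To make the idea of an egress rate limiter concrete, here is a generic token-bucket sketch. It is an assumption chosen for illustration, not a description of how Oktopus implements its limiters; the class name, parameters and byte-based units are all hypothetical.

```python
class TokenBucket:
    """Generic token-bucket limiter (illustrative, not Oktopus code).

    Tokens are bytes. They refill at `rate_bytes_per_s` up to `burst_bytes`.
    The caller passes the current time explicitly, which keeps the sketch
    deterministic and easy to test.
    """

    def __init__(self, rate_bytes_per_s, burst_bytes, now=0.0):
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = now

    def allow(self, nbytes, now):
        """Return True if a packet of `nbytes` may be sent at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # over the limit: the caller would queue or drop


if __name__ == "__main__":
    limiter = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
    for t in (0.0, 0.5, 1.5):
        print(t, limiter.allow(1500, now=t))
```

Because the limiter counts bytes leaving the VM regardless of protocol or flow count, it matches the requirement that enforcement be independent of traffic type and number of flows.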
21 Evaluation
Oktopus deployment on a 25-node testbed: benchmark the Oktopus implementation and cross-validate simulation results.
Large-scale simulation: allows us to quantify the benefits of virtual networks at scale.
The use of virtual networks benefits both tenants and providers.
22 Datacenter Simulator
Flow-based simulator.
16,000 servers and 4 VMs/server, i.e. 64,000 VMs.
Three-tier network topology (10:1 oversubscription).
Tenants submit requests for VMs and execute jobs. Job: VMs process and shuffle data between each other.
Baseline: representative of today's setup. Tenants simply ask for VMs. VMs are allocated in a locality-aware fashion.
Virtual network request: tenants ask for a Virtual Cluster (VC) or a Virtual Oversubscribed Cluster (VOC).
23 Private datacenters
Execute a batch of 10,000 tenant jobs. Jobs vary in network intensiveness (the bandwidth at which a job can generate data).
VC is Virtual Cluster. VOC-10 is Virtual Oversubscribed Cluster with oversubscription = 10.
[Figure annotations: "Worse" / "Better"; "Jobs become more network intensive"]
Virtual networks improve completion time. VC: 50% of Baseline. VOC-10: 31% of Baseline.
24 Private datacenters
With virtual networks, tenants get guaranteed network b/w. Job completion time is bounded.
With Baseline, tenant network b/w can vary significantly. Job completion time varies significantly. For 25% of jobs, completion time increases by >280%. Lagging jobs hurt datacenter throughput.
Virtual networks benefit both tenants and the provider.
Tenants: job completion is faster and predictable.
Provider: higher datacenter throughput.
25 Cloud Datacenters
Tenant job requests arrive over time. Jobs are rejected if they cannot be accommodated on arrival (representative of cloud datacenters).
[Figure annotations: "Worse" / "Better"; "Job requests arrive faster"; "Amazon EC2's reported target utilization"]
Rejected requests: Baseline: 31%. VC: 15%. VOC-10: 5%.
26 Tenant Costs
What should tenants pay to ensure provider revenue neutrality, i.e. provider revenue remains the same with all approaches?
Based on today's EC2 prices, i.e. $0.085/hour for each VM.
Provider revenue increases while tenants pay less.
At 70% target utilization, provider revenue increases by 20% and median tenant cost reduces by 42%.
27 Oktopus Deployment
Implementation scales well and imposes low overhead.
Allocation of virtual networks is fast: in a datacenter with 10^5 machines, median allocation time is 0.35 ms.
Enforcement of virtual networks is cheap: use the Traffic Control API to enforce rate limits at end hosts.
Deployment on a testbed with 25 end hosts, arranged in five racks.
28 Oktopus Deployment
Cross-validation of simulation results: completion time for jobs in the simulator matches that on the testbed.
29 Summary
Proposal: offer virtual networks to tenants.
Virtual network abstractions: resemble physical networks in enterprises. Make the transition easier for tenants.
Proof of concept: Oktopus. Tenants get guaranteed network performance. Sufficient multiplexing for providers.
Win-win: tenants pay less, providers earn more!
How to determine tenant network demands? Ongoing work: map high-level goals (like desired completion time) to Oktopus abstractions.
32 Other Abstractions
“These are my abstractions and if you don’t like them, I have others” ... paraphrasing Groucho Marx
Amazon EC2 Cluster Compute: guaranteed 10 Gbps bandwidth (at a high cost though). Tenants get a <N, 10 Gbps> Virtual Cluster.
Virtual Datacenter Networks: e.g., SecondNet offers tenants pairwise bandwidth guarantees. Tenants get a clique virtual network. Suitable for all tenants, but limited provider flexibility.
Virtual networks from the HPC world: many direct-connect topologies, like hypercube, butterfly networks, etc.
34 Allocation algorithms
Goals for allocation:
Performance: bandwidth between VMs.
Failure resiliency: VMs in different failure domains.
Energy efficiency: packing VMs to minimize power.
...
Oktopus allocation protocols can be extended to account for goals beyond bandwidth requirements.
35 Oktopus: Nits and Warts 1
Oktopus focuses on guaranteed internal network bandwidth for tenants and is a first step towards predictable datacenters.
Other contributors to performance variability:
Bandwidth to storage tier.
External network bandwidth.
Virtual networks provide a concise means to capture tenant demands for such resources.
36 Oktopus: Nits and Warts 2
Oktopus semantics: tenants get the bandwidth specified by their virtual network (nothing less, nothing more!).
Spare network capacity: used by tenants without virtual networks.
Work-conserving solution:
Tenants get guarantees for minimum bandwidth.
Spare network capacity is shared amongst tenants who can use it.
Can be achieved through work-conserving enforcement mechanisms.
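One way to picture the work-conserving variant is a two-step split of a single link: first satisfy each tenant up to its guarantee, then hand out whatever is left. The slide does not say how the spare capacity would be divided, so the equal (max-min style) division below is purely an assumption for illustration, as are the function and parameter names. The sketch also assumes the guarantees sum to no more than the link capacity.

```python
def work_conserving_shares(capacity, guarantees, demands):
    """Split one link's capacity among tenants (illustrative sketch).

    Step 1: each tenant gets min(demand, guarantee).
    Step 2: spare capacity is divided equally among tenants that still
            have unmet demand, repeating until the spare or the demand
            runs out (an assumed max-min style policy).
    """
    alloc = {t: min(demands[t], guarantees[t]) for t in guarantees}
    spare = capacity - sum(alloc.values())
    active = {t for t in guarantees if demands[t] > alloc[t]}
    eps = 1e-9
    while spare > eps and active:
        share = spare / len(active)
        for t in list(active):
            extra = min(share, demands[t] - alloc[t])
            alloc[t] += extra
            spare -= extra
            if demands[t] - alloc[t] <= eps:
                active.discard(t)  # demand met, stop giving it spare
    return alloc


if __name__ == "__main__":
    # Hypothetical 1000 Mbps link with three tenants guaranteed 300 Mbps each.
    print(work_conserving_shares(
        1000,
        {"A": 300, "B": 300, "C": 300},
        {"A": 100, "B": 600, "C": 1000},
    ))
```

Under this policy no tenant with enough demand ever receives less than its guarantee, while unused guarantees are not wasted.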
37 Hose Model
Flexible expression of tenant demands in VPN settings [Sigcomm 1999].
Same as the virtual cluster abstraction. Better than the pipe model.
The allocation problem is different:
Virtual clusters: VMs can be allocated anywhere.
Hose model: tenant locations are fixed. Need to determine the mapping of virtual to physical links.