Towards Predictable Datacenter Networks


1 Towards Predictable Datacenter Networks
Hitesh Ballani, Paolo Costa, Thomas Karagiannis, Ant Rowstron. SIGCOMM 2011.
Presenter: Lili Sun, 2019/11/16

2 Outline
Motivation and Goals
Virtual Network Abstractions
Oktopus
Evaluation
Conclusion
Discussion Clues

3 Backgrounds
Datacenter types: cloud datacenter and production datacenter
Interface: computing resources and storage resources
[Figure: cloud datacenter and production datacenter. Labels: tenant, provider, interface, virtual network (VMs), physical network, computing resource, storage resource.]

4 Motivation and Goals
Motivation: network performance variability
Cloud datacenter: variability depends on system load and VM placement
Production datacenter: variable network bandwidth
Challenges: unstable application performance, unpredictable tenant costs, provider revenue loss
Goals: guaranteed application performance, predictable tenant costs, improved provider revenue

5 Virtual Network Abstractions
Two abstractions: virtual cluster (VC) and virtual oversubscribed cluster (VOC)
Design goals
Tenant suitability: give tenants an intuitive way to reason about network performance
Provider flexibility: let providers multiplex many virtual networks on their physical network

6 Virtual cluster
Tenant request: <N, B>
All-to-all traffic patterns
Suitable for data-intensive applications

7 Virtual oversubscribed cluster
Tenant request: <N, B, S, O>
Local communication patterns
Suitable for applications that have special communication patterns
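The two request types can be sketched as small data structures. This is a minimal illustration, not part of the slides: the class and field names are hypothetical, and the parameter meanings (N VMs, per-VM bandwidth B, group size S, oversubscription factor O, and a per-group uplink of S * B / O) are taken from the Oktopus paper rather than from this transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualCluster:
    """Request <N, B>: N VMs, each with bandwidth B to one virtual switch."""
    n_vms: int
    bandwidth_mbps: float


@dataclass(frozen=True)
class VirtualOversubscribedCluster:
    """Request <N, B, S, O>: N VMs in groups of S, oversubscription factor O."""
    n_vms: int
    bandwidth_mbps: float
    group_size: int
    oversubscription: float

    def num_groups(self) -> int:
        # Ceiling division: the last group may be only partially filled.
        return -(-self.n_vms // self.group_size)

    def inter_group_bandwidth_mbps(self) -> float:
        # Bandwidth of each group's uplink to the root virtual switch: S * B / O.
        return self.group_size * self.bandwidth_mbps / self.oversubscription
```

With O = 1 the oversubscribed cluster has as much inter-group bandwidth as intra-group bandwidth, so a larger O expresses a weaker guarantee between groups.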

8 Oktopus
Tenants can opt for: a virtual cluster, a virtual oversubscribed cluster, or no virtual network
Two main components
Management plane: requests and accounts for network resources and maintains bandwidth reservations
Data plane: enforces the bandwidth available
Network manager
Meets the bandwidth demands
Maximizes the number of tenants

9 Cluster Allocation
A virtual cluster request r: <N, B>
Topology: tree-like physical network
Bandwidth required on link L:
[Figure: example tree with link bandwidth labels of 200 Mbps and 100 Mbps]
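The formula for the bandwidth required on link L did not survive in the transcript. In the Oktopus paper, removing L splits the tenant's tree into two components holding m and N - m of its VMs, and the bandwidth needed on L is min(m, N - m) * B, because the smaller side bounds the traffic that can cross the link. A minimal sketch under that assumption (the function and parameter names are hypothetical):

```python
def vc_link_bandwidth(n_vms: int, vms_below_link: int, bandwidth: float) -> float:
    """Bandwidth a virtual cluster <N, B> needs on a link.

    `vms_below_link` is m, the number of the tenant's VMs on one side of
    the link; the other N - m VMs are on the other side.
    """
    if not 0 <= vms_below_link <= n_vms:
        raise ValueError("vms_below_link must be between 0 and n_vms")
    # Each VM sends at most B, so the smaller side limits the crossing traffic.
    return min(vms_below_link, n_vms - vms_below_link) * bandwidth


# Example: 6 VMs at 100 Mbps each, 2 of them below the link.
print(vc_link_bandwidth(6, 2, 100.0))
```

Note that placing all N VMs below a link requires no bandwidth on it, which is why packing a request into a small sub-tree is attractive.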

10 Allocation Algorithm
Allocate VMs to a sub-tree (a machine, a rack, or a pod)
Constraints: the number of empty VM slots in the sub-tree and the residual bandwidth on the physical link
For a machine: both constraints limit how many VMs it can host
Within the same level: choose the sub-tree with the least amount of residual bandwidth
Across levels: start from the lowest level (physical machine < rack < pod)
Goals: keep greater outbound bandwidth available and accommodate more future tenants
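The two constraints and the same-level choice can be sketched as follows. This is a simplified illustration rather than the full Oktopus algorithm: it assumes the link requirement min(m, N - m) * B from the paper, and the `Subtree` record and both function names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Subtree(NamedTuple):
    name: str
    max_vms: int          # most VMs of this request the sub-tree can host
    residual_bw: float    # residual bandwidth on its uplink


def allocatable_counts(empty_slots: int, residual_bw: float,
                       n_vms: int, bandwidth: float) -> List[int]:
    """VM counts m that one machine could host for a request <N, B>."""
    return [
        m for m in range(min(empty_slots, n_vms) + 1)
        # The uplink must carry min(m, N - m) * B for this tenant.
        if min(m, n_vms - m) * bandwidth <= residual_bw
    ]


def pick_subtree(candidates: List[Subtree], n_vms: int) -> Optional[Subtree]:
    """Among same-level sub-trees that fit the request, take the one with
    the least residual uplink bandwidth, as the slide describes."""
    feasible = [c for c in candidates if c.max_vms >= n_vms]
    return min(feasible, key=lambda c: c.residual_bw, default=None)


# A machine with 6 free slots and 150 Mbps left, for a request <6, 100 Mbps>.
print(allocatable_counts(6, 150.0, 6, 100.0))
```

The feasible counts need not be contiguous: hosting almost all of a tenant's VMs on one machine can need less uplink bandwidth than hosting half of them.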

11 Oversubscribed Cluster Allocation
An oversubscribed cluster request: <N, B, S, O>
The total bandwidth required by group i on link L:
The bandwidth to be reserved on link L for request r is the sum across all the groups

12 Allocation Algorithm
Allocating an individual group is similar to allocating a virtual cluster, so the cluster allocation algorithm is reused
Conditional bandwidth needed for the jth group of request r on link L:
The bandwidth required by groups [1,…,i] on L:
Allocate VMs to sub-tree v:

13 Enforcing Virtual Network
Rate limiting mechanism
Traditional way: bandwidth reservation at switches
Oktopus: endhost-based rate enforcement
Design
Enforcement module: measures the traffic rate to other VMs
Controller VM: calculates the max-min fair share
Enforcement module: uses per-destination-VM limiters to enforce the calculated rates
Advantages
Calculating rates at one controller VM per tenant reduces control traffic
Enforcement modules enable distributed rate limits
Tenant-specific computation reduces the scale of the problem: rates are computed for each virtual network
[Figure: each VM i has an enforcement module EM i that measures traffic rates, sends them to the controller VM, and applies per-destination-VM limiters. The controller VM calculates the max-min fair share and returns the rates.]
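The max-min fair share that the controller VM calculates can be illustrated with the standard water-filling procedure. This sketch is a single-bottleneck simplification, not the Oktopus controller itself: it divides one shared capacity among flows with known demands, and the function name and inputs are assumptions made for illustration.

```python
from typing import Dict


def max_min_fair_share(demands: Dict[str, float], capacity: float) -> Dict[str, float]:
    """Split `capacity` among flows so that no flow gets more than it
    demands and unsatisfied flows all receive the same share."""
    allocation: Dict[str, float] = {}
    remaining = capacity
    # Serve the smallest demands first; what they leave over is shared
    # by the larger ones.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        share = remaining / len(pending)
        name, demand = pending[0]
        if demand <= share:
            # This flow is fully satisfied and frees up the difference.
            allocation[name] = demand
            remaining -= demand
            pending.pop(0)
        else:
            # Every remaining flow wants more than an equal share.
            for other, _ in pending:
                allocation[other] = share
            break
    return allocation


print(max_min_fair_share({"vm1": 20.0, "vm2": 50.0, "vm3": 80.0}, 100.0))
```

In the design on the slide, limits computed this way would be handed back to the enforcement modules, which apply them through per-destination-VM limiters.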

14 Enforcing Virtual Network
Tenants without a virtual network
Two-level priorities: traffic from tenants with a virtual network is high priority; other traffic is low priority and gets a fair share
Unused capacity of VMs with a virtual network
Weighted sharing mechanisms: unused capacity is distributed among all tenants

15 Design Discussion
NM and routing
The allocation assumes that the datacenter has a simple tree topology
For topologies with limited path diversity: multiple physical links can be treated as a single aggregate link
For even richer network topologies: the NM can control datacenter routing to build tenant-specific trees
Failures
For failures of physical links and switches, the allocation algorithms can be extended to determine the tenant VMs that need to be migrated and reallocated

16 Evaluation
Simulation setup
Tc: minimum compute time for the job
Tn: the time for the last flow to finish
T = max(Tc, Tn): the completion time
Tn < Tc: minimizes the tenant's cost
Baseline: purely VM-based resource allocation with a locality-aware allocation algorithm; a flow’s bandwidth is calculated according to max-min fairness
Virtual network request
A baseline request <N> can be expressed as <N, B> or <N, B, S, O>
Simulation breadth
Covers the entire space for most parameters of interest in today’s datacenters: tenant bandwidth requirements, datacenter load, and physical topology oversubscription

17 Production Datacenter Experiment
Job completion time

18 Production Datacenter Experiment
Utilization: the allocation of VMs does not account for network demands

19 Production Datacenter Experiment
Diverse communication patterns: each tenant VM requires a different bandwidth

20 Cloud Datacenter Experiment
Rejected requests: tenant dynamics with requests arriving over time; admission control scheme

21 Cloud Datacenter Experiment
Tenant costs and provider revenue: tenants are charged based on the time they occupy their VMs

22 Cloud Datacenter Experiment
Charging for bandwidth: virtual network abstractions allow explicitly charging for network bandwidth
For a request <N, B> held for time T, tenant cost: [formula] or [formula]

23 Results and conclusion
Virtual network abstractions
Are practical, can be efficiently implemented, and provide significant benefits
Provide a simple way to exchange information between tenants and providers
Tenants expose their network requirements and pick the trade-off between application performance and cost
Providers account for network resources and improve their revenue

24 Discussion clues
Actual bandwidth requirement: Many tenants do not know exactly how much bandwidth their applications need, so how should this be handled? Unlike computing and storage resources, one tenant's bandwidth use affects other tenants because the total bandwidth is limited. Besides the pricing model, how can we make sure that a tenant’s bandwidth requirement is appropriate (not too much or too little)? For example, a monitoring system could report actual demands to tenants.
Failure of tenant VMs: For the oversubscribed cluster, if a tenant VM fails, does only the failed VM need to be migrated and reallocated, or all tenant VMs in its group? Communication between the reallocated VM and the other VMs will increase the bandwidth required from the underlying physical infrastructure.
Description of network bandwidth resources: How can network bandwidth requirements be described? There are no datasets describing job bandwidth requirements.
Network security: Compared to a physical switch, a virtual switch has weaker monitoring capability, so how can network security be ensured?

25 Thank you!

