Understanding Data Center Traffic Characteristics. Theophilus Benson¹, Ashok Anand¹, Aditya Akella¹, Ming Zhang². University of Wisconsin – Madison.

1 Understanding Data Center Traffic Characteristics
Theophilus Benson¹, Ashok Anand¹, Aditya Akella¹, Ming Zhang²
¹University of Wisconsin – Madison, ²Microsoft Research

2 Data Centers Background
Built to optimize cost and performance
Tiered architecture
  3 layers: edge, aggregation, core
  Cheap devices at the edge and expensive devices at the core
Over-subscription of links closer to the core
  Fewer links towards the core reduce cost
  Trade negligible loss/delay for fewer devices and links

3 Data Centers Today
[Figure: Cisco canonical DC architecture. Edge: cheap and abundant devices, many little links. Core: expensive and scarce devices, few large links.]

4 Challenges in Designing for Data Centers
Very little is known about data centers
  No models for evaluation
Lack of knowledge affects evaluation
  Use properties of wide area network traffic
  Make up traffic matrices/random traffic patterns
Insufficient for the following reasons
  Can't accurately compare techniques
  Oblivious to actual characteristics of data centers

5 Data Center Traffic Characterization
Goals of our project
  Understand low-level characteristics of traffic in data centers
  What is the arrival process?
  Is it similar to or distinct from wide area networks?
  How does low-level traffic impact the data center?

6 Data Center Traffic Characterization
In studying data center traffic we found that:
  Few links experience loss
  Many links are unutilized
  Traffic adheres to an ON-OFF pattern
  Arrival process is log-normal

7 Outline
Background
Goals
Data set
Observations and insights
Overview of traffic generator (see paper for details)
Conclusion

8 Data Sets
Data from 19 data centers
  Differences in size and architecture
  Data for intranet and extranet server farms
  Applications: messaging, search, video streaming, email
Data consists of
  Packet traces from edge switches in one data center
  SNMP MIB of devices in all data centers
Data collected over a span of 10 days

Type     # of DC   Mean Size (# of Dev)
2-Tier   10        13
3-Tier   9         363

9 Analyzing SNMP Data

                                    Core   Aggregation   Edge
% of links used                     59     73            57
% of links with at least one loss   4      3             2

Analyze link utilization and drops
  Analysis from one 5-minute interval
Lot of unutilized links
  Back-up/redundant links
Aggregation layer has the most used links
  Funneling of traffic from aggregation
Very few links with losses
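
The two table rows above can be computed from per-link counter deltas. The following is a minimal sketch, assuming each link contributes one record per 5-minute interval. The `LinkSample` record and its field names are hypothetical and do not come from the slides or from any SNMP library.

```python
from dataclasses import dataclass


@dataclass
class LinkSample:
    """Counters for one link over one SNMP polling interval (hypothetical layout)."""
    layer: str            # "core", "aggregation", or "edge"
    bytes_sent: int       # change in the octet counter over the interval
    packets_dropped: int  # change in the discard counter over the interval


def summarize(samples):
    """Return {layer: (percent of links used, percent of links with loss)}."""
    totals = {}
    for s in samples:
        used, lossy, count = totals.get(s.layer, (0, 0, 0))
        totals[s.layer] = (
            used + (s.bytes_sent > 0),        # a link is "used" if it carried any bytes
            lossy + (s.packets_dropped > 0),  # a link has loss if it dropped any packet
            count + 1,
        )
    return {
        layer: (100.0 * used / count, 100.0 * lossy / count)
        for layer, (used, lossy, count) in totals.items()
    }


edge_links = [
    LinkSample("edge", 1200, 0),
    LinkSample("edge", 0, 0),
    LinkSample("edge", 800, 3),
    LinkSample("edge", 0, 0),
]
summary = summarize(edge_links)
```

Treating any non-zero byte count as "used" is an assumption. The slides do not state the cut-off behind the table.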

10 Analyzing SNMP Data: Link Utilization
95th percentile used
Core > Edge > Aggregation
  Core has fewest links
  Edge has smaller (1 Gbps) links, so higher utilization than aggregation
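
As a rough illustration of the "95th percentile used" bullet, the sketch below converts byte counts into utilization and takes a nearest-rank 95th percentile. Both function names are hypothetical, and the nearest-rank definition is an assumption, since the slides do not say which percentile definition was used.

```python
def utilization(byte_delta, interval_s, capacity_bps):
    """Fraction of link capacity used over one polling interval."""
    return 8 * byte_delta / (interval_s * capacity_bps)


def percentile_95(values):
    """Nearest-rank 95th percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


# One 5-minute sample on a 1 Gbps link that carried 3.75 GB.
u = utilization(3_750_000_000, 300, 1_000_000_000)
p95 = percentile_95(range(1, 101))
```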

11 Analyzing SNMP Data: Link Loss Rates
Loss: Aggregation > Edge > Core
Utilization: Core > Edge > Aggregation
Core has relatively little loss but high utilization
  All links lose less than 2% of packets
  Aggregation of flows leads to stability
Edge and aggregation have significantly higher losses
  Few links (20%) experience high losses (over 40%)
  Most likely due to bursty traffic

12 Insights from SNMP
Loss is localized to a few links (4%)
Loss may be avoided by utilizing all links
  40% of links are unused in some areas
  Reroute traffic
  Move applications/migrate virtual machines
Inverse correlation between loss and utilization
  Should examine low-level packet traces
  Traces from same 10 days as SNMP

13 Analyzing Packet Traces
Time series of traffic on an edge link
ON-OFF traffic at edges
  Time series shows ON-OFF patterns
  Binned at 15 ms and 100 ms
  ON-OFF persists
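
The binning step and the ON-OFF view of a trace can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and ending an ON period whenever the gap between packets exceeds a fixed `gap_threshold` is an assumption, not a rule stated in the slides.

```python
def bin_counts(timestamps, bin_width):
    """Count packets per fixed-width bin (timestamps and bin_width in ms)."""
    if not timestamps:
        return []
    start = min(timestamps)
    n_bins = int((max(timestamps) - start) / bin_width) + 1
    counts = [0] * n_bins
    for t in timestamps:
        counts[int((t - start) / bin_width)] += 1
    return counts


def on_off_periods(timestamps, gap_threshold):
    """Split sorted arrival times into ON-period and OFF-period lengths.

    Assumption: a gap larger than gap_threshold ends the current ON period.
    """
    on, off = [], []
    if not timestamps:
        return on, off
    burst_start = prev = timestamps[0]
    for t in timestamps[1:]:
        if t - prev > gap_threshold:
            on.append(prev - burst_start)  # length of the burst that just ended
            off.append(t - prev)           # idle time before the next burst
            burst_start = t
        prev = t
    on.append(prev - burst_start)          # close the final burst
    return on, off


arrivals_ms = [0, 1, 2, 50, 51, 120]
fine = bin_counts(arrivals_ms, 15)
coarse = bin_counts(arrivals_ms, 100)
on_times, off_times = on_off_periods(arrivals_ms, 10)
```

Comparing `fine` and `coarse` shows what the "ON-OFF persists" bullet is checking: whether the bursts are still visible after widening the bins.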

14 Analyzing Packet Traces
What is the arrival process?
  Matlab curve fitting (least mean square)
  Weibull, log-normal, Pareto, exponential
Curve fits log-normal for the 3 distributions
  Inter-arrival times, ON times, OFF times
All switches exhibit identical patterns
Different from Pareto (WAN) traffic
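
The slides fit candidate distributions with Matlab least-mean-square curve fitting. The sketch below swaps in a different technique, maximum-likelihood fitting with the Python standard library, only to show how two candidates (log-normal and exponential) can be compared on the same samples. The synthetic data and all function names are assumptions for illustration.

```python
import math
import random
import statistics


def fit_lognormal(samples):
    """Maximum-likelihood (mu, sigma) of a log-normal for positive samples."""
    logs = [math.log(x) for x in samples]
    return statistics.fmean(logs), statistics.pstdev(logs)


def lognormal_loglik(samples, mu, sigma):
    """Log-likelihood of the samples under a log-normal(mu, sigma)."""
    return sum(
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
        for x in samples
    )


def fit_exponential(samples):
    """Maximum-likelihood rate of an exponential distribution."""
    return 1.0 / statistics.fmean(samples)


def exponential_loglik(samples, rate):
    """Log-likelihood of the samples under an exponential(rate)."""
    return sum(math.log(rate) - rate * x for x in samples)


# Synthetic inter-arrival times drawn from a known log-normal (not trace data).
rng = random.Random(0)
gaps = [rng.lognormvariate(0.0, 1.5) for _ in range(5000)]

mu, sigma = fit_lognormal(gaps)
ll_lognormal = lognormal_loglik(gaps, mu, sigma)
ll_exponential = exponential_loglik(gaps, fit_exponential(gaps))
best = "log-normal" if ll_lognormal > ll_exponential else "exponential"
```

The same comparison would be repeated for ON times and OFF times, the other two distributions named on the slide.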

15 Data Center Traffic Generator
Based on our insights we created a traffic generator
Goal: produce a stream of packets that exhibits an ON-OFF arrival pattern
Input: distribution of traffic volumes and loss rates from SNMP pulls for a link
Output: the parameters for a fine-grained arrival process that will produce the input distribution
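
To make the goal concrete, here is a minimal sketch of the forward step only: given log-normal parameters for ON times, OFF times, and inter-arrival gaps, emit packet arrival times with an ON-OFF pattern. It does not reproduce the authors' generator, which works backwards from SNMP distributions to parameters. The function name, the parameter layout, and the example values are assumptions.

```python
import random


def generate_arrivals(duration, on_params, off_params, gap_params, seed=None):
    """Return packet arrival times in [0, duration), in seconds.

    Each *_params argument is a (mu, sigma) pair for a log-normal distribution.
    """
    rng = random.Random(seed)
    arrivals = []
    now = 0.0
    while now < duration:
        on_end = now + rng.lognormvariate(*on_params)
        t = now
        # ON period: emit packets separated by log-normal gaps.
        while t < min(on_end, duration):
            arrivals.append(t)
            t += rng.lognormvariate(*gap_params)
        # OFF period: stay silent for a log-normal idle time.
        now = on_end + rng.lognormvariate(*off_params)
    return arrivals


trace = generate_arrivals(
    duration=10.0,
    on_params=(0.0, 0.5),
    off_params=(0.0, 0.5),
    gap_params=(-3.0, 0.5),
    seed=1,
)
```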

16 Data Center Traffic Generator
Approach
  Search the space of available parameters
  Simulate each set of parameters
  Accept parameters that pass a similarity test with high confidence
  Wilcoxon used for the similarity test
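
The search loop above can be sketched as follows. This is an illustration under stated assumptions. It uses the rank-sum (Mann-Whitney) form of the Wilcoxon test with a normal approximation and no tie correction, and it accepts a candidate when the two-sided p-value exceeds a threshold `alpha`. The slides give neither the test variant nor the threshold, and `simulate` is a hypothetical callback standing in for the simulator.

```python
import math


def rank_sum_p_value(a, b):
    """Two-sided p-value of the Wilcoxon rank-sum test (normal approximation)."""
    combined = sorted((v, i) for i, v in enumerate(list(a) + list(b)))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share their average 1-based rank
        for k in range(i, j + 1):
            ranks[combined[k][1]] = average_rank
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(ranks[:n1])                  # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def search_parameters(candidates, observed, simulate, alpha=0.05):
    """Return the candidates whose simulated output resembles the observed data."""
    accepted = []
    for params in candidates:
        simulated = simulate(params)
        if rank_sum_p_value(observed, simulated) > alpha:
            accepted.append(params)
    return accepted


observed_volumes = list(range(30))
accepted = search_parameters(
    candidates=[0, 100],
    observed=observed_volumes,
    simulate=lambda shift: [x + shift for x in range(30)],
)
```

In this toy run the candidate is just a shift applied to a fixed sequence. In practice `simulate` would run the ON-OFF process for a parameter set and return per-interval traffic volumes comparable to the SNMP data.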

17 Sharing Insights
Implications for research and operations
  Evaluate designs with traffic generator
Implications for Fat-tree
  Fat-tree: congestion eliminated through no over-provisioning and traffic balancing
  Parameterization: traffic engineering and flow classification assume stability on the order of T seconds
  Our work can inform the setting of T

18 Conclusion
Analyzed traffic from 19 data centers
  Bottleneck: aggregation layer
  Characterized arrival process at edge links
Described a traffic generator for data centers
  Utilized for evaluation of data center designs
Future work
  Analyze packet traces
  Stability of traffic matrix
  Ratio of inter-/intra-DC communication

19 Questions?

