1 Internet traffic measurement: from packets to insight
George Varghese (based on Cristi Estan’s work), University of California, San Diego, May 2011

2 Research motivation
The Internet in 1969 vs. the Internet today:
- Problems: flexibility, speed, scalability then; overloads, attacks, failures now
- Measurement & control: ad-hoc solutions sufficed then; engineered solutions are needed now
Research direction: towards a theoretical foundation for systems doing engineered measurement of the Internet

3 Current solutions
[Diagram: a router on a fast link, with limited memory, exports raw data across the network to an analysis server, which produces traffic reports for the network operator]
State of the art: simple counters (SNMP), time series plots of traffic (MRTG), sampled packet headers (NetFlow), top k reports. Concise? Accurate?

4 Measurement challenges
Data reduction – performance constraints:
- Memory (terabytes of data each hour)
- Link speeds (40 Gbps links)
- Processing (8 ns to process a packet)
Data analysis – unpredictability:
- Unconstrained service model (e.g. Napster, Kazaa)
- Unscrupulous agents (e.g. Slammer worm)
- Uncontrolled growth (e.g. user growth)
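
The 8 ns processing budget follows directly from the link speed; a quick sanity check, assuming back-to-back minimum-size 40-byte packets:

```latex
\[
  \frac{40\ \text{bytes} \times 8\ \text{bits/byte}}{40 \times 10^{9}\ \text{bits/s}}
  = \frac{320\ \text{bits}}{40\ \text{Gbps}} = 8\ \text{ns per packet}
\]
```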

5 Main contributions
Data reduction: algorithmic solutions for measurement building blocks
- Identifying heavy hitters (part 1 of talk)
- Counting flows or distinct addresses
Data analysis: traffic cluster analysis automatically finds the dominant modes of network usage (part 2 of talk)
- AutoFocus traffic analysis system used by hundreds of network administrators

6 Identifying heavy hitters
[Diagram: the measurement pipeline from slide 3, now highlighting the router stage – identifying heavy hitters with multistage filters]

7 Why are heavy hitters important?
- Network monitoring: current tools report the top applications and the top senders/receivers of traffic
- Security: malicious activities such as worms and flooding DoS attacks generate much traffic
- Capacity planning: the largest elements of the traffic matrix determine network growth trends
- Accounting: usage-based billing matters most for the most active customers

8 Problem definition
Identify and measure all streams whose traffic exceeds a threshold (e.g. 0.1% of link capacity) over a certain time interval (e.g. 1 minute)
- Streams defined by header fields (e.g. destination IP)
- Single pass over the packets
- Small worst-case per-packet processing
- Small memory usage
- Few false positives / false negatives

9 Measuring the heavy hitters
- Unscalable solution: keep a hash table with a counter for each stream and report the largest entries
- Inaccurate solution: count only sampled packets and compensate in the analysis
- Ideal solution: count all packets, but only for the heavy hitters
- Our solution: identify the heavy hitters on the fly
Fundamental advantage over sampling: relative error O(1/M) instead of O(1/√M), where M is the available memory

10 Why is sample & hold better?
[Plot: uncertainty in a stream's estimated size; sample and hold shows far less uncertainty than ordinary sampling]
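
To make slides 9-10 concrete, here is a minimal Python sketch of sample and hold; the class name, the per-byte sampling probability, and the plain-dict stream memory are illustrative assumptions, not the talk's implementation:

```python
import random

class SampleAndHold:
    """Sketch: sample packets with probability proportional to their size,
    but once a stream is sampled ("held"), count all its later bytes exactly."""

    def __init__(self, byte_sampling_prob=1e-5):
        self.p = byte_sampling_prob   # chance of sampling any single byte
        self.held = {}                # stream id -> bytes counted since entry

    def process_packet(self, stream_id, size_bytes):
        if stream_id in self.held:
            self.held[stream_id] += size_bytes          # exact counting
        elif random.random() < 1 - (1 - self.p) ** size_bytes:
            # Packet sampled (larger packets are more likely to be sampled):
            # hold the stream from now on.
            self.held[stream_id] = size_bytes

    def heavy_hitters(self, threshold_bytes):
        return {s: c for s, c in self.held.items() if c >= threshold_bytes}
```

A heavy hitter is almost certain to be sampled early and counted exactly from then on, which is why its uncertainty shrinks as O(1/M) rather than the O(1/√M) of ordinary sampling.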

11 How do multistage filters work?
[Animation: each packet's stream ID (e.g. the pink stream) is hashed into an array of counters, and the hashed counter is incremented]

12 How do multistage filters work?
[Animation continues: different streams can hash to the same counter – collisions are OK, they only cause overestimates]

13 How do multistage filters work?
[Animation continues: when a stream's counter reaches the threshold (first stream1, then stream2), the stream is inserted into stream memory and counted exactly from then on]

14 How do multistage filters work?
[Animation continues: with multiple stages (Stage 1, Stage 2, ...) using independent hash functions, a stream is inserted into stream memory only after reaching the threshold in every stage]

15 Conservative update
[Animation: gray = all prior packets; each counter holds the traffic hashed into its bucket so far]

16 Conservative update
[Animation continues: incrementing every counter for each packet is partly redundant – counters already above the stream's smallest possible total carry no extra information]

17 Conservative update
[Animation concludes: add the packet only to the smallest of its counters, raising the others at most up to that new minimum]
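
Slides 11-17 combine into a short algorithm; a minimal Python sketch (the stage-salted CRC hash and the plain-dict stream memory are illustrative assumptions):

```python
import zlib

class MultistageFilter:
    """Sketch: d counter arrays with independent hashes; a stream enters
    stream memory only once its smallest counter crosses the threshold."""

    def __init__(self, stages=4, buckets=1000, threshold=1000):
        self.d, self.b, self.T = stages, buckets, threshold
        self.counters = [[0] * buckets for _ in range(stages)]
        self.stream_memory = {}      # exact byte counts for detected streams

    def _bucket(self, stage, stream_id):
        # Salting the hash with the stage index approximates
        # independent hash functions per stage.
        return zlib.crc32(f"{stage}:{stream_id}".encode()) % self.b

    def process_packet(self, stream_id, size):
        if stream_id in self.stream_memory:
            self.stream_memory[stream_id] += size       # already detected
            return
        buckets = [self._bucket(i, stream_id) for i in range(self.d)]
        new_min = min(self.counters[i][buckets[i]] for i in range(self.d)) + size
        # Conservative update: raise each counter only as far as the
        # smallest total this stream could have contributed so far.
        for i in range(self.d):
            if self.counters[i][buckets[i]] < new_min:
                self.counters[i][buckets[i]] = new_min
        if new_min >= self.T:
            self.stream_memory[stream_id] = size        # passed every stage
```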

18 Multistage filter analysis
Question: find the probability that a small stream (0.1% of traffic) passes a filter with d = 4 stages, b = 1,000 counters per stage, and threshold T = 1%
Analysis (holds for any stream distribution and packet order):
- the stream can pass a stage only if the other streams in its bucket total ≥ 0.9% of traffic
- at most 111 such buckets per stage => probability of passing one stage ≤ 11.1%
- probability of passing all 4 stages ≤ 0.111^4 ≈ 0.015%
- the bound is tight
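
The slide's numbers follow from a Markov-style argument; a sketch, treating the per-stage bounds as composable (which the "any stream distribution & packet order" claim asserts): a stream of size s < T passes a stage only if the other streams in its bucket total at least T - s, and those streams sum to at most the capacity C spread over b buckets.

```latex
\[
  \Pr[\text{pass one stage}] \le \frac{C}{b\,(T-s)}
  = \frac{100\%}{1000 \times 0.9\%} \approx 11.1\%
\qquad
  \Pr[\text{pass all } d \text{ stages}] \le
  \left(\frac{C}{b\,(T-s)}\right)^{d} = 0.111^{4} \approx 0.015\%
\]
```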

19 Multistage filter analysis results
Notation: d – filter stages; T – threshold; h = C/T (C – link capacity); k = b/h (b – buckets per stage); n – number of streams; M – total memory
[Table: the probability of passing the filter, the number of streams passing, and the relative error, expressed in this notation; the formulas were images and are not preserved in the transcript]

20 Bounds versus actual filtering
[Plot: average probability of a small stream passing the filter (log scale, from 1 down to 0.00001) versus the number of stages (1 to 4); curves from highest to lowest: worst case bound, Zipf bound, actual filtering, conservative update]

21 Comparing to current solution
Trace: 2.6 Gbps link, 43,000 streams in 5 seconds
Multistage filters: 1 Mbit of SRAM (4,096 entries); sampling: p = 1/16, unlimited DRAM
Average absolute error / average stream size:
- stream size s > 0.1%: multistage filters 0.01%, sampling 5.72%
- 0.1% ≥ s > 0.01%: multistage filters 0.95%, sampling 20.8%
- 0.01% ≥ s > 0.001%: multistage filters 39.9%, sampling 46.6%

22 Summary for heavy hitters
- Heavy hitters are important for many measurement processes
- More accurate results than random sampling: relative error O(1/M) instead of O(1/√M)
- Multistage filters with conservative update outperform the theoretical bounds
- Prototype implemented at 10 Gbps

23 Building block 2: counting streams
Core idea:
- Hash streams to a bitmap and count the bits set
- Sample the bitmap to save memory and scale
- Multiple scaling factors to cover wide ranges
Result: can count up to 100 million streams with an average error of 1% using 2 Kbytes of memory
[Diagram: three bitmaps, accurate for 16-32 streams, 8-15 streams, and 0-7 streams respectively]

24 Bitmap counting
Hash each packet to a bit based on its flow identifier; estimate the flow count from the number of bits set
Problem: does not work if there are too many flows
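
A minimal sketch of the direct bitmap, using the standard linear-counting estimator (the bitmap size, CRC hash, and byte-per-bit storage are illustrative choices):

```python
import math
import zlib

class DirectBitmap:
    """Sketch: hash each flow id to one bit; all packets of a flow hit
    the same bit, so the bitmap records distinct flows, not packets."""

    def __init__(self, num_bits=8192):
        self.b = num_bits
        self.bits = bytearray(num_bits)   # one byte per bit, for clarity

    def add(self, flow_id):
        self.bits[zlib.crc32(flow_id.encode()) % self.b] = 1

    def estimate(self):
        zeros = self.bits.count(0)
        if zeros == 0:
            return float("inf")           # saturated: too many flows
        # Linear counting: E[zero bits] = b * exp(-flows/b), inverted here.
        return self.b * math.log(self.b / zeros)
```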

25 Bitmap counting
Fix: increase the bitmap size
Problem: the bitmap takes too much memory

26 Bitmap counting
Fix: store only a sample of the bitmap and extrapolate
Problem: too inaccurate if there are few flows

27 Bitmap counting
Fix: use multiple bitmaps, each accurate over a different range (16-32, 8-15, 0-7 flows)
Problem: must update multiple bitmaps for each packet

28 Bitmap counting
[Diagram: the three bitmaps (16-32, 8-15, 0-7 flows) side by side]

29 Bitmap counting
Fix: a single multiresolution bitmap, accurate over the whole 0-32 range, with one update per packet
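
A much-simplified sketch of the multiresolution idea; the geometric level selection and the "first level with headroom" estimator below are illustrative simplifications, not the paper's exact estimator:

```python
import math
import zlib

class MultiresolutionBitmap:
    """Sketch: one hash picks both a resolution level (level i sees
    roughly a 2^-(i+1) sample of all flows) and a bit within that level,
    so each packet touches exactly one bit and some level stays accurate."""

    def __init__(self, levels=8, bits_per_level=64):
        self.levels, self.b = levels, bits_per_level
        self.maps = [bytearray(bits_per_level) for _ in range(levels)]

    def _sampling_prob(self, level):
        # Levels 0..L-2 sample with probability 2^-(level+1); the last
        # level catches the remaining tail, probability 2^-(L-1).
        return 2.0 ** -min(level + 1, self.levels - 1)

    def add(self, flow_id):
        h = zlib.crc32(flow_id.encode())
        level = 0                 # leading one-bits give a geometric choice
        while level < self.levels - 1 and (h >> level) & 1:
            level += 1
        self.maps[level][(h >> self.levels) % self.b] = 1

    def estimate(self):
        for level in range(self.levels):
            zeros = self.maps[level].count(0)
            if zeros > 0.1 * self.b:   # enough headroom for linear counting
                n_here = self.b * math.log(self.b / zeros)
                return n_here / self._sampling_prob(level)
        return float("inf")            # every level saturated
```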

30 Future work

31 Traffic cluster analysis
[Diagram: the measurement pipeline again, now highlighting the analysis server]
Part 1: identifying heavy hitters, counting streams
Part 2: describing traffic with traffic cluster analysis

32 Finding heavy hitters is not enough
Top destination IPs: 1. jeff.dorm.bigU.edu 11.9%; 2. lisa.dorm.bigU.edu 3.12%; 3. risc.cs.bigU.edu 2.83% (most traffic goes to the dorms…)
Top destination networks: 1. library.bigU.edu 27.5%; 2. cs.bigU.edu 18.1%; 3. dorm.bigU.edu 17.8%
Top source IPs: 1. forms.irs.gov 13.4%; 2. ftp.debian.org 5.78%; 3. www.cnn.com 3.25% (where does the traffic come from?)
Top source networks: 1. att.com 25.4%; 2. yahoo.com 15.8%; 3. badU.edu 12.2%
Top applications: 1. web 42.1%; 2. ICMP 12.5%; 3. kazaa 11.5% (what apps are used? which network uses web and which one kazaa?)
Aggregating on individual fields is useful, but:
- traffic reports are often not at the right granularity
- they cannot show aggregates over multiple fields
A traffic analysis tool should automatically find aggregates over the right fields at the right granularity

33 Ideal traffic report
- Web traffic 42.1% – web is the dominant application
- Web traffic to library.bigU.edu 26.7% – the library is a heavy user of web
- Web traffic from forms.irs.gov 13.4% – that's a big flash crowd!
- ICMP from sloppynet.badU.edu to jeff.dorm.bigU.edu 11.9% – this is a denial of service attack!!
Traffic cluster reports try to give insight into the structure of the traffic mix

34 Definition
A traffic report gives the size of all traffic clusters above a threshold T and is:
- Multidimensional: clusters defined by ranges from a natural hierarchy for each field
- Compressed: omits clusters whose traffic is within error T of more specific clusters in the report
- Prioritized: clusters carry unexpectedness labels

35 Unidimensional report example
Threshold = 100
[Diagram: the prefix hierarchy under 10.0.0.0/28 (spanning the AI Lab, 2nd floor, and CS Dept subnets) with per-host traffic at the leaves – 10.0.0.2: 15, 10.0.0.3: 35, 10.0.0.4: 30, 10.0.0.5: 40, 10.0.0.8: 160, 10.0.0.9: 110, 10.0.0.10: 35, 10.0.0.14: 75 – and aggregates at internal prefixes, e.g. 10.0.0.8/31: 270, 10.0.0.8/30: 305, 10.0.0.0/29: 120, 10.0.0.8/29: 380, 10.0.0.0/28: 500]

36 Unidimensional report example
Compression rule: omit clusters with traffic within error T of the more specific clusters in the report
- 10.0.0.8/30 (305) is omitted: 305 - 270 < 100
- 10.0.0.8/29 (380) is kept: 380 - 270 ≥ 100
Final report: 10.0.0.0/29 – 120; 10.0.0.8/29 – 380; 10.0.0.8 – 160; 10.0.0.9 – 110
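
The compression rule can be replayed on the slide's numbers; a Python sketch (doing the prefix arithmetic over the last octet only is an illustrative simplification):

```python
THRESHOLD = 100

# Leaf traffic per host (last octet of 10.0.0.x), from slide 35.
leaves = {2: 15, 3: 35, 4: 30, 5: 40, 8: 160, 9: 110, 10: 35, 14: 75}

def covers(cluster, other):
    """True if `other` is a strictly more specific prefix inside `cluster`."""
    (base, plen), (b2, p2) = cluster, other
    return p2 > plen and base <= b2 < base + (1 << (32 - plen))

def clusters_above_threshold():
    """Sizes of all prefixes 10.0.0.base/plen with traffic >= THRESHOLD."""
    sizes = {}
    for plen in range(28, 33):
        span = 1 << (32 - plen)
        for base in range(0, 16, span):
            total = sum(v for ip, v in leaves.items() if base <= ip < base + span)
            if total >= THRESHOLD:
                sizes[(base, plen)] = total
    return sizes

def compress(sizes):
    """Slide 36's rule: omit a cluster whose traffic is within THRESHOLD
    of what its kept, more specific clusters already explain."""
    report = {}
    for cluster in sorted(sizes, key=lambda c: -c[1]):   # most specific first
        # Count only *maximal* kept sub-clusters so traffic is not
        # double counted (e.g. a /32 under an already kept /31).
        maximal = [c for c in report if covers(cluster, c)
                   and not any(covers(between, c) for between in report
                               if covers(cluster, between))]
        explained = sum(report[c] for c in maximal)
        if sizes[cluster] - explained >= THRESHOLD:
            report[cluster] = sizes[cluster]
    return report

for (base, plen), traffic in sorted(compress(clusters_above_threshold()).items()):
    print(f"10.0.0.{base}/{plen}: {traffic}")
```

Run as-is, this prints exactly the four clusters in the slide's final report.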

37 Multidimensional structure
[Diagram: two field hierarchies – source net (All traffic → US, EU → CA, NY, FR, RU) and application (All traffic → Web, Mail) – whose combinations form a lattice of multidimensional clusters, e.g. RU Mail sits below both RU and Mail, next to RU Web]

38 AutoFocus: system structure
[Diagram: a packet header trace / NetFlow data feeds the traffic parser; the cluster miner and grapher, configured with category and name definitions, drive a web-based GUI]

39 Traffic reports for weeks, days, three-hour intervals and half-hour intervals

40

41 Colors: user-defined traffic categories; separate reports for each category

42 Analysis of unusual events: the Sapphire/SQL Slammer worm
- Found the worm's port and protocol automatically

43 Analysis of unusual events: the Sapphire/SQL Slammer worm
- Identified the infected hosts

44 Related work
- Databases [FS+98], iceberg queries: limited analysis, no conservative update
- Theory [GM98, CCF02], synopses, sketches: less accurate than multistage filters
- Data mining [AIS93], association rules: no/limited hierarchy, no compression
- Databases [GCB+97], data cube: no automatic generation of "interesting" clusters

