Pegasus: Precision Hunting for Icebergs and Anomalies in Network Flows Sriharsha Gangam 1, Puneet Sharma 2, Sonia Fahmy 1 1 Purdue University, 2 HP Labs.


1 Pegasus: Precision Hunting for Icebergs and Anomalies in Network Flows
Sriharsha Gangam (1), Puneet Sharma (2), Sonia Fahmy (1)
(1) Purdue University, (2) HP Labs
This research has been sponsored in part by GENI project 1723, and by Hewlett-Packard.

2 Passive Flow Monitoring
Observe and collect traffic summaries to detect network congestion, attacks, faults, and anomalies, and to support traffic engineering and accounting
o E.g., InMon Traffic Sentinel [InMon] uses sFlow; Cisco's NetFlow is used in ISPs
Figure: network devices (e.g., switches) send monitoring data to a collection and analysis point.
[InMon] http://inmon.com

3 Passive Flow Monitoring - Challenges
Large overhead to collect and analyze fine-grained flow data
Increasing link speeds, network size, and traffic
o Limited CPU and memory resources at the routers
o Millions of flows in ISP networks
Current techniques?
o NetFlow sampling rate in ISPs ~ 1 in 100 (Internet2)
o sFlow packet sampling rate ~ 1 in 2000
o Application-dependent sketches
o Fine-grained information is lost

4 Will More Resources Help?
Commercial co-located compute and storage
o HP ONE Blades
o Cisco SRE Modules
Example configuration
o 2.20 GHz Core Duo processor
o 4 GB RAM, 250 GB HD
o 2x10 Gbps duplex bandwidth to switch
Storage and analysis of fine-grained flow statistics
o Distributed monitoring applications

5 Design Space
Figure: design space with axes Network Overhead, Additional Compute & Storage, and Accuracy.
Figure labels: Ideal Solution; Current Solutions: Sampling and Sketching; Our Goals: Pegasus - Accurate & low overhead monitoring; Naïve Solution; Impractical

6 Key Class of Applications
Network bottlenecks
o Top traffic destinations, sources, and links
Suspicious port scanning activity
o Sources that connect to more than 10% of hosts within time T
DDoS attack detection
o Destinations with a large number of connections or a large traffic volume
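The port-scanning criterion above can be sketched in a few lines. This is a minimal illustration only, not the Pegasus implementation: the function name `scanning_sources`, the `(timestamp, source, destination)` record layout, and the parameter names are all assumptions made for this example.

```python
from collections import defaultdict

def scanning_sources(connections, total_hosts, t_start, window, fraction=0.10):
    """Return sources that contact more than `fraction` of all hosts
    within the window [t_start, t_start + window).

    `connections` is an iterable of (timestamp, source, destination)
    tuples; this record layout is hypothetical.
    """
    seen = defaultdict(set)  # source -> distinct destinations in the window
    for ts, src, dst in connections:
        if t_start <= ts < t_start + window:
            seen[src].add(dst)
    limit = fraction * total_hosts
    return sorted(src for src, dsts in seen.items() if len(dsts) > limit)

# Hypothetical records: 20 hosts in the network, window of 60 time units.
records = [
    (1, "a", "h1"), (2, "a", "h2"), (3, "a", "h3"),   # 3 distinct hosts
    (1, "b", "h1"), (5, "b", "h2"),                   # 2 distinct hosts
    (1, "c", "h1"), (2, "c", "h2"), (99, "c", "h3"),  # third is outside the window
]
suspects = scanning_sources(records, total_hosts=20, t_start=0, window=60)
```

With 20 hosts the limit is 2 distinct destinations, so only a source that reaches 3 or more within the window is reported.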

7 Global Iceberg Detection
Items with aggregate count exceeding a threshold (S × θ)
o Global heavy hitters
Observations at any single switch/router may not be significant or interesting
o E.g., DDoS attack
Figure: network devices (e.g., switches) send monitoring data to answer "Items contributing > 1% (θ) of traffic?"
Per-device item counts shown in the figure:
Device | Item: count
1      | h1: 20, h2: 10, h4: 50, …
2      | h2: 60, h3: 15, h4: 20, …
3      | h3: 50, h5: 10, h6: 30, …
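As a point of reference, the definition above corresponds to the naïve centralized computation: collect every per-device count, sum per item, and compare against S × θ, where S is taken here to be the total count over all items (an assumption, since the slide does not define S). The function name and the value θ = 0.25 are chosen only for illustration; the counts follow one reading of the figure.

```python
from collections import Counter

def global_icebergs(per_device_counts, theta):
    """Return {item: aggregate_count} for items whose aggregate count
    exceeds theta * S, with S the total count across all devices."""
    total = Counter()
    for counts in per_device_counts:
        total.update(counts)          # adds counts item by item
    threshold = theta * sum(total.values())
    return {item: c for item, c in total.items() if c > threshold}

# Illustrative per-device counts (one reading of the figure).
devices = [
    {"h1": 20, "h2": 10, "h4": 50},
    {"h2": 60, "h3": 15, "h4": 20},
    {"h3": 50, "h5": 10, "h6": 30},
]
icebergs = global_icebergs(devices, theta=0.25)   # S = 265, threshold = 66.25
```

Here h4 crosses the threshold only after aggregation (50 + 20 = 70), which is why a purely local view at one device can miss a global iceberg.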

8 Online Iceberg Detection with Pegasus
Reduce communication overhead
o Additional compute and storage
Precisely detect all global icebergs
o Zero false positives and false negatives
Feedback-based iterative approach
Figure labels: High precision; Iterative solution

9 Comparison of Different Approaches
Setting: monitors at network devices (e.g., switches) report to a monitoring data collection and analysis point (aggregator).
o Naïve approach: prohibitively large
o Sampling and sketching: lossy summary; false +ves and -ves
o Pegasus: lossy summary (sketch-sets) plus fine-grained data on demand; no false +ves or -ves
Per-device item counts shown in the figure:
Device | Item: count
1      | i1: 20, i2: 10, i4: 50, …
2      | i2: 60, i3: 15, i4: 20, …
3      | i3: 50, i5: 10, i6: 30, …

10 1-D Sketch-set Representation
Sketch-set: summary representation of a collection of flows; supports set operations
Example: destination IPs receiving more than 200 packets
Flows at the monitor (Destination IP, Packet Count):
128.41.10.10, 15
128.41.10.20, 20
128.41.10.30, 15
128.41.10.40, 30
128.41.10.50, 25
128.41.10.110, 110
128.41.10.150, 100
128.41.10.210, 300
Coarse sketch-set generation (parameters α and β in the figure) produces coarse-grained sketch-sets (startIP, endIP, minPkt, maxPkt):
128.41.10.10, 128.41.10.50, 15, 30
128.41.10.110, 128.41.10.150, 100, 110
128.41.10.210, 128.41.10.210, 300, 300
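One way to picture coarse sketch-set generation is a single pass over the flows sorted by destination IP, merging neighbors into a (startIP, endIP, minPkt, maxPkt) tuple. The merge rule below is an assumption made for this sketch, not the rule from the slides: a flow joins the current sketch-set if it is within `max_gap` addresses of the set's end and the resulting maxPkt - minPkt stays within `max_spread`. The function and parameter names are hypothetical, and no mapping to the slide's α and β is implied.

```python
from ipaddress import IPv4Address

def coarse_sketch_sets(flows, max_gap, max_spread):
    """Group (ip, pkts) flows into (start_ip, end_ip, min_pkt, max_pkt)
    sketch-sets using an assumed gap/spread merge rule."""
    sets = []
    for ip, pkts in sorted(flows, key=lambda f: int(IPv4Address(f[0]))):
        if sets:
            start, end, lo, hi = sets[-1]
            gap = int(IPv4Address(ip)) - int(IPv4Address(end))
            new_lo, new_hi = min(lo, pkts), max(hi, pkts)
            if gap <= max_gap and new_hi - new_lo <= max_spread:
                sets[-1] = (start, ip, new_lo, new_hi)  # extend current set
                continue
        sets.append((ip, ip, pkts, pkts))               # start a new set
    return sets

# Flows from the slide's example.
flows = [
    ("128.41.10.10", 15), ("128.41.10.20", 20), ("128.41.10.30", 15),
    ("128.41.10.40", 30), ("128.41.10.50", 25), ("128.41.10.110", 110),
    ("128.41.10.150", 100), ("128.41.10.210", 300),
]
coarse = coarse_sketch_sets(flows, max_gap=40, max_spread=15)
```

With these illustrative parameter values the eight flows collapse into the three coarse-grained sketch-sets listed on the slide; other parameter choices would yield a different grouping.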

11 Example
Coarse-grained sketch-sets (startIP, endIP, minPkt, maxPkt) sent to the aggregator by Monitor 2 and Monitor 1 (lists in transcript order):
First list:
128.41.10.35, 128.41.10.70, 10, 35
128.41.10.100, 128.41.10.120, 90, 130
Second list:
128.41.10.10, 128.41.10.50, 15, 30
128.41.10.110, 128.41.10.150, 100, 110
128.41.10.210, 128.41.10.210, 300, 300
The aggregator applies INTERSECTION and SUBTRACTION to obtain disjoint sketch-sets:
Non-icebergs:
128.41.10.10, 128.41.10.34, 15, 30
128.41.10.35, 128.41.10.50, 10, 65
128.41.10.51, 128.41.10.70, 10, 35
128.41.10.100, 128.41.10.109, 90, 130
128.41.10.121, 128.41.10.150, 100, 110
Uncertain (query monitors):
128.41.10.110, 128.41.10.120, 90, 240
Iceberg:
128.41.10.210, 128.41.10.210, 300, 300
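The aggregator step in this example can be sketched as an interval sweep. The bound rule used here is inferred from the numbers on the slide rather than stated in it: for each disjoint range, minPkt is the smallest minPkt among the covering sketch-sets (a flow may exist at only one monitor) and maxPkt is the sum of their maxPkt values. The function names are hypothetical, and the sketch assumes each monitor's own sketch-sets do not overlap one another.

```python
from ipaddress import IPv4Address

def disjoint_sketch_sets(list_a, list_b):
    """Split two monitors' sketch-sets into disjoint IP ranges with
    combined (min, max) packet bounds, under the inferred bound rule."""
    sets = [(int(IPv4Address(s)), int(IPv4Address(e)), lo, hi)
            for s, e, lo, hi in list_a + list_b]
    # Breakpoints: every range start and the address just past every range end.
    cuts = sorted({s for s, _, _, _ in sets} | {e + 1 for _, e, _, _ in sets})
    out = []
    for left, right in zip(cuts, cuts[1:]):
        cover = [(lo, hi) for s, e, lo, hi in sets if s <= left and right - 1 <= e]
        if cover:
            out.append((str(IPv4Address(left)), str(IPv4Address(right - 1)),
                        min(lo for lo, _ in cover), sum(hi for _, hi in cover)))
    return out

def classify(sketch_set, threshold):
    """Label a sketch-set against a 'more than threshold packets' query."""
    _, _, lo, hi = sketch_set
    if lo > threshold:
        return "iceberg"
    if hi <= threshold:
        return "non-iceberg"
    return "uncertain"   # bounds straddle the threshold: query the monitors

first = [("128.41.10.35", "128.41.10.70", 10, 35),
         ("128.41.10.100", "128.41.10.120", 90, 130)]
second = [("128.41.10.10", "128.41.10.50", 15, 30),
          ("128.41.10.110", "128.41.10.150", 100, 110),
          ("128.41.10.210", "128.41.10.210", 300, 300)]
pieces = disjoint_sketch_sets(first, second)
labels = [classify(p, threshold=200) for p in pieces]
```

Only the range 128.41.10.110 to 128.41.10.120 ends up uncertain, so it is the only range the aggregator would need to query at finer granularity.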

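12 Example… Query Response
Aggregator query: (128.41.10.110, 128.41.10.120)
Each monitor looks up the relevant flows and generates sketch-sets at finer granularity (Monitor 2 and Monitor 1, in transcript order).
Relevant flows, first monitor:
128.41.10.110, 90
128.41.10.120, 130
Relevant flows, second monitor:
128.41.10.110, 110
Finer-granularity sketch-sets, first monitor:
128.41.10.110, 128.41.10.110, 90, 90
128.41.10.120, 128.41.10.120, 130, 130
Finer-granularity sketch-sets, second monitor:
128.41.10.110, 128.41.10.110, 110, 110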

13 Example… Query Response
Aggregator query: (128.41.10.110, 128.41.10.120)
Responses from Monitor 2 and Monitor 1 (in transcript order):
First monitor:
128.41.10.110, 128.41.10.110, 90, 90
128.41.10.120, 128.41.10.120, 130, 130
Second monitor:
128.41.10.110, 128.41.10.110, 110, 110
Fine-grained sketch-sets at the aggregator:
128.41.10.110, 128.41.10.110, 200, 200
128.41.10.120, 128.41.10.120, 130, 130
Figure labels at the aggregator: Non-icebergs; Iceberg
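The final merge in this example amounts to adding up exact per-IP counts once every response sketch-set covers a single IP with minPkt equal to maxPkt. The snippet below is a minimal sketch of that step only; the function name `merge_exact` is hypothetical, and the threshold of 200 packets is taken from the running example.

```python
from collections import Counter
from ipaddress import IPv4Address

def merge_exact(responses):
    """Sum exact single-IP sketch-sets from several monitors into
    fine-grained (ip, ip, total, total) sketch-sets."""
    totals = Counter()
    for sketch_sets in responses:
        for start, end, lo, hi in sketch_sets:
            if start != end or lo != hi:
                raise ValueError("expected single-IP sketch-sets with exact counts")
            totals[start] += lo
    ordered = sorted(totals.items(), key=lambda kv: int(IPv4Address(kv[0])))
    return [(ip, ip, n, n) for ip, n in ordered]

responses = [
    [("128.41.10.110", "128.41.10.110", 90, 90),
     ("128.41.10.120", "128.41.10.120", 130, 130)],
    [("128.41.10.110", "128.41.10.110", 110, 110)],
]
merged = merge_exact(responses)
icebergs = [s for s in merged if s[2] > 200]   # "more than 200 packets"
```

For the queried range, 128.41.10.110 totals exactly 200 packets, which does not exceed the threshold, so neither IP in the range is reported as an iceberg.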

14 Evaluation Methodology
Abilene trace
o NetFlow records: 11 sites with 1 in 100 sampling for 5 min
o Add small flows to undo the effect of sampling (90% of flows contribute to 20% of traffic, ~ 758K unique flow records)
o Trace is used in [Huang11]
Enterprise network sFlow trace
o sFlow records: 249 switches, 1 in 2000 sampling for a week
o Undo the effect of sampling by adding flows
PlanetLab's outgoing traffic
o NetFlow records generated at each PlanetLab host
[Huang11] G. Huang, A. Lall, C. Chuah, and J. Xu. Uncovering global icebergs in distributed streams: Results and implications. J. Netw. Syst. Manage., 19:84–110, March 2011

15 Comparison with Sample-sketch
Sends sampled monitoring data and sketches to the aggregator for iceberg detection
Uses two main parameters
o Sampling interval
o Sketch threshold
Difficult to decide the parameters
Can have false positives and false negatives
[Huang11] G. Huang, A. Lall, C. Chuah, and J. Xu. Uncovering global icebergs in distributed streams: Results and implications. J. Netw. Syst. Manage., 19:84–110, March 2011

16 Abilene Trace
Communication overhead for the 5 min trace, θ = 0.08:
o Naïve solution: ≈ 7.63 MB
o Pegasus: ≈ 8 KB
o Sample-sketch: ≈ 36 KB
Pegasus has lower communication overhead
Figure: two plots, each with a θ axis label; annotation "Larger is better".

17 Monitoring Outgoing PlanetLab Traffic
Example of an end-host monitoring system
Detect accidental attacks and anomalies originating from PlanetLab
Existing monitoring service: PlanetFlow
o Decouples collection from analysis
o Collects 1 TB of data every month [PF] (naïve approach)
Figure: monitors on PlanetLab nodes send NetFlow records generated from outgoing traffic to the aggregator.
[PF] http://www.cs.princeton.edu/~sapanb/planetflow2/

18 Pegasus PlanetLab Service
PlanetLab's outgoing traffic
o NetFlow records of ~250 PlanetLab nodes
o Online global iceberg detection service
Global iceberg detection for
o Flow identifier: Destination IP, Source Port, Destination Port
o Flow size: Packet count

19 Pegasus PlanetLab Service
15-hour deployment
o Pegasus: 403 MB, Naïve: 2.26 GB
Most outbound traffic goes to other PlanetLab hosts
1-day outgoing traffic: CoDNS and CoDeeN don't produce many icebergs
Source Port Icebergs            | Destination Port Icebergs
3 (CompressNET), 8 (unassigned) | 0 (Reserved)
22 (SSH)                        | 53 (DNS)
80 (HTTP)                       | 80 (HTTP), 443 (HTTPS)

20 Conclusions
Pegasus: a distributed measurement system
o Commercial co-located compute and storage devices
o Low network overhead
o High accuracy
Adaptive aggregation for global iceberg detection
o Iterative feedback solution
Experiments with real traces and a PlanetLab deployment
o Low overhead without false +ves and -ves

21 Thank you
Questions?

22 Anomaly Examples
Based on traffic features [Kind09]
[Kind09] Histogram-Based Traffic Anomaly Detection. In IEEE Trans. on Network and Service Management

23 Related Work
Threshold Algorithm (TA) [Fagin03]
o Large number of iterations
Three phase uniform threshold (TPUT) [Cao04]
o Accounting data distributions [Yu05]
Filtering-based continuous monitoring algorithms [Babcock03] [Keralapura06] [Olston03]
o Send update to aggregator when local arithmetic constraints fail
[Yu05] Efficient processing of distributed top-k queries. In Proc. of DEXA, 2005
[Cao04] Efficient Top-K Query Calculation in Distributed Networks. In Proc. of PODC, 2004
[Fagin03] Optimal aggregation algorithms for middleware. Jour. of Comp. and Sys. Sciences, 2003
[Babcock03] Distributed top-k monitoring. In Proc. of SIGMOD, 2003
[Keralapura06] Communication-efficient distributed monitoring of thresholded counts. In Proc. of SIGMOD, 2006
[Olston03] Adaptive filters for continuous queries over distributed data streams. In Proc. of SIGMOD, 2003

24 Sketch-set Granularity - G
High granularity ⇒ more precise, more expensive representation
Granularity definition: maxSize – minSize
Used to determine whether more flows should be combined into a sketch-set
Used to send finer-granularity sketch-sets in a monitor's response (for convergence)

25 Iterative Feedback Algorithm

26 Abilene Trace
β has little influence on the communication cost

27 Enterprise Network sFlow Trace
All parameter pairs except one (green) have false positives and negatives
Figure annotation: "Larger is better"

28 Scalability with Number of Monitors

29 Scalability with Number of Monitors
Figure: two plots, sFlow trace and Abilene trace; annotation "Larger is better".
