New Directions in Traffic Measurement and Accounting Cristian Estan – UCSD George Varghese - UCSD Reviewed by Michela Becchi Discussion Leaders Andrew.

Presentation transcript:

New Directions in Traffic Measurement and Accounting Cristian Estan – UCSD George Varghese - UCSD Reviewed by Michela Becchi Discussion Leaders Andrew Levine Jeff Mitchell

Michela Becchi - 4/16/2015
Outline
- Introduction
- Cisco NetFlow
- Sample and Hold & Multistage Filters
- Analytical Evaluation
- Comparison
- Measurements
- Conclusions

Introduction
- Measuring and monitoring network traffic on Internet backbones
  » Long-term traffic engineering (traffic rerouting and link upgrades)
  » Short-term monitoring (detection of hot spots and DoS attacks)
  » Accounting (usage-based pricing)
- Scalability problem
  » FixWest, MCI traces: about a million flows/hour between end-host pairs

Cisco NetFlow
- Flow: unidirectional stream of data identified by
  » Source IP address and port
  » Destination IP address and port
  » Protocol
  » TOS byte
  » Receiving (Rx) router interface
- An entry in DRAM for each flow
- Heuristics for end-of-flow detection
- Flow data exported via UDP packets from routers to a collection server for processing
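The per-flow table described on this slide can be pictured with a minimal sketch. The names `FlowKey`, `FlowRecord`, and `account` are hypothetical and chosen for illustration only; they are not Cisco's data structures, and a plain dictionary stands in for the DRAM table.

```python
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    """The seven fields the slide lists as identifying a flow (illustrative names)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int
    tos: int
    input_interface: int


class FlowRecord:
    """Per-flow counters; one record per flow, as in 'an entry for each flow'."""
    def __init__(self) -> None:
        self.packets = 0
        self.bytes = 0


def account(table: Dict[FlowKey, FlowRecord], key: FlowKey, size: int) -> FlowRecord:
    """Create the entry on the first packet of a flow, then update it on every packet."""
    record = table.setdefault(key, FlowRecord())
    record.packets += 1
    record.bytes += size
    return record
```

Because a flow is unidirectional, the reverse direction of the same connection gets its own key and its own entry, which is part of why the table grows so quickly.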

Cisco NetFlow - problems
- Processing overhead
  » Interfaces faster than OC3 (155 Mbps) are slowed down by memory cache updates
- Collection overhead
  » Collection server
  » Network connection
- NetFlow Aggregation (based on IP prefixes, ASes, ports)
  » Extra “aggregation” cache
  » Only aggregated data exported to collection server
  » Problem: large number of aggregates

Sampled NetFlow
- Sampling packets
- Per-flow information based on samples
- Problems:
  » Inaccurate (sampling and losses)
  » Memory intensive
  » Slow (DRAM needed)
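To see where the inaccuracy comes from, here is a minimal sketch of packet sampling with scaled-up estimates. It assumes deterministic 1-in-N sampling so the example is reproducible; the function name and the `(flow_id, size)` packet representation are illustrative assumptions, not NetFlow's interface.

```python
from collections import defaultdict


def sampled_byte_estimates(packets, n):
    """Estimate per-flow bytes from every n-th packet.

    packets: iterable of (flow_id, size_in_bytes) pairs (hypothetical format).
    Only sampled packets are counted; each sampled byte count is scaled by n.
    """
    sampled = defaultdict(int)
    for index, (flow_id, size) in enumerate(packets):
        if index % n == 0:  # keep packets 0, n, 2n, ...
            sampled[flow_id] += size
    return {flow_id: total * n for flow_id, total in sampled.items()}
```

With eight 100-byte packets of flow A followed by two of flow B and `n = 5`, both sampled packets belong to A, so A is overestimated and B is missed entirely.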

Idea
- “A small percentage of flows accounts for a large percentage of the traffic”
  » Algorithms for identifying large flows
- Use of SRAM instead of DRAM
- Categorize algorithms depending on:
  1. Memory size and memory references
  2. False negatives
  3. False positives
  4. Expected error in traffic estimates

Algorithms
- Sample and Hold
  » Sample to determine which flows to track
  » Update the flow entry for every subsequent packet belonging to the flow
- Multistage Filters
  » Use multiple tables of counters (stages), each indexed by a hash function computed on the flow ID
  » Different stages have independent hash functions
  » For each packet and for each stage, compute the hash of the flow ID and add the packet size to the corresponding counter
  » Add a flow to flow memory only when its counters in all stages have reached the threshold

Sample and Hold
[Figure: a stream of transmitted packets from flows F1 to F4 next to the flow memory. A packet sampled with probability 1/3 creates an entry for its flow; every later packet of that flow updates the entry.]
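The sample-and-hold behaviour in the figure can be sketched as follows. This is a minimal sketch under stated assumptions: packets are `(flow_id, size)` pairs, sampling is done per byte with probability `p` (so a packet of `size` bytes is sampled with probability `1 - (1 - p) ** size`), and the whole sampled packet is counted. The function and parameter names are illustrative.

```python
import random


def sample_and_hold(packets, p, rng=None):
    """Return a flow memory {flow_id: bytes counted since the entry was created}."""
    if rng is None:
        rng = random.Random()
    flow_memory = {}
    for flow_id, size in packets:
        if flow_id in flow_memory:
            # Hold: once a flow has an entry, every later packet updates it.
            flow_memory[flow_id] += size
        elif rng.random() < 1.0 - (1.0 - p) ** size:
            # Sample: the packet is selected, so an entry is created.
            flow_memory[flow_id] = size
    return flow_memory
```

Unlike plain packet sampling, nothing is scaled up afterwards: bytes are counted exactly from the moment the entry exists, so only the bytes sent before the first sampled packet are missed.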

Multistage Filters
[Figure sequence: each packet's flow ID is hashed (Hash(Pink), Hash(Green)) into an array of counters; collisions are OK. When a flow's counter has reached the threshold, the flow is inserted into flow memory (stream1, later stream2). The last frame shows two stages, Stage 1 and Stage 2, in front of the flow memory.]
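A parallel multistage filter along these lines can be sketched in a few lines. This is a minimal sketch, not the authors' implementation: the class name, the `(flow_id, size)` interface, and the use of a stage-salted BLAKE2 hash as a stand-in for independent hash functions are all assumptions made for illustration.

```python
import hashlib


class ParallelMultistageFilter:
    def __init__(self, depth, buckets, threshold):
        self.buckets = buckets
        self.threshold = threshold
        self.stages = [[0] * buckets for _ in range(depth)]  # one counter array per stage
        self.flow_memory = {}

    def _index(self, stage, flow_id):
        # Salting with the stage number approximates one hash function per stage.
        digest = hashlib.blake2b(f"{stage}:{flow_id}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.buckets

    def process(self, flow_id, size):
        passed = True
        for stage, counters in enumerate(self.stages):
            i = self._index(stage, flow_id)
            counters[i] += size            # add the packet size in every stage
            if counters[i] < self.threshold:
                passed = False             # one small counter is enough to block the flow
        if flow_id in self.flow_memory:
            self.flow_memory[flow_id] += size
        elif passed:
            self.flow_memory[flow_id] = size  # all stages reached the threshold
```

A flow that sends at least `threshold` bytes always drives all of its counters to the threshold, so it cannot be missed; a small flow gets in only if it collides with heavy counters in every stage.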

Parallel vs. Serial Multistage Filters
- Threshold for serial filters: T/d (d = number of stages)
- Parallel filters perform better on traces of actual traffic
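For contrast with the parallel arrangement, a serial filter can be sketched as a chain of stages, each with threshold T/d. The slide gives only the threshold; the assumption here is that a packet is handed to the next stage only once its counter in the current stage has reached T/d. Class and method names are illustrative.

```python
import hashlib


class SerialMultistageFilter:
    def __init__(self, depth, buckets, threshold):
        self.buckets = buckets
        self.stage_threshold = threshold / depth  # T/d per stage
        self.stages = [[0] * buckets for _ in range(depth)]
        self.flow_memory = {}

    def _index(self, stage, flow_id):
        # Stage-salted hash as a stand-in for independent hash functions.
        digest = hashlib.blake2b(f"{stage}:{flow_id}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.buckets

    def process(self, flow_id, size):
        passed = True
        for stage, counters in enumerate(self.stages):
            i = self._index(stage, flow_id)
            counters[i] += size
            if counters[i] < self.stage_threshold:
                passed = False
                break  # assumed rule: the packet never reaches later stages
        if flow_id in self.flow_memory:
            self.flow_memory[flow_id] += size
        elif passed:
            self.flow_memory[flow_id] = size
```

With two stages and T = 1000, a lone flow sending 100-byte packets clears stage 1 on its fifth packet and stage 2 on its ninth, which is when it enters flow memory.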

Optimizations
- Preserving entries
  » Nearly exact measurement of long-lived large flows
  » Bigger flow memory required
- Early removal
  » A threshold R < T determines which of the entries added in the current interval are kept
- Shielding
  » Do not update counters for flows already in flow memory
  » Reduces false positives
- Conservative update of the counters
  » Only the smallest counter is updated normally
  » Introduces no false negatives
  » Reduces false positives
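Preserving entries and early removal both act at the end of a measurement interval, so they can be sketched as one selection step. This is a minimal sketch under assumptions: entries created in earlier intervals are kept if they counted at least T bytes, entries created in the current interval are kept if they counted at least R bytes, and kept entries restart from zero. The function and argument names are illustrative.

```python
def entries_to_preserve(flow_memory, added_this_interval, threshold, early_removal_threshold):
    """Choose which flow-memory entries survive into the next interval.

    flow_memory: {flow_id: bytes counted in the interval that just ended}
    added_this_interval: set of flow IDs whose entries were created in that interval
    """
    kept = {}
    for flow_id, counted in flow_memory.items():
        # New entries missed some traffic before they were created, so they are
        # judged against the lower threshold R instead of T (assumed rule).
        limit = early_removal_threshold if flow_id in added_this_interval else threshold
        if counted >= limit:
            kept[flow_id] = 0  # counting restarts in the next interval
    return kept
```

A flow kept this way is counted from the first byte of the next interval, which is where the nearly exact measurement of long-lived large flows comes from.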

Conservative update of counters
[Figure sequence: the counters a packet hashes to, with gray showing all prior packets. Increasing the larger counters by the full packet size is marked as redundant.]
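Conservative update can be sketched as a rule over the d counters a packet hashes to. The slides state only that the smallest counter is updated normally; the rule assumed here for the remaining counters is that each is raised to the new value of the smallest counter only if it is currently below it. The function name is illustrative.

```python
def conservative_update(counters, size):
    """Return the new values of the counters one packet hashes to, one per stage."""
    new_minimum = min(counters) + size  # the smallest counter is updated normally
    # Larger counters already account for at least this much traffic, so adding
    # the packet to them in full would be redundant.
    return [max(value, new_minimum) for value in counters]
```

For counters `[5, 20, 9]` and a 3-byte packet, a normal update would give `[8, 23, 12]`, while the conservative update gives `[8, 20, 9]`. Every counter still reaches at least the flow's true byte count, so no false negatives are introduced.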

Analytical Evaluation
- Sample and Hold
  » Prob(false negative): (1-p)^T ~ e^(-O)
  » Best estimate for flow size s: c + 1/p
  » Upper bound for flow memory size: O*C/T
    – Preserving entries: 2*O*C/T
    – Early removal: O*C/T + C/R
- Parallel Multistage Filters
  » No false negatives
  » Prob(false positive): f(1/k)^d
  » Upper bound for flow size estimate error: f(T, 1/k)
  » Bound on memory requirement
- Where T: threshold; R: early removal threshold; p: sampling probability (O/T); c: number of bytes counted for the flow; C: link capacity; O: oversampling factor; d: filter depth; b: number of counters per stage; k: stage strength (T*b/C)
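The sample-and-hold expressions on this slide are easy to evaluate numerically. The sketch below assumes that C is expressed in bytes per measurement interval and that p = O/T; the function name and the dictionary keys are illustrative.

```python
import math


def sample_and_hold_numbers(threshold, oversampling, capacity, early_removal=None):
    """Evaluate the sample-and-hold formulas for T, O, C and (optionally) R."""
    p = oversampling / threshold                      # byte sampling probability, p = O/T
    numbers = {
        "p": p,
        "false_negative_exact": (1.0 - p) ** threshold,
        "false_negative_approx": math.exp(-oversampling),
        "size_correction": 1.0 / p,                   # added to c in the estimate c + 1/p
        "memory_basic": oversampling * capacity / threshold,
        "memory_preserving": 2 * oversampling * capacity / threshold,
    }
    if early_removal is not None:
        numbers["memory_early_removal"] = (
            oversampling * capacity / threshold + capacity / early_removal
        )
    return numbers
```

For example, `sample_and_hold_numbers(25_000, 4, 100_000_000, early_removal=12_500)` uses a threshold of 0.025% of the capacity and an oversampling factor of 4. The memory bounds come out at 16,000 entries for the basic algorithm, 32,000 with preserved entries, and 24,000 with early removal, and the probability of missing a flow at the threshold is close to e^(-4), a little under 2%.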

Comparison w/ Memory Constraint
- Assumptions:
  » Memory constraint M
  » The considered flow produces traffic zC (e.g. z=0.01)
- Observations and Conclusions:
  » Mz ~ oversampling factor
  » S&H and MF: better accuracy but more memory accesses
  » S&H and MF through SRAM, SNetflow through DRAM, as long as x is larger than the ratio of a DRAM memory access to an SRAM memory access

Comparison w/o Mem Constraint
- Observations and Conclusions:
  » By preserving entries, S&H and MF provide exact estimation for long-lived large flows
  » S&H and MF gain in accuracy by losing in memory bound (u=zC/T)
  » Memory accesses as in the case of constrained memory
  » S&H provides better accuracy for small measurement intervals => faster detection of new large flows
  » Increase in memory size => greater resource consumption

Dynamic threshold adaptation
- How to dimension the algorithms
  » Conservative bounds vs. accuracy
  » No a priori knowledge of the flow distribution
- Dynamic adaptation
  » Keep decreasing the threshold below the conservative estimate until the flow memory is nearly full
  » “Target usage” of memory
  » “Adjustment ratio” of threshold
  » For stability purposes, adjustments made across 3 intervals
- Netflow: fixed sampling rate
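One way to picture the adaptation loop is the sketch below. It is not the paper's exact rule: it assumes that memory usage is averaged over the last three intervals and that the threshold is multiplied or divided by a fixed adjustment ratio. The function name and the default values for `target_usage` and `adjustment_ratio` are hypothetical.

```python
def adapt_threshold(threshold, usage_history, target_usage=0.9, adjustment_ratio=1.05):
    """Return the threshold for the next interval.

    usage_history: fraction of flow memory in use at the end of each past
    interval, most recent last; only the last three values are considered.
    """
    recent = usage_history[-3:]
    average_usage = sum(recent) / len(recent)
    if average_usage > target_usage:
        return threshold * adjustment_ratio  # memory too full: track fewer flows
    if average_usage < target_usage:
        return threshold / adjustment_ratio  # memory to spare: track smaller flows
    return threshold
```

Looking at several intervals rather than one keeps a single bursty interval from swinging the threshold back and forth.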

Measurement setup
- 3 unidirectional traces of Internet traffic
- 3 flow definitions
- Traffic in the traces is between 13% and 17% of link capacity

Measurements
- S&H (threshold 0.025%, oversampling 4)
- MF (strength=3)
- Differences between analytical bounds and actual behavior (lightly loaded links)
- Effect of preserving entries and early removal

Measurements
- Flow IDs: 5-tuple
- Flow IDs: destination IP
- Flow IDs: ASes
- MF always better than S&H
- SNetflow better for medium flows, worse for very large ones
- AS: reduced number of flows (~entries in flow memory)

Conclusions
- Focus on identifying the large flows, which create the majority of network traffic
- Proposal of two techniques
  » Providing higher accuracy than Sampled Netflow
  » Using limited memory resources (SRAM)
- Mechanism to make the algorithms adaptable
- Analytical evaluation providing theoretical bounds
- Experimental measurements showing the validity of the proposed algorithms

Future work
- Generalize algorithms to automatically extract flow definitions for large flows
- Deepen the analysis, especially to explain the discrepancy between theory and experimental measurements
- Explore the commonalities with other research areas (e.g. data mining, architecture, compilers) where issues related to data volume and high speed also arise

The End
- Questions?

Zipf distribution
- Characteristics:
  » Few data “score” very high
  » A medium number of elements have “medium score”
  » Huge number of elements “score” very low
- Examples
  » Use of words in a natural language
  » Web use (e.g. website accesses)
  »+
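The shape described on this slide can be checked with a short calculation. The sketch below assumes an idealized Zipf law in which the element of rank r has weight 1/r^exponent; the function name, the parameter names, and the choice of 10,000 elements in the example are illustrative and not taken from the traces discussed earlier.

```python
def zipf_top_share(num_elements, top_fraction, exponent=1.0):
    """Fraction of the total 'score' held by the highest-ranked elements."""
    weights = [1.0 / rank ** exponent for rank in range(1, num_elements + 1)]
    top_count = max(1, int(num_elements * top_fraction))
    return sum(weights[:top_count]) / sum(weights)
```

With 10,000 elements and an exponent of 1, `zipf_top_share(10_000, 0.01)` shows the top 1% of elements holding roughly half of the total, which is the kind of skew that makes it worthwhile to track only the largest flows.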