Data Streaming Algorithms for Accurate and Efficient Measurement of Traffic and Flow Matrices Qi Zhao*, Abhishek Kumar*, Jia Wang + and Jun (Jim) Xu* *College.

1 Data Streaming Algorithms for Accurate and Efficient Measurement of Traffic and Flow Matrices Qi Zhao*, Abhishek Kumar*, Jia Wang + and Jun (Jim) Xu* *College of Computing, Georgia Tech + AT&T Labs - Research

2 Traffic and flow matrices
- Traffic matrix TM
  - TM[i, j] = traffic volume from node i to node j
  - Useful in: capacity planning and forecasting; routing configuration; network fault/reliability diagnoses; provisioning for SLA
- Flow matrix FM
  - FM[i, j, f] = the size of flow f from node i to node j
  - Useful in: computing usage patterns of ISPs; detecting flapping routes; detecting DDoS attacks

3 Existing approaches
- Traffic matrix
  - Indirect inference (holistic): link counts from SNMP, routing matrix, network model
  - Direct measurement: sampling; our approach
- Flow matrix
  - Not well studied yet
  - Straightforward approach: sampling

4 Data streaming algorithms
- Data streaming: processing a long stream of data items in one pass, using a small working memory, in order to answer a class of queries about the stream.
- Our context
  - The packet arrival rate is high (e.g., links of 10-40 Gbps).
  - A small but fast memory, SRAM (10 ns per access), will be used.
- Challenge: how to make full use of the SRAM to remember as much information pertinent to the traffic/flow matrix as possible?

5 Two data streaming schemes
- The bitmap-based scheme: traffic matrix
- The counter array-based scheme: flow matrix, traffic matrix

6 System model (figure): an online streaming module at Node i and at Node j; a data analysis module at the server.

7 The bitmap-based scheme Online streaming module Data analysis module

8 Online streaming module The data digest data-structure is a bit array (bitmap) initially set to all 0’s. It is updated upon each packet arrival. Measurement proceeds in epochs.

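9 Example (figure)
- Hash input: the invariant packet header plus the first 8 bytes of the payload. [Snoeren et al. SIGCOMM’01] shows that these 28 bytes are sufficient to differentiate almost all non-identical packets.
- The hash function H(.) maps the packet to an index i in a bitmap of b bits (positions 0 to b-1), and bit i is set to 1.
- When a bit changes from 0 to 1: U := U-1, where U is the number of bits that are still 0.
- If U/b < Threshold: save the bitmap and start a new epoch.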

10 Complexities
- Computational complexity
  - One hash function computation
  - One write to the memory
- Storage complexity
  - Each packet produces only a little more than one bit as its digest.
  - This can be further reduced using sampling.

11 The bitmap-based scheme Online streaming module Data analysis module

12 What do we have so far? (for TM[i, j])
- BM_i, generated by the traffic at node i (T_i), and
- BM_j, generated by the traffic at node j (T_j)
What we want to estimate: $TM[i, j] = |T_i \cap T_j|$

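13 Estimation based on BM_i and BM_j
- [Whang et al. 1990] proposed a method to infer |T| from BM, i.e., $\hat{D}_T = b \ln(b / U_T)$, where b is the size of BM in bits and $U_T$ is the number of “0”s in BM.
- $|T_i \cup T_j|$ can be inferred from the bitwise-OR of BM_i and BM_j.
- An estimator of TM[i, j] is given by $\widehat{TM}[i, j] = \hat{D}_{T_i} + \hat{D}_{T_j} - \hat{D}_{T_i \cup T_j}$.
- We derive the variance of the estimator.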
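As a rough illustration of this estimation step, the sketch below applies the linear-counting estimate b ln(b/U) to each bitmap and to their bitwise-OR, then combines the three by inclusion-exclusion. The function names and the BLAKE2b hash are assumptions for illustration; the sketch also assumes both nodes use the same bitmap size b and the same hash function, since otherwise the bitwise-OR is meaningless.

```python
import hashlib
import math


def make_bitmap(packets, b: int) -> bytearray:
    """Build a b-bit bitmap (one byte per bit) from an iterable of byte strings."""
    bits = bytearray(b)
    for p in packets:
        digest = hashlib.blake2b(p, digest_size=8).digest()
        bits[int.from_bytes(digest, "big") % b] = 1
    return bits


def linear_count(bits: bytearray) -> float:
    """Estimate the number of distinct items hashed into the bitmap: b * ln(b / U)."""
    b = len(bits)
    u = b - sum(bits)  # U: number of 0 bits
    if u == 0:
        raise ValueError("bitmap is full; the estimate is undefined")
    return b * math.log(b / u)


def estimate_tm(bm_i: bytearray, bm_j: bytearray) -> float:
    """Estimate |T_i intersect T_j| as D(T_i) + D(T_j) - D(T_i union T_j)."""
    union = bytearray(x | y for x, y in zip(bm_i, bm_j))
    return linear_count(bm_i) + linear_count(bm_j) - linear_count(union)
```

For example, calling `estimate_tm(make_bitmap(t_i, b), make_bitmap(t_j, b))` on two packet lists that share some packets returns an estimate of how many packets they have in common. The accuracy depends on how full the bitmaps are.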

14 Multipaging (figure): timelines of numbered bitmap epochs at Node i and Node j, with times t1 and t2 marked.

15 Eliminating the effects of clock offset and packets in transit (figure: numbered bitmap epochs at Node i and Node j, separated by a time gap t)
- T1: a tight upper bound on the clock offset (e.g., 50 ms in an NTP-enabled network)
  - If t < T1, then overlap(1,2) = 1
- Combining with packets in transit, T2: a tight upper bound on the packet traversal time
  - If t < T1+T2, then overlap(1,2) = 1
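One possible reading of this overlap rule is sketched below. It is an interpretation, not the authors' code: it assumes t is the measured gap between the end of one epoch and the start of the other, each read from its node's local clock, and the `Epoch` type and `overlap` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Epoch:
    start: float  # epoch start time in seconds, local clock
    end: float    # epoch end time in seconds, local clock


def overlap(p: Epoch, q: Epoch, t1: float = 0.050, t2: float = 0.0) -> int:
    """Return 1 if the two epochs should be treated as overlapping.

    t1 is the upper bound on the clock offset (T1) and t2 the upper bound
    on the packet traversal time (T2). The gap is negative when the epochs
    overlap on the measured timeline.
    """
    gap = max(p.start, q.start) - min(p.end, q.end)
    return 1 if gap < t1 + t2 else 0
```

Treating near-adjacent epochs as overlapping is the conservative choice here: a pair wrongly included adds only noise from unrelated traffic, while a pair wrongly excluded would drop packets that really did travel between the two nodes.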

16 Counter array based scheme Online streaming module Data analysis module

17 Online streaming module The data digest data-structure is a counter array. It is updated upon each packet arrival. Measurement proceeds in epochs.

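18 Example (figure): the flow label of each arriving packet is hashed by H(.) to an index i in a counter array with positions 0 to b-1, and the counter at index i is incremented from n to n+1.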
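The counter update on this slide can be sketched in a few lines. The class name `CounterArrayDigest`, the BLAKE2b hash standing in for H(.), and the byte-string flow label are assumptions for illustration only.

```python
import hashlib


class CounterArrayDigest:
    """Hypothetical sketch of the counter array online streaming module."""

    def __init__(self, b: int):
        self.b = b
        self.counters = [0] * b  # one counter per array position

    def index(self, flow_label: bytes) -> int:
        # Assumption: every node uses the same hash, so a given flow
        # lands at the same index at its ingress and egress nodes.
        digest = hashlib.blake2b(flow_label, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.b

    def update(self, flow_label: bytes) -> None:
        # One hash computation and one counter increment per packet.
        self.counters[self.index(flow_label)] += 1
```

Flows that collide at the same index share a counter, which is why the data analysis module cannot simply read flow sizes off the array.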

19 Counter array based scheme Online streaming module Data analysis module

20 Data analysis module
- Principle: find a good counter-value matching between ingress nodes and egress nodes.
- Challenge: hash collisions make one-to-one matching fail.
- Method: iterative elephant-first matching.
- Accuracy: works well for the medium-to-large flow matrix elements, due to the Zipfian nature of Internet traffic.

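21 Elephant-first matching (figure): at hash index K, Node i holds counter value a1 and Node j holds counter value a2.
- If a1 > a2: FM[i, j, f] = a2; the counters become a1-a2 at Node i and 0 at Node j.
- If a1 <= a2: FM[i, j, f] = a1; the counters become 0 at Node i and a2-a1 at Node j.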
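The matching step can be sketched for a single hash index K as below. The slide shows only one ingress/egress pair; iterating by repeatedly pairing the largest remaining ingress counter with the largest remaining egress counter is an assumption about what "iterative elephant-first" means, and the function name and dictionary layout are hypothetical.

```python
def elephant_first_match(ingress: dict, egress: dict) -> dict:
    """Match counter values at one hash index K.

    ingress and egress map node name -> counter value at index K.
    Returns a dict mapping (ingress_node, egress_node) -> estimated flow size.
    """
    ingress = dict(ingress)  # work on copies; the inputs are left untouched
    egress = dict(egress)
    matches = {}
    while ingress and egress:
        i = max(ingress, key=ingress.get)  # largest remaining ingress counter
        j = max(egress, key=egress.get)    # largest remaining egress counter
        a1, a2 = ingress[i], egress[j]
        if a1 == 0 or a2 == 0:
            break  # one side is exhausted
        matched = min(a1, a2)              # FM[i, j, f] = min(a1, a2)
        matches[(i, j)] = matches.get((i, j), 0) + matched
        ingress[i] = a1 - matched          # a1-a2 if a1 > a2, else 0
        egress[j] = a2 - matched           # 0 if a1 > a2, else a2-a1
    return matches
```

Each iteration zeroes at least one counter, so the loop terminates after at most as many steps as there are counters. Matching the largest values first means a collision with a small flow perturbs the estimate of a large flow only slightly.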

22 Evaluation
- Ideally, evaluation would require packet-level traces collected simultaneously at hundreds of ingress and egress routers in an ISP over a certain period of time.
- We construct synthetic experiments based on 16 publicly available packet-level traces from NLANR.

23 Evaluation: traffic matrix (figure panels: bitmap scheme, counter array scheme)

24 Metric

25 RMSRE: traffic matrix

26 RMSRE: flow matrix

27 Conclusion
- A novel data streaming algorithm that produces traffic matrix estimates much more accurate than existing approaches.
- Another data streaming algorithm that very accurately estimates the flow matrix, a finer-grained characterization than the traffic matrix.
- Both algorithms are designed to operate on very high-speed networks.

28 Thank You! Questions?