
1 An Information-theoretic Approach to Network Measurement and Monitoring
Yong Liu, Don Towsley, Tao Ye, Jean Bolot

2 Outline
- motivation
- background
- flow-based network model
- full packet trace compression
  - marginal / joint
- coarser granularity
  - NetFlow and SNMP
- future work

3 Motivation
- network monitoring: sensing a network
  - traffic engineering, anomaly detection, ...
  - single point vs. distributed
  - different granularities
    - full traffic trace: packet headers
    - flow-level records: timing, volume
    - summary statistics: byte/packet counts
- challenges
  - growing scale: high-speed links, large topologies
  - constrained resources: processing, storage, transmission (30 GB of headers per hour at the UMass gateway)
- solutions
  - sampling: temporal/spatial
  - compression: marginal/distributed

4 Questions
- how much can we compress monitoring traces?
- how much information is captured at different monitoring granularities?
  - packet trace / NetFlow / SNMP
- how much joint information is there across multiple monitors?
  - joint compression
  - trace aggregation
  - monitor placement

5 Our Contribution
- flow-based network models
  - explore temporal/spatial correlation in network traces
  - project to different granularities
- information-theoretic framework
  - entropy: bound/guideline on trace compression
  - quantitative approach for more general problems
- validation against measurements from an operational network

6 Entropy & Compression
- Shannon entropy of a discrete r.v. X: H(X) = -sum_x p(x) log2 p(x)
- compression of M i.i.d. symbols by coding
  - coding: map each outcome x to a binary codeword of length l(x)
  - expected code length: E[l(X)] = sum_x p(x) l(x) >= H(X)
  - information-theoretic bound: the compression ratio is lower-bounded by H(X) / (raw bits per symbol)
- Shannon/Huffman coding (see the sketch below)
  - assigns short codewords to frequent outcomes
  - achieves the H(X) bound (to within one bit per symbol)
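The following sketch (not from the slides; the symbol stream and its probabilities are made up) estimates H(X) from empirical frequencies and builds a Huffman code, showing that the average codeword length lands between H(X) and H(X) + 1 bit per symbol.

```python
import heapq
import math
from collections import Counter

def shannon_entropy(symbols):
    """H(X) = -sum p(x) log2 p(x), estimated from empirical frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def huffman_avg_length(symbols):
    """Average codeword length of a Huffman code built from empirical frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    # heap entries: (probability, unique id, {symbol: codeword length so far})
    heap = [(c / n, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                 # degenerate single-symbol alphabet
        return 1.0
    uid = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, uid, merged))
        uid += 1
    _, _, lengths = heap[0]
    return sum((counts[s] / n) * l for s, l in lengths.items())

stream = list("aaaabbbccd") * 1000                            # skewed toy distribution
print("H(X)      =", round(shannon_entropy(stream), 4))       # entropy bound (bits/symbol)
print("Huffman L =", round(huffman_avg_length(stream), 4))    # within 1 bit of H(X)
```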

7 Entropy & Correlation
- joint entropy: H(X, Y) = -sum_{x,y} p(x, y) log2 p(x, y)
- entropy rate of a stochastic process: H(X) = lim_n H(X_n | X_{n-1}, ..., X_1)
  - exploits temporal correlation
  - Lempel-Ziv coding (LZ77, gzip, winzip) asymptotically achieves this bound for stationary processes
- joint entropy rate of correlated processes
  - exploits spatial correlation
  - Slepian-Wolf coding (distributed compression): encode each process individually, yet achieve the joint entropy rate in the limit
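As a toy illustration of the temporal-correlation point (the sticky two-state Markov source below is an assumption, not trace data), this sketch estimates both the marginal entropy and the entropy rate, then measures how many bits per symbol an LZ-based coder (zlib) actually spends; it comes in well below the marginal entropy because the coder exploits the correlation.

```python
import math
import random
import zlib
from collections import Counter

random.seed(0)

# Sticky two-state Markov chain: strong temporal correlation, but a roughly
# 50/50 marginal distribution (so the marginal entropy is ~1 bit/symbol).
STAY = 0.99
state, samples = 0, []
for _ in range(200_000):
    samples.append(state)
    if random.random() > STAY:
        state = 1 - state

def entropy(counter, total):
    return -sum((c / total) * math.log2(c / total) for c in counter.values())

n = len(samples)
marginal = entropy(Counter(samples), n)                     # H(X)
joint = entropy(Counter(zip(samples, samples[1:])), n - 1)  # H(X_{n-1}, X_n)
rate = joint - entropy(Counter(samples[:-1]), n - 1)        # H(X_n | X_{n-1})

lz_bits = 8 * len(zlib.compress(bytes(samples), 9)) / n     # LZ77 bits per symbol

print(f"marginal entropy H(X) : {marginal:.3f} bits/symbol")
print(f"entropy rate estimate : {rate:.3f} bits/symbol")
print(f"zlib (LZ77) achieves  : {lz_bits:.3f} bits/symbol")
```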

8 Network Trace Compression
- naïve way: treat the trace as a byte stream and compress it with generic tools
  - gzip compresses UMass traces by a factor of about 2
- network traces are highly structured data (see the sketch below)
  - multiple fields per packet: diversity in information richness, correlation among fields
  - multiple packets per flow: packets within a flow share information (temporal correlation)
  - multiple monitors traversed by a flow: most fields are unchanged inside the network (spatial correlation)
- network models
  - explore the correlation structure
  - quantify the information content of network traces
  - serve as lower bounds / guidelines for compression algorithms
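A toy sketch of the structure argument (the synthetic 20-byte records, the field layout, and the flow mix below are all assumptions): the same packets are serialized once as flat per-packet records and once in a flow-structured form that stores each 12-byte flow key only once plus a timestamp delta per packet, and both forms are handed to gzip-style compression for comparison.

```python
import random
import struct
import zlib

random.seed(1)
flows = [(random.getrandbits(32), random.getrandbits(32),
          random.getrandbits(16), 80) for _ in range(50)]      # 50 synthetic flows
flow_index = {key: i for i, key in enumerate(flows)}

packets, t = [], 0.0
for _ in range(20_000):
    t += random.expovariate(1000.0)                            # ~1 ms mean inter-arrival
    packets.append((random.choice(flows), t))

# (a) flat per-packet records: full flow key + absolute timestamp, every packet
flat = b"".join(struct.pack("!IIHHd", *key, ts) for key, ts in packets)

# (b) flow-structured: each flow key once, then (flow index, inter-arrival) per packet
parts = [struct.pack("!IIHH", *key) for key in flows]
last = 0.0
for key, ts in packets:
    parts.append(struct.pack("!Hd", flow_index[key], ts - last))
    last = ts
structured = b"".join(parts)

print("compression ratio, flat records   :",
      round(len(flat) / len(zlib.compress(flat, 9)), 2))
print("compression ratio, flow-structured:",
      round(len(flat) / len(zlib.compress(structured, 9)), 2))
```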

9 Packet Header Trace
[Slide figure: per-packet record layout (bit positions 0-31): timing fields (timestamp sec. and sub-sec.), IP header fields (version, HLen, ToS, total length, IPID, flags, fragment offset, TTL, protocol, header checksum, source and destination IP addresses), and TCP header fields (source and destination ports, sequence number, acknowledgment number, HLen, TCP flags, window size, checksum, urgent pointer), followed by data]

10 Header Field Entropy
[Slide figure: the same header layout as the previous slide, annotated to mark which fields make up the flow ID and which carry the timing information (see the sketch below for a per-field entropy estimate)]
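A minimal sketch of the per-field view (the header values below are synthetic and the field set is a small assumed subset): it estimates the empirical entropy of each field, illustrating the diversity in information richness; nearly constant fields cost almost nothing, while counter-like fields dominate the bit budget.

```python
import math
import random
from collections import Counter

random.seed(2)
headers = []
for i in range(10_000):
    headers.append({
        "version":  4,                               # constant -> ~0 bits
        "protocol": random.choice([6, 6, 6, 17]),    # mostly TCP
        "ttl":      random.choice([58, 64, 118]),    # a handful of values
        # new source address every ~50 packets, otherwise repeat the last one
        "src_ip":   random.getrandbits(32) if i % 50 == 0 else headers[-1]["src_ip"],
        "ipid":     random.getrandbits(16),          # information-rich field
    })

def field_entropy(field):
    counts = Counter(h[field] for h in headers)
    n = len(headers)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

for field in headers[0]:
    # empirical estimates are capped near log2(#packets) for very wide fields
    print(f"{field:8s} {field_entropy(field):6.2f} bits")
```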

11 Single Point Packet Trace
[Slide figure: a packet timeline (T0,F0), (T1,F1), (T3,F0), ..., (Tn,Fn), (Tm,F0), with the packets of flow F0 pulled out below it: (T0,F0), (T3,F0), (Tm,F0)]
- temporal correlation introduced by flows
  - packets from the same flow are closely spaced in time
  - they share header information
- packet-level trace: per packet, record its inter-arrival time and header fields (# bits per packet)
- flow-based trace (see the sketch below)
  - flow record: flow ID (F0), flow size (K), arrival time (T0), packet inter-arrival times
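A small sketch of that flow-based representation (the timestamps, flow labels, and record fields are illustrative assumptions): it folds a packet-level (timestamp, flow ID) trace into one record per flow holding the flow ID, arrival time, flow size, and packet inter-arrival times.

```python
# packet-level trace: (timestamp, flow ID) pairs, as in the slide's T/F timeline
packet_trace = [(0.000, "F0"), (0.013, "F1"), (0.021, "F0"),
                (0.030, "F1"), (0.055, "F0")]

flow_records = {}
for ts, fid in packet_trace:
    rec = flow_records.setdefault(fid, {"flow_id": fid, "arrival": ts,
                                        "size": 0, "inter_arrivals": []})
    if rec["size"] > 0:                               # gap to the previous packet
        rec["inter_arrivals"].append(round(ts - rec["_last"], 6))
    rec["size"] += 1
    rec["_last"] = ts

for rec in flow_records.values():
    rec.pop("_last")
    print(rec)
# {'flow_id': 'F0', 'arrival': 0.0, 'size': 3, 'inter_arrivals': [0.021, 0.034]}
# {'flow_id': 'F1', 'arrival': 0.013, 'size': 2, 'inter_arrivals': [0.017]}
```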

12 Network Models
- flow-based model (see the sketch below)
  - flow arrivals follow a Poisson process
  - flows are classified into independent flow classes according to routing (the set of routers traversed)
  - flow i is described by its flow inter-arrival time, flow ID, flow length, and the packet inter-arrival times within the flow
  - the packet arrival stochastic process is built from these flow-level quantities
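A minimal simulator of that flow-based model (the rates, the class list, and the geometric/exponential distribution choices below are assumptions for illustration, not the paper's fitted values):

```python
import random

random.seed(3)
FLOW_RATE   = 5.0                             # flows per second (Poisson arrivals)
CLASSES     = ["r1-r2", "r1-r3", "r2-r3"]     # hypothetical router paths
CLASS_PROBS = [0.5, 0.3, 0.2]
MEAN_PKTS   = 10                              # mean flow length in packets
PKT_RATE    = 100.0                           # packets/s inside a flow

def generate_flows(horizon_s):
    t, flows = 0.0, []
    while True:
        t += random.expovariate(FLOW_RATE)    # flow inter-arrival time
        if t > horizon_s:
            return flows
        # geometric flow length (floor of an exponential), at least one packet
        n_pkts = 1 + int(random.expovariate(1.0 / (MEAN_PKTS - 1)))
        gaps = [random.expovariate(PKT_RATE) for _ in range(n_pkts - 1)]
        flows.append({
            "class": random.choices(CLASSES, CLASS_PROBS)[0],  # flow ID / route
            "start": t,
            "pkt_gaps": gaps,                                  # intra-flow inter-arrivals
        })

flows = generate_flows(10.0)
print(len(flows), "flows;", sum(1 + len(f["pkt_gaps"]) for f in flows), "packets")
```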

13 Entropy in Flow Record
- # bits per flow: entropy of one flow record (flow ID, flow size, arrival time, packet inter-arrivals), as in the sketch below
- # bits per second: flow arrival rate x # bits per flow
- marginal compression ratio: determined by the flow length (pkts.) and the variability of the packet inter-arrival times
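A back-of-the-envelope sketch of that accounting (every number below is a placeholder, not the paper's model; in practice the per-component entropies would be estimated from a trace): bits per flow as a sum of component entropies, bits per second as the flow rate times that, and a rough marginal compression ratio against a raw per-packet record.

```python
# placeholder entropies (bits); replace with estimates from a real trace
H_FLOW_ID   = 20.0     # flow identifier
H_FLOW_SIZE = 4.0      # flow length distribution
H_INTERARR  = 8.0      # one quantized intra-flow packet inter-arrival
MEAN_PKTS   = 10.0     # average packets per flow
FLOW_RATE   = 5.0      # flows per second
RAW_BITS    = 8 * 40   # raw bits per packet record (40-byte header, say)

bits_per_flow   = H_FLOW_ID + H_FLOW_SIZE + (MEAN_PKTS - 1) * H_INTERARR
bits_per_second = FLOW_RATE * bits_per_flow
bits_per_packet = bits_per_flow / MEAN_PKTS

print("bits per flow   :", bits_per_flow)
print("bits per second :", bits_per_second)
print("marginal ratio  :", round(bits_per_packet / RAW_BITS, 3))
```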

14 Single Point Compression: Results

  Trace     H (total)   Model Ratio   Compression Algorithm
  C1-in     706.3772    0.2002        0.6425
  BB1-out   736.1722    0.2139        0.6574
  BB2-out   689.9066    0.2186        0.6657

- the compression-ratio lower bound calculated from the entropy is much lower than what the real compression algorithm achieves
- differences with the real compression algorithm
  - it records IPID, packet size, and TCP/UDP fields
  - it uses a fixed packet buffer per flow => long flows produce many flow records

15 Distributed Network Monitoring
- a single flow is recorded by multiple monitors
- spatial correlation: traces collected at distributed monitors are correlated
- marginal node view: # bits/sec to represent the flows seen by one node; a bound on single-point compression
- network system view: # bits/sec to represent the flows crossing the network; a bound on joint compression
- joint compression ratio: quantifies the gain of joint compression

16 Baseline Joint Entropy Model
- “perfect” network: fixed routes, constant link delays, no packet loss
- flow classes based on routes; each class has a flow arrival rate, a # of monitors traversed, and a # of bits per flow record
- information rate at node v: sum, over the classes traversing v, of (flow arrival rate x bits per flow record)
- network-view information rate: the same sum taken over all classes, counting each class once
- joint compression ratio: network-view rate / sum of the per-node rates (see the sketch below)
  - depends on the # of monitors traversed
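A sketch of that bookkeeping (the class rates, record entropies, and routes below are made-up numbers): in a "perfect" network the flow record is identical at every monitor it crosses, so the network view counts each class once while the sum of per-node rates counts it once per monitor traversed, which is what drives the joint compression ratio.

```python
classes = [
    # (flows/sec, bits per flow record, monitors on the route)
    (10.0, 100.0, ["A", "B", "C"]),
    ( 5.0, 120.0, ["A", "B"]),
    ( 2.0,  80.0, ["C"]),
]

node_rate = {}
for rate, bits, route in classes:
    for node in route:
        node_rate[node] = node_rate.get(node, 0.0) + rate * bits

network_rate = sum(rate * bits for rate, bits, _ in classes)   # each class once
marginal_sum = sum(node_rate.values())                         # once per monitor

print("per-node rates (bits/s):", node_rate)
print("joint compression ratio:", round(network_rate / marginal_sum, 4))
```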

17 Joint Compression: Results

  Set of Traces                      Joint Compression Ratio
  {C1-in, BB1-out, C2-in, BB2-out}   0.5
  {C1-in, BB1-out}                   0.8649
  {C1-in, BB2-out}                   0.8702
  {C2-in, BB1-out}                   0.7125
  {C2-in, BB2-out}                   0.6679

18 Coarser Granularity Models
- NetFlow model
  - similar to the flow model
  - joint compression results are similar to the full-trace case
- SNMP model (see the sketch below)
  - the SNMP rate process of a link is the sum of the rate processes of all flow classes passing through that link
  - traffic rates of flow classes are modeled as independent Gaussian processes
  - entropy can be calculated from the covariance of these processes
  - information is lost in the summation
  - small joint information between monitors
  - difficult to recover the rates of individual flow classes from SNMP data
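A sketch of the Gaussian SNMP calculation (the routing matrix and class-rate variances are made up; it uses the standard differential entropy of a multivariate Gaussian, h = 1/2 log2((2 pi e)^k det Sigma)): each link rate is a sum of independent Gaussian class rates, so joint and marginal entropies follow from the induced covariance, and their ratio mirrors the SNMP joint compression ratio.

```python
import math

import numpy as np

# routing matrix A: link x class, entry 1 if the class crosses the link
A = np.array([[1, 1, 0],
              [1, 0, 1]], dtype=float)
class_var = np.diag([4.0, 1.0, 2.25])        # independent class-rate variances

sigma = A @ class_var @ A.T                  # covariance of the link-rate vector

def gaussian_entropy(cov):
    """Differential entropy (bits) of a zero-mean Gaussian with covariance cov."""
    k = cov.shape[0]
    return 0.5 * math.log2(((2 * math.pi * math.e) ** k) * np.linalg.det(cov))

joint = gaussian_entropy(sigma)
marginals = sum(gaussian_entropy(sigma[i:i + 1, i:i + 1]) for i in range(sigma.shape[0]))

print("joint entropy          :", round(joint, 3), "bits")
print("sum of marginal entropy:", round(marginals, 3), "bits")
print("joint compression ratio:", round(joint / marginals, 3))
```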

19 Joint Compression Ratio of Different Granularity

  Set of Traces        SNMP     NetFlow   Packet Trace
  {C1-in, BB1-out}     1.0021   0.8597    0.8649
  {C1-in, BB2-out}     0.9997   0.8782    0.8702

20 Conclusion
- information-theoretic bound on the marginal compression ratio: ~20% (time + flow ID; even lower if other low-entropy fields are included)
- the marginal compression ratio is high (not very compressible) for SNMP, lower for NetFlow, and lowest for the full packet trace
- joint coding is much more useful/necessary for full traces than for SNMP
- “More entropy for your buck”

21 Future Work
- network impairments
  - how many more bits are needed for delay / loss / route changes?
- model NetFlow with sampling
- distributed compression algorithms
- lossless vs. lossy compression
- entropy-based monitor placement
  - maximize information under constraints

22 Thanks!

