New Packet Sampling Technique for Robust Flow Measurements
Shigeo Shioda, Department of Architecture and Urban Science, Graduate School of Engineering, Chiba University
Objectives of traffic measurements. Short-term monitoring: detecting high-volume traffic patterns (such as denial-of-service attacks), detecting unexpected or illegal packets, and investigating their origins. Long-term traffic engineering: rerouting traffic and upgrading selected links.
Per-flow traffic measurement (1). Just counting the number of packets or bytes is not sufficient; per-flow traffic measurement is necessary. What is a flow? Informally, it is a set of packets constituting a logical communication between application processes running on different hosts. Flow-level information can tell us who is using the Internet at a given moment.
Per-flow traffic measurement (2). [Figure: meaning of a flow, showing Flow 1 and Flow 2.]
Per-flow traffic measurement (3). How can flows be distinguished? By inspecting packet headers and classifying packets based on IP addresses, port numbers, and protocol ID. [Figure: IP header fields (Version, HL, TOS, Total Length, Identification, Flags, Fragment Offset, TTL, Protocol ID, Header Checksum, Source Address, Destination Address) and TCP header fields (Source Port, Destination Port, Sequence Number, Acknowledgement Number).]
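The classification step above can be sketched in Python. This is a minimal illustration, not the author's implementation: the names `FlowKey` and `flow_key_from_ipv4` are hypothetical, and the sketch assumes an unfragmented IPv4 packet carrying TCP or UDP, so that the first four bytes after the IP header hold the two port numbers.

```python
import socket
import struct
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple used to tell flows apart."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int


def flow_key_from_ipv4(packet: bytes) -> FlowKey:
    """Extract the 5-tuple from a raw IPv4 packet (TCP or UDP assumed)."""
    # The low 4 bits of byte 0 give the IP header length in 32-bit words.
    header_len = (packet[0] & 0x0F) * 4
    protocol = packet[9]                        # Protocol ID field
    src_addr = socket.inet_ntoa(packet[12:16])  # Source Address field
    dst_addr = socket.inet_ntoa(packet[16:20])  # Destination Address field
    # TCP and UDP both start with source port, then destination port.
    src_port, dst_port = struct.unpack("!HH", packet[header_len:header_len + 4])
    return FlowKey(src_addr, dst_addr, src_port, dst_port, protocol)
```

Packets that yield the same `FlowKey` would be counted as one flow; other protocols (for example ICMP) would need separate handling.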
Per-flow traffic measurement (4). Flow-measurement procedure: a router maintains a flow cache containing flow records. When a packet is seen, the router updates the counters of the corresponding entry in the flow cache. [Figure: flow cache holding the number of packets and the number of bytes for Flow 1, Flow 2, and Flow 3, updated as packets of each flow arrive.]
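As a rough sketch of this procedure, the flow cache can be modeled as a dictionary of per-flow counters. The class name `FlowCache` and its layout are assumptions made for illustration; a real router would use a fixed-size hardware or DRAM table.

```python
class FlowCache:
    """Toy flow cache: one record (packet count, byte count) per flow."""

    def __init__(self):
        self.records = {}  # flow_id -> {"packets": int, "bytes": int}

    def update(self, flow_id, size_bytes):
        """Update the counters of the entry matching the observed packet."""
        record = self.records.setdefault(flow_id, {"packets": 0, "bytes": 0})
        record["packets"] += 1
        record["bytes"] += size_bytes
```

Feeding it two 1500-byte packets of one flow and one 1500-byte packet of another leaves the first record at 2 packets and 3000 bytes and the second at 1 packet and 1500 bytes.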
Problems of flow measurements. Lack of scalability: with the rapid increase in line speeds, the number of concurrent flows is growing every year. Updating per-flow counters on a per-packet basis is already impossible at today's line speeds. The gap between DRAM speeds and link speeds is widening.
Packet sampling. The flow cache is updated only for sampled packets. Elephant flows are still detected under packet sampling. Many tiny (and unimportant) flows are missed, but this does not matter for network management. Example: Cisco's Sampled NetFlow. How should packets be sampled?
Fixed rate sampling. Definition: sampled packets are chosen at a fixed rate, for example one in every N packets. Cisco's Sampled NetFlow uses fixed rate sampling. [Figure: timeline of sampled and non-sampled packets for N = 5.]
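A minimal sketch of the one-in-N rule follows. The function name `fixed_rate_sample` is hypothetical, and the sketch assumes the deterministic variant that takes every N-th packet; it does not model how Sampled NetFlow itself selects packets.

```python
def fixed_rate_sample(packets, n):
    """Return every n-th packet (the n-th, 2n-th, ...) of the stream."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [pkt for index, pkt in enumerate(packets, start=1) if index % n == 0]
```

With N = 5, a stream of 20 packets yields 4 samples and a stream of 2000 packets yields 400, so the number of samples grows in proportion to the offered traffic.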
Shortcomings of fixed rate sampling. The amount of memory needed to hold the flow cache depends strongly on the traffic load. When a DoS attack is in progress, memory is consumed rapidly even if the sampling rate is low. However, a low sampling rate yields large measurement errors under normal load. Setting a static sampling rate is therefore a hard decision for network operators.
Fixed period sampling. Definition: at most one packet is sampled in every fixed-length period (called the sampling window), for example one packet every t_w seconds. This is our solution. [Figure: timeline divided into sampling windows at 0, t_w, 2t_w, 3t_w, 4t_w, marking sampled and non-sampled packets.]
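The definition above can be sketched as follows. The function name `fixed_period_sample` is hypothetical, timestamps are assumed to be integers in milliseconds and sorted, and the sketch picks the first packet of each window purely for simplicity; the choice of which packet to take within a window is a separate question.

```python
def fixed_period_sample(arrival_times_ms, window_ms):
    """Sample at most one packet (here: the first) per sampling window."""
    sampled = []
    last_window = None
    for t in arrival_times_ms:
        window_index = t // window_ms  # which sampling window t falls in
        if window_index != last_window:
            sampled.append(t)
            last_window = window_index
    return sampled
```

For example, with a 10 ms window, one packet per millisecond for one second (1000 packets) yields exactly 100 samples, regardless of how much heavier the load becomes.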
Properties of fixed period sampling. The number of samples per second is bounded by 1/t_w; for example, t_w = 10 ms allows at most 100 samples per second. The number of entries in the flow cache is also bounded. The sampling interval t_w is easily determined from the memory or CPU available for flow measurements.
Number of flow entries. [Figure: number of flow-cache entries versus time [s] on the Indianapolis-Kansas City link and on the U.S.-Japan link, comparing fixed period sampling with fixed rate sampling; N = 1000, t_w = 10 ms.]
Number of sampled packets. [Figure: number of sampled packets versus time [s] for Trace 1 and Trace 2, comparing fixed period sampling with fixed rate sampling; N = 1000, t_w = 10 ms.]
Second packet sampling (1). Any packet within a sampling window may be chosen as the sample. Which packet should be sampled? The simplest (and most natural) rule is first packet sampling. Intuitively, the first packet sampling rule seems to work well, but it does not. We apply second packet sampling instead.
First packet sampling and second packet sampling. [Figure: two timelines with sampling windows at 0, t_w, 2t_w, 3t_w, 4t_w, marking sampled and non-sampled packets under first packet sampling and under second packet sampling.]
Second packet sampling (2). Example: packets of Flow 1 arrive periodically; packets of Flow 2 arrive according to a Poisson process. We found theoretically that under the first packet sampling rule, 63.2% of sampled packets belong to Flow 1 (strongly biased), whereas under the second packet sampling rule, 49.7% of sampled packets belong to Flow 1 (almost unbiased).
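The bias can be explored with a small simulation. This is only a sketch under assumed parameters, since the slide does not state the period, the Poisson rate, or the window length: here both flows send one packet per window on average, the periodic flow arrives mid-window, and a window with fewer than k packets produces no sample. The names `kth_packet_samples` and `simulate` are hypothetical, and the resulting fractions are not expected to reproduce the 63.2% and 49.7% figures exactly.

```python
import random


def kth_packet_samples(packets, window, k):
    """Return the flow ids of the k-th packet seen in each sampling window.

    packets: list of (arrival_time, flow_id), sorted by arrival time.
    """
    sampled = []
    current_window = None
    seen = 0
    for t, flow_id in packets:
        window_index = int(t // window)
        if window_index != current_window:
            current_window = window_index
            seen = 0
        seen += 1
        if seen == k:
            sampled.append(flow_id)
    return sampled


def simulate(k, n_windows=20000, window=1.0, seed=1):
    """Fraction of sampled packets that belong to the periodic flow (flow 1)."""
    rng = random.Random(seed)
    horizon = n_windows * window
    # Flow 1: periodic, one packet per window, arriving mid-window (assumed).
    packets = [(0.5 * window + i * window, 1) for i in range(n_windows)]
    # Flow 2: Poisson arrivals with the same mean rate (assumed).
    t = rng.expovariate(1.0 / window)
    while t < horizon:
        packets.append((t, 2))
        t += rng.expovariate(1.0 / window)
    packets.sort()
    sampled = kth_packet_samples(packets, window, k)
    return sampled.count(1) / len(sampled)
```

Calling `simulate(1)` and `simulate(2)` gives the share of Flow 1 among samples under the first and second packet rules. Under these assumptions the first packet rule over-represents the periodic flow, while the second packet rule lands closer to the 50% share that equal-rate flows should receive.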
Flow-level traffic estimation. Sampling inevitably loses some information. Inference techniques are required to recover flow-level traffic statistics from the sampled packets. Here, we focus on flow rate estimation.
Flow rate estimation (1). Flow rate: informally, the rate at which a flow sends data; formally, the ratio of the total bytes transferred to the flow duration. Flow rate is an index for identifying vital flows, which often have a significant impact on network performance. Flow rate can be estimated from sampled packet streams.
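The formal definition amounts to a one-line computation. The helper name `flow_rate_mbps` and the choice of Mbps as the output unit (matching the axes of the later plots) are assumptions made for this sketch; it computes the actual rate of a fully observed flow, not an estimate from samples.

```python
def flow_rate_mbps(total_bytes, start_s, end_s):
    """Flow rate = total bytes transferred / flow duration, in Mbps."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("flow duration must be positive")
    return total_bytes * 8 / duration / 1e6
```

For instance, a flow that transfers 1,250,000 bytes in 2 seconds has a flow rate of 5 Mbps.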
Flow rate estimation (2). [Figure: estimated flow rate [Mbps] versus actual flow rate [Mbps] for a real trace on the Indianapolis-Kansas City link; panels for t_w = 10 ms (0.15% of packets sampled) and t_w = 1 ms (1.5% of packets sampled).]
Flow rate estimation (3). [Figure: estimated flow rate [Mbps] versus actual flow rate [Mbps] for a real trace on a U.S.-Japan link; panels for t_w = 10 ms (1.5% of packets sampled) and t_w = 1 ms (13.4% of packets sampled).]
Conclusion. Sampling techniques are indispensable to today's traffic measurement in the Internet. Fixed period sampling can avoid the problems of the existing sampling technique (fixed rate sampling). Fixed period sampling should be used together with second packet sampling. Flow rate can be estimated well with fixed period sampling.
Flow rate estimation under first packet sampling. [Figure: estimated flow rate [Mbps] versus actual flow rate [Mbps] on the Indianapolis-Kansas City link and on the U.S.-Japan link; N = 1000, t_w = 10 ms.]
Bayesian estimates (2). [Figure: estimated flow rate [Mbps] versus actual flow rate [Mbps] for the Bayesian estimator and the naive estimator; t_w = 1 ms.]
Bayesian estimates (1). [Figure: estimated flow rate [Mbps] versus actual flow rate [Mbps] for the Bayesian estimator and the naive estimator; t_w = 10 ms.]
Objectives of traffic measurements (2). QoS monitoring: measuring QoS properties and validating service-level agreements. Usage-based accounting: input to charging or billing.
Shortcomings of fixed rate sampling. Is there any sampling strategy that works even under massive DoS attacks? [Figure: traffic versus time [s].]
Existing solutions to the problems of fixed rate sampling. Sampling rate adaptation: first, the sampling rate is initialized to the maximum rate at which the processor can operate; then, the sampling rate is dynamically adjusted based on the amount of memory consumed. Example: Adaptive NetFlow. We propose another solution.
Fixed period sampling (2). Timeout processing: under sampled measurement, one cannot know exactly when a flow begins or ends (SYN or FIN packets may not be sampled). Thus, flow entries that have not been seen during the last N samplings are deleted from the flow cache. As a result, the flow cache keeps only flows whose packets have been detected at least once during the last N samplings.
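The timeout rule can be sketched by stamping each record with the index of the sample that last touched it. The class name `TimeoutFlowCache` and its record layout are assumptions for illustration, and the full scan on every update is chosen for clarity rather than speed.

```python
class TimeoutFlowCache:
    """Flow cache that drops flows not seen during the last N samplings."""

    def __init__(self, max_idle_samples):
        self.max_idle_samples = max_idle_samples  # the N of the timeout rule
        self.sample_count = 0
        self.records = {}  # flow_id -> [packets, bytes, last_seen_sample]

    def update(self, flow_id, size_bytes):
        """Record one sampled packet, then evict flows that timed out."""
        self.sample_count += 1
        record = self.records.setdefault(flow_id, [0, 0, 0])
        record[0] += 1
        record[1] += size_bytes
        record[2] = self.sample_count
        # A flow is expired if none of the last N samples belonged to it.
        expired = [
            fid for fid, rec in self.records.items()
            if self.sample_count - rec[2] >= self.max_idle_samples
        ]
        for fid in expired:
            del self.records[fid]
```

Because every retained flow owns at least one of the last N samples, the cache can never hold more than N entries, which is consistent with the bounded cache size claimed for fixed period sampling.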
Simulation experiments. The accuracy of flow rate estimation was investigated using real traffic data. Two real traces were used. Trace 1: traffic data measured by the PMA Project on a backbone link between Indianapolis and Kansas City. Trace 2: traffic data published by the WIDE Project, measured on a U.S.-Japan link.
Flow rate estimation (2). Naive estimation: estimation based on the sampling frequency. Bayesian estimation: if the probability density function of the flow rate is known as prior information, a Bayesian estimator can be applied to improve the estimation accuracy.
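The slide does not spell out the naive estimator, so the following is only one plausible reading, not the author's formula: each flow's share of the sampled bytes is taken as its share of the total traffic, and the total byte count over the measurement interval is assumed to be available from a cheap aggregate counter. The function name `naive_flow_rates_bps` and its parameters are hypothetical.

```python
def naive_flow_rates_bps(sampled, total_bytes, interval_s):
    """Estimate per-flow rates (bits/s) from each flow's share of samples.

    sampled:     list of (flow_id, size_bytes) for the sampled packets
    total_bytes: bytes seen on the link during the interval (assumed known)
    interval_s:  length of the measurement interval in seconds
    """
    sampled_bytes = {}
    for flow_id, size in sampled:
        sampled_bytes[flow_id] = sampled_bytes.get(flow_id, 0) + size
    all_sampled = sum(sampled_bytes.values())
    return {
        flow_id: (flow_bytes / all_sampled) * total_bytes * 8 / interval_s
        for flow_id, flow_bytes in sampled_bytes.items()
    }
```

Such an estimate is only as good as the sampling is unbiased: if the sampling rule favors one flow, that flow's share of samples, and hence its estimated rate, is inflated.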