
1. BRUTE: A High Performance and Extensible Traffic Generator
Nicola Bonelli, Stefano Giordano, Gregorio Procissi, Raffaello Secchi
Department of Information Engineering, University of Pisa
Raffaello Secchi, SPECTS 2005 – July 27, 2005

2. Outline
- Motivations
- BRUTE features
- Architecture design and internals
  - Implementation issues
- Extensibility and modularity
  - Script language
  - Application Programming Interface (API)
    - Programming library
    - Macros
  - Traffic modules
- Performance evaluation
  - Fast Ethernet scenario
  - Gigabit Ethernet scenario
- Conclusions

3. Motivations & Requirements
Current open-source software tools are not suited to high-speed networks:
- poor performance in terms of generated frames per second
- poor timing/rate accuracy in traffic generation
Requirements:
- high performance and precision
- extensibility
- configurability
- RFC 2544 compliance
We developed a tool that:
- generates high-speed flows over Fast and Gigabit Ethernet
- is extensible through a modular architecture
- is configurable through an ad hoc script language
- is IP-version independent: IPv4 and IPv6

4. BRUTE Features
What is BRUTE?
- BRUTE is a Linux user-space real-time traffic engine operating at layer 2 and layer 3
High performance
- Saturates a Fast Ethernet link with short frames (64 bytes)
- Saturates a Gigabit Ethernet link with 128-byte frames
Configuration
- Flexible script language that lets the user define customized traffic patterns
Extensible design
- Traffic modules (C language)
- API (library functions and macros)
  - Frame building, memory allocation, socket handling
  - IP checksum
  - Reliable statistical distributions
  - Timing resources

5. Implementation Choices (1/3)
Timing issues: temporal accuracy
- A traffic generator deals with packets and inter-departure times
- Busy-wait polling versus the sleep system-call mechanism
- gettimeofday:
  - low resolution (1 μs)
  - high latency due to the time evaluation (500 CPU cycles)
  - system-call interrupt mechanism
- Reading the CPU time-stamp counter (TSC):
  - higher resolution (1 ns with a 1 GHz CPU clock)
  - lower latency, around 32 CPU cycles (Intel Pentium)
  - no interrupt (no system call)
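
The busy-wait pacing described on this slide can be sketched in Python. This is only an illustration, not BRUTE's code: `time.perf_counter_ns` stands in for the TSC read, and the names `tsc_resolution_ns`, `busy_wait_until` and `send_paced` are hypothetical.

```python
import time

def tsc_resolution_ns(cpu_hz):
    """Duration of one CPU cycle in nanoseconds (1 GHz gives 1 ns)."""
    return 1e9 / cpu_hz

def busy_wait_until(deadline_ns):
    """Poll the clock until the deadline passes; never sleeps voluntarily."""
    now = time.perf_counter_ns()
    while now < deadline_ns:
        now = time.perf_counter_ns()
    return now

def send_paced(n_packets, gap_ns, send=lambda i: None):
    """Call send(i) n_packets times, spaced gap_ns apart by busy-waiting."""
    departures = []
    next_departure = time.perf_counter_ns()
    for i in range(n_packets):
        departures.append(busy_wait_until(next_departure))
        send(i)
        # Schedule against the ideal timeline so errors do not accumulate.
        next_departure += gap_ns
    return departures
```

Scheduling each departure against the ideal timeline, rather than against the previous actual departure, keeps a late packet from shifting all the following ones.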

6. Internals of Linux Kernel 2.4

7. Implementation Choices (2/3)
Socket family
- The computational load of sendto differs according to the socket family
- The PF_PACKET family avoids routing and header building
- PF_PACKET bypasses the Linux Netfilter framework
- PF_PACKET allows customizing the Ethernet frame
  - RFC 2544 suggests some tests using random MAC addresses
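
As a rough illustration of what a packet socket gives the sender, the sketch below builds a complete Ethernet frame by hand and hands it to the interface. It is not BRUTE's code. `socket.AF_PACKET` is Python's name for the PF_PACKET family (Linux only, and it needs raw-socket privileges). The helper names, and the choice of a locally administered unicast address for the random MAC, are assumptions.

```python
import os
import socket
import struct

def random_unicast_mac():
    """Random 6-byte MAC: locally administered bit set, multicast bit cleared."""
    raw = bytearray(os.urandom(6))
    raw[0] = (raw[0] | 0x02) & 0xFE
    return bytes(raw)

def build_frame(dst_mac, src_mac, ethertype, payload):
    """Ethernet header plus payload, zero-padded to 60 bytes (FCS excluded)."""
    frame = dst_mac + src_mac + struct.pack("!H", ethertype) + payload
    return frame.ljust(60, b"\x00")

def send_frame(ifname, frame):
    """Send one raw frame on interface ifname through a packet socket."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        return s.send(frame)
```

Because the caller supplies the whole frame, including both MAC addresses, no routing lookup or header construction is left to the kernel.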

8. Implementation Choices (3/3)
Scheduling policy
- Real-time requirements
  - A traffic generator is a typical real-time application
- Linux soft real-time SCHED_FIFO policy
  - control over the order of execution of processes
  - static priority assigned to the process
  - preemption of any normal process
  - no time slicing
Memory locking
- avoids paging delays
- mlockall is used to disable paging

9. Overall Architecture
- The modular design involves a distributed parser algorithm
  - The core parser handles the grammar and part of the lexical tasks
  - Micro-parsers distributed in the traffic modules complete the lexical parsing
- The traffic engine executes the micro-engine code in order to generate the traffic pattern

10. Extensibility: T-module
A module implements a traffic class: the T-module
- only a few lines of C code define a fully customizable traffic pattern
A T-module consists of:
- the module_descriptor structure, which links the BRUTE core and the module
- the mod_line structure, which defines the parameters of a specific traffic class
- the micro-parser handler, which implements the ad hoc lexical parser
- the micro-engine handler, which is in charge of generating traffic
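
The plug-in idea behind a T-module can be sketched as follows. BRUTE's real modules are C structures and handlers whose fields are not shown on the slide, so this Python analogue is purely illustrative: `ModuleDescriptor`, `REGISTRY`, `register`, `run`, the default parameter values and the reading of `rate` as packets per second are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Tuple

@dataclass
class ModuleDescriptor:
    name: str                                        # command identifier
    parse: Callable[[List[Tuple[str, str]]], dict]   # micro-parser
    engine: Callable[[dict], Iterator[float]]        # micro-engine

REGISTRY: Dict[str, ModuleDescriptor] = {}

def register(desc):
    """Make a module known to the core under its command name."""
    REGISTRY[desc.name] = desc

def cbr_parse(atoms):
    """Micro-parser: turn (token, value) atoms into module parameters."""
    params = {"msec": 1000, "rate": 1}  # hypothetical defaults
    for token, value in atoms:
        if token not in params:
            raise ValueError(f"unknown token: {token}")
        params[token] = int(value)
    return params

def cbr_engine(params):
    """Micro-engine: yield evenly spaced departure times in seconds."""
    count = params["rate"] * params["msec"] // 1000
    for i in range(count):
        yield i / params["rate"]

register(ModuleDescriptor("cbr", cbr_parse, cbr_engine))

def run(command, atoms):
    """Core: look up the module, parse its atoms, run its engine."""
    desc = REGISTRY[command]
    return list(desc.engine(desc.parse(atoms)))
```

The core only knows the descriptor; everything specific to a traffic class stays inside the module's two handlers.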

11. BRUTE script language
Statement syntax:
    command tok_1=val; tok_2=val; …
A statement consists of:
- a label
- a command identifier
- a sequence of semicolon-terminated atoms
where an atom consists of:
- a token identifier (l-value)
- numbers, functions and variables (r-values)
Example:
    cbr msec=1000; saddr=192.168.0.1; daddr=192.168.0.2;\
        rate=1000; len=udp_data(18); sport=1024; dport=1024;
    lab:cbr msec=1000; rate +=1000;
    loop times=10; label=lab;
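
To make the statement structure concrete, here is a minimal sketch that splits one statement into label, command and atoms. It is not BRUTE's parser: it only recognizes the `=` and `+=` forms that appear in the example, does not handle the backslash line continuation, and `parse_statement` is a hypothetical name.

```python
import re

# token, operator ("=" or "+="), value
ATOM_RE = re.compile(r"^\s*([A-Za-z_]\w*)\s*(\+=|=)\s*(.+?)\s*$")

def parse_statement(line):
    """Return (label or None, command, [(token, op, value), ...])."""
    parts = line.strip().split(None, 1)
    head = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    label, sep, command = head.partition(":")
    if not sep:
        label, command = None, head
    atoms = []
    for chunk in rest.split(";"):
        if not chunk.strip():
            continue  # text after the last semicolon is empty
        match = ATOM_RE.match(chunk)
        if match is None:
            raise ValueError(f"malformed atom: {chunk!r}")
        atoms.append(match.groups())
    return label, command, atoms
```

The r-values are kept as raw strings here; evaluating numbers, functions such as `udp_data(18)` and variables would be the job of each module's micro-parser.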

12. API (1/2)
Memory management
- Allocate and free the memory required to hold the frame
- Set up the frame headers according to the parameters specified in the configuration file, or use a random destination (MAC or IP) when specified on the command line. The UDP data is filled as specified in RFC 2544
Timing management
- Read the TSC register of the CPU using architecture-dependent assembly instructions (get_cycles)
- Busy-wait routine in charge of introducing inter-departure times between packets
Frame management
- Update the frame with the changes required to obtain the next one: modify the IP id and checksum fields, and the destination IP or MAC according to the command-line options
- Forward the frame to the network device driver
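
The per-frame update of the IP id and checksum fields can be illustrated with the standard Internet checksum (RFC 1071). This is a sketch that recomputes the checksum over the whole header; the slide does not say how BRUTE does it, and the function names are hypothetical.

```python
def internet_checksum(data):
    """16-bit one's-complement checksum of data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF

def set_ip_id(header, new_id):
    """Copy of an IPv4 header with a new id field and a refreshed checksum."""
    h = bytearray(header)
    h[4:6] = new_id.to_bytes(2, "big")    # identification field
    h[10:12] = b"\x00\x00"                # checksum is zero while summing
    h[10:12] = internet_checksum(bytes(h)).to_bytes(2, "big")
    return bytes(h)
```

A header that carries a correct checksum sums to zero, which gives a cheap self-check after each update.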

13. API (2/2)
Random number generation
- Implemented the Mersenne Twister algorithm
  - Quasi-infinite period ($2^{19937} - 1$)
  - ~100 CPU cycles (fast enough to be executed at run time)
  - Good statistical properties
Statistical distributions
- Implemented functions to generate some statistical distributions (uniform, exponential, Pareto …)

Algorithm | CPU cycles | Period | Lifetime | Entropy | Chi² | Correlation
Linux rand | 109 | $16(2^{31} - 1)$ | 9.5 hours | 7.95421 | 0.01% | -0.04935
/dev/urandom | 20100 | - | - | 7.999969 | 0.00% | -0.00016
TT800 | 94 | $2^{800} - 1$ | | 7.356743 | 0.01% | 0.139006
Mersenne T. | 100 | $2^{19937} - 1$ | | 7.999955 | 0.00% | 0.00028
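
As an illustration of how such distribution functions can be layered on a uniform generator, the sketch below uses inverse-transform sampling. Python's `random.Random` happens to be a Mersenne Twister (MT19937) as well, but the sampling method shown is an assumption, not necessarily BRUTE's, and treating `theta` as the Pareto scale (minimum value) is also an assumption.

```python
import math
import random

def exponential(rng, lam):
    """Inverse transform: X = -ln(1 - U) / lam, with mean 1 / lam."""
    return -math.log(1.0 - rng.random()) / lam

def pareto(rng, alpha, theta):
    """Inverse transform: X = theta / (1 - U)**(1 / alpha), so X >= theta."""
    return theta / (1.0 - rng.random()) ** (1.0 / alpha)
```

Using `1 - U` instead of `U` avoids taking the logarithm of zero, because `rng.random()` returns values in [0, 1).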

14. Implemented Traffic Modules (2/4)
Poisson process
- constant packet length
- exponential inter-departure time
- parameters: msec, saddr, daddr, sport, dport, len, tos, ttl, lambda
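
A departure schedule for such a Poisson source can be sketched as below. The units are assumptions (`lam` in packets per second, `msec` as the duration of the flow in milliseconds), and `poisson_departures` is a hypothetical name, not part of BRUTE.

```python
import random

def poisson_departures(lam, msec, seed=None):
    """Departure times in seconds over msec milliseconds, rate lam per second."""
    rng = random.Random(seed)
    horizon = msec / 1000.0
    times = []
    t = rng.expovariate(lam)      # exponential inter-departure time
    while t < horizon:
        times.append(t)
        t += rng.expovariate(lam)
    return times
```

With `lam=1000` and `msec=1000` the schedule holds about a thousand departures, with the count varying from run to run as a Poisson variable.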

15. Poisson Arrival of Bursts
Poisson Arrival of Bursts (PAB) process: $R(t) = R\,N(t)$
- $R$ is a constant bit rate
- $N(t)$ is the underlying state process
- $N(t)$ is a superposition of bursts occurring with exponential inter-arrival times and an arbitrary burst length distribution
- $N(t)$ is equivalent to the number of busy servers in an $M/G/\infty$ queue with service time $B$
- For fixed $t$, $N(t)$ is Poisson distributed with mean $\lambda\,E[B]$, so $E[R(t)] = R\,\lambda\,E[B]$
- If the burst lengths $B$ are Pareto distributed with $1 < \alpha < 2$, $R(t)$ is long-range dependent with Hurst parameter $H = (3 - \alpha)/2$
[Figure: sample path of N(t) versus t, with burst arrival times T1, T2, T3 and burst lengths B1, B2]

16. Implemented Traffic Modules (3/4)
PAB process
- constant packet length
- Poisson inter-arrival of bursts, Pareto burst lengths
- parameters: msec, saddr, daddr, sport, dport, len, tos, ttl, alpha, theta, lambda
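
The PAB construction can be sketched in a few lines: bursts arrive as a Poisson process, each lasts a Pareto-distributed time, and the instantaneous rate is R times the number of bursts in progress. This is an illustration, not BRUTE's module; the function names are hypothetical, and `theta` is assumed to be the Pareto scale (minimum burst length).

```python
import random

def hurst(alpha):
    """Hurst parameter H = (3 - alpha) / 2, valid for 1 < alpha < 2."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("alpha must lie strictly between 1 and 2")
    return (3.0 - alpha) / 2.0

def generate_bursts(lam, alpha, theta, horizon, seed=None):
    """List of (start, length) bursts arriving before horizon seconds."""
    rng = random.Random(seed)
    bursts = []
    t = rng.expovariate(lam)
    while t < horizon:
        bursts.append((t, theta * rng.paretovariate(alpha)))
        t += rng.expovariate(lam)
    return bursts

def active_bursts(bursts, t):
    """N(t): number of bursts in progress at time t."""
    return sum(1 for start, length in bursts if start <= t < start + length)

def rate_at(bursts, t, R):
    """R(t) = R * N(t)."""
    return R * active_bursts(bursts, t)
```

For example, `hurst(1.5)` gives 0.75, so heavier-tailed burst lengths (smaller alpha) push H toward 1.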

17. Implemented Traffic Modules (4/4)
End-to-end delay estimation
Measurement methodology requirements:
- two hosts with clocks synchronized via GPS
- one host closed in loopback
Packet format
- Rude implements a proprietary packet format
- With a standard RTCP Sender Report (SR) we do not need a specific receiver application (tcpdump, ethereal, AX4000…)
Transmission delay compensation
- from the application layer to the device driver
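
To show why a standard packet format is enough for delay estimation, the sketch below packs a send time into the NTP timestamp field of an RTCP Sender Report (RFC 3550 layout, no report blocks) and recovers the one-way delay at the receiver. How BRUTE fills the remaining fields is not stated on the slide; the function names and argument choices here are assumptions.

```python
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01

def build_sender_report(ssrc, send_time, rtp_ts, packets, octets):
    """28-byte RTCP SR carrying send_time (Unix seconds) as an NTP timestamp."""
    seconds = int(send_time)
    fraction = int((send_time - seconds) * 2**32)
    # V=2, P=0, RC=0 -> 0x80; PT=200 (SR); length = 7 words - 1
    header = struct.pack("!BBH", 0x80, 200, 6)
    body = struct.pack("!IIIIII", ssrc, seconds + NTP_UNIX_OFFSET,
                       fraction, rtp_ts, packets, octets)
    return header + body

def one_way_delay(report, recv_time):
    """Receive time minus the NTP timestamp stored in the report (seconds)."""
    ntp_sec, ntp_frac = struct.unpack("!II", report[8:16])
    sent = ntp_sec - NTP_UNIX_OFFSET + ntp_frac / 2**32
    return recv_time - sent
```

The result is only meaningful when sender and receiver share a clock, which is what the GPS and loopback setups on the slide provide.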

18. Performance Measurement
Internal measurement (non-invasive)
- A vector allocated in the device driver stores packet timestamps (using get_cycles)
- A kernel module dumps the timestamps off-line through a /proc entry
Wire-line measurement
- Over Fast and Gigabit Ethernet (copper line and optical fiber)
Hardware employed
- Intel Pentium 4 2.40 GHz, 512 MB RAM, ASUS P4PE motherboard, 3com 3c905c-TX Tornado Fast Ethernet adapter
- Dual Intel Xeon 2.66 GHz, 1 GB RAM, SuperMicro X5DPE-G2 motherboard, Intel PRO/1000LX Gigabit Ethernet Controller (fiber)
- Spirent AX4000 Traffic Analyzer

19. Maximal Rate Test Comparisons
Fast Ethernet Scenario
- Adapter: 3com 3c905c-TX Tornado Fast Ethernet
- Frame length: 64 bytes
- BRUTE saturates the link capacity

20. Throughput vs. Frame Length
Fast Ethernet Scenario
- Adapter: 3com 3c905c-TX Tornado Fast Ethernet
- BRUTE matches the ideal rate curve at each frame length
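
The ideal rate curve follows from standard Ethernet framing: each frame occupies the wire for its own length plus 8 bytes of preamble and start-of-frame delimiter and 12 bytes of inter-frame gap. The sketch below assumes the frame length includes the FCS; `max_frame_rate` is a hypothetical helper name.

```python
PREAMBLE_SFD = 8      # bytes sent before every frame
INTERFRAME_GAP = 12   # minimum idle time between frames, in byte times

def max_frame_rate(link_bps, frame_len):
    """Theoretical maximum frames per second for frame_len-byte frames."""
    wire_bits = (frame_len + PREAMBLE_SFD + INTERFRAME_GAP) * 8
    return link_bps / wire_bits

def ideal_curve(link_bps, frame_lengths):
    """Map each frame length to its maximum frame rate."""
    return {n: max_frame_rate(link_bps, n) for n in frame_lengths}
```

For 64-byte frames on Fast Ethernet this gives 100e6 / 672, about 148,810 frames per second, which is the rate a generator must sustain to saturate the link.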

21. Rate Bias Comparison
Fast Ethernet Scenario
- Adapter: 3com 3c905c-TX Tornado Fast Ethernet
- Error rate averaged over $10^6$ frames
- The throughput of BRUTE is unbiased at each frame rate

22. Standard Deviation of Rate Comparison
Fast Ethernet Scenario
- Adapter: 3com 3c905c-TX Tornado Fast Ethernet
- Average performed over a window size of 100 frames
- The standard deviation of the rate of BRUTE grows linearly
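
One plausible way to compute the two metrics from a list of packet timestamps is sketched below. The slides do not define them precisely, so the definitions here are assumptions: bias as the relative error of the long-run measured rate against the target, and dispersion as the standard deviation of the rates measured over consecutive fixed-size windows of frames.

```python
import statistics

def rate_bias(timestamps, target_rate):
    """Relative error of the overall measured rate against target_rate."""
    elapsed = timestamps[-1] - timestamps[0]
    measured = (len(timestamps) - 1) / elapsed
    return (measured - target_rate) / target_rate

def windowed_rates(timestamps, window):
    """Frame rate measured over consecutive windows of `window` frames."""
    rates = []
    for start in range(0, len(timestamps) - window, window):
        elapsed = timestamps[start + window] - timestamps[start]
        rates.append(window / elapsed)
    return rates

def rate_std(timestamps, window):
    """Standard deviation of the windowed rates."""
    return statistics.pstdev(windowed_rates(timestamps, window))
```

A perfectly periodic sender yields zero for both values; jitter shows up in `rate_std` even when the long-run bias stays at zero.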

23. Maximal Rate Test Comparison
Gigabit Ethernet Scenario
- Adapter: Intel PRO/1000LX Gigabit Ethernet Controller

24. Bias Error Comparison
Gigabit Ethernet Scenario
- Adapter: Intel PRO/1000LX Gigabit Ethernet Controller
- Average over $10^6$ frames

25. Standard Deviation Comparison
Gigabit Ethernet Scenario
- Adapter: Intel PRO/1000LX Gigabit Ethernet Controller
- Average performed over a window size of $10^3$ frames

26. Conclusions
BRUTE is a real-time, extensible traffic generator:
- Flexible architecture and extensible design, along with several traffic modules that generate different patterns of Ethernet traffic
- High performance and a high level of precision, suitable for network benchmarking
- Use of timing paradigms to better satisfy real-time requirements
- Capability to generate workloads at wire speed in order to stress network devices

