Network Modelling & Simulation

Content:
- Network topology
- Network traffic characteristics
- Distribution patterns
- Probabilistic modelling
- Random numbers and pseudo-random numbers
- Queuing theory (introductory)

What does it mean?

Modelling:
- A mathematical representation of the target system.
- Building a model of a computer network.
- Using that model to analyse and predict possible behaviour of a real network.

Simulation:
- Having the appearance of / to behave like / to copy.
- Reproducing the conditions of (a situation, etc.), as in carrying out an experiment.
- To reproduce the behaviour of a system by providing realistic configuration parameters to a simulation.

The need for network models

Network designers and managers need to ask a wide variety of ‘What if’ questions, for example, what if:
- the number of users were doubled?
- the buffer size at routers was increased?
- a key device failed at peak traffic conditions?
- the 10 Mb/s links were upgraded to 100 Mb/s?
- we replaced a hub with a switch?

Other typical questions:
- What is the cheapest way to increase the bandwidth available to users in the accounts department?
- Why does the throughput between Building-A and Building-B fall significantly between 2pm and 3pm every day?

Why models?

Problems with experimenting on a live/test network:
- Scale and cost
- Risk (expensive devices may be purchased and NOT solve the original problem)
- Interruption (to existing users and services)
- Time consuming

Hence the use of models of networks for:
- Planning a network implementation
- Planning an expansion of an existing network
- Problem diagnostics for an existing network
- As a learning tool

Network Topology

The topology of a network is represented by a model, which captures:
- The number and type of devices:
  - Computers: clients, servers
  - Network devices: hubs, switches, routers
- Connectivity: links and bandwidth

[Figure: schematic representation of eight nodes connected to a bus network]

An example of a physical layout

[Figure: JANET, geographical view]

JANET – topology

[Figure: JANET topology. JANET = Joint Academic Network]

What is not shown as part of topology

- The actual shape of the network.
- The physical location of devices.
- The distance between devices.
- Behavioural characteristics, such as the chances of any individual frame being lost or corrupted.
- The reliability of each link or device, i.e. the chances of a link or device failing.
- The traffic patterns within the network:
  - The amount of traffic at any point, at any time.
  - The source and sink characteristics of traffic.
  - Routes taken by traffic.

Traffic generation

A static (idle) network is not very interesting:
- This just amounts to some devices and some connections between them.
- Applications run on networks and generate traffic, or load, on a network.
- It is the load on the network that causes the network to exhibit certain behaviours, potentially causing problems. It is these behaviours that we are interested in.

Without traffic our model is just a map of the network: it can exhibit no behaviour and predict nothing. A map is useful to see what is connected to what, but does not tell us:
- how congested the network will be,
- where problems might occur,
- how many users can be supported.

Have you ever planned a road journey based purely on a map, and encountered unexpected delays that seem to have no plausible explanation? (Have you ever driven on the M25?)

Traffic generation (continued)

Network traffic (packets per second, packet size):
- Is not constant, except on rare occasions.
- Cannot be predicted or modelled exactly.
- Needs statistical techniques: probability distributions with the right parameters, for example:
  - Inter-arrival time: exponential (mean = 3 sec)
  - Packet size: exponential (mean = 1200 bytes)

Two important traffic factors are:
- Packet/message frequency (or inter-arrival time)
- Frame/packet size

Traffic models are always approximations of the real world because of:
- Complexity (the number of parameters).
- Characteristics that cannot normally be known precisely in advance.

In real networks, frames are of varying size and are not generated at regular intervals, so we need statistical techniques to model frame sizes and inter-arrival times approximately. Probability distributions are used to determine the size of frames and the times at which frames are sent by nodes.
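As a rough illustration of the idea above, the sketch below draws packet inter-arrival times and packet sizes from exponential distributions using Python's standard `random` module. The function name `generate_traffic`, the seed and the packet count are assumptions made for this example; the means (3 sec and 1200 bytes) are the ones quoted on the slide. Real packet sizes are whole numbers of bytes with a minimum and a maximum, which this sketch ignores.

```python
import random


def generate_traffic(n_packets, mean_interarrival=3.0, mean_size=1200.0, seed=1):
    """Return a list of (send_time_s, size_bytes) tuples for n_packets packets.

    Hypothetical helper for illustration: both quantities are drawn from
    exponential distributions, as suggested on the slide.
    """
    rng = random.Random(seed)  # seeded, so the run is repeatable
    now = 0.0
    packets = []
    for _ in range(n_packets):
        # expovariate takes the rate (1 / mean), not the mean itself
        now += rng.expovariate(1.0 / mean_interarrival)
        size = rng.expovariate(1.0 / mean_size)
        packets.append((now, size))
    return packets


if __name__ == "__main__":
    for send_time, size in generate_traffic(5):
        print(f"t = {send_time:7.2f} s   size = {size:7.0f} bytes")
```

Changing the seed gives a different, but statistically similar, stream of packets.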

Probability Distributions

Several probability distribution functions are available, each characterised by certain parameters (mean, standard deviation etc.). In each case the actual function that describes the distribution is the Probability Density Function (PDF).

No distribution can represent the random events in a computer network with complete accuracy. It is important to use appropriate distributions to give a ‘good approximation’ of the behaviour of typical systems. Which distributions are appropriate is known from analysis of live network traffic, typical applications etc.

Important distributions

Name         Parameters                          Used to represent
Constant     None (fixed value)                  Packet/frame size
Binomial     P(success)                          Server up/down
Exponential  λ (mean = 1/λ)                      Interarrival time, packet size
Poisson      Mean                                Packet arrival rate
Uniform      Min and max (constant probability)  Application start time
Normal       Mean, SD                            Errors in experiments

Notes:
- Exponential: ‘mean’ is interpreted as ‘expected value’, where E(x) = 1 / λ. PDF: $\lambda e^{-\lambda x}$ (e = Euler's number ≈ 2.718).
- Normal: PDF: $\frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$, where μ is the mean and σ is the SD.

The most important ones you will use are: constant and exponential.

Poisson distribution (used e.g. for packet arrival rate)

This distribution expresses the probability of a given number of events occurring in a fixed interval of time, if these events occur with a known average rate (the mean) and independently of the time since the last event.

In the example shown, the mean is 6, so in the given time period (not indicated on the graph) the most likely number of events is 6 (5 is equally likely, because the mean is a whole number). The cumulative probability always increases towards 1.0, but does not reach it until all possible values of k are accounted for.

For example, suppose the number of frames arriving at a router is measured at 21,600 per hour. This gives a mean of 6 frames per second.
- The Poisson probability P(k) gives the probability of a specified number of frames (k) arriving in a given second, e.g. P(4) is approximately 0.13.
- The cumulative probability C(k) gives the probability that the number of frames arriving in a specific second will be between 0 and k, e.g. C(4) is approximately 0.29.

Using the graph we can determine:
- the probability of six frames arriving in a single second (purple bar),
- the probability of between zero and six frames arriving in a single second (blue bar).
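The values read from the graph can also be computed directly. The sketch below is a minimal illustration using only the standard library; the function names `poisson_pmf` and `poisson_cdf` are chosen for this example and are not part of the original slides.

```python
import math


def poisson_pmf(k, mean):
    """P(k): probability of exactly k events when the average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_cdf(k, mean):
    """C(k): probability of between 0 and k events (inclusive)."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))


if __name__ == "__main__":
    mean = 6  # frames per second, as in the slide example
    for k in (4, 6):
        print(f"P({k}) = {poisson_pmf(k, mean):.3f}   C({k}) = {poisson_cdf(k, mean):.3f}")
```

With a mean of 6 this gives P(4) ≈ 0.134 and C(4) ≈ 0.285, in line with the approximate values read from the graph.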

Exponential distribution, mean = 2.0 (used e.g. for interarrival time, packet size)

[Figure: P(x), the probability density, and C(x), the cumulative probability of occurrences from 0 to x, for λ = 0.5]

The probability of (e.g.) a particular period between packets decreases as the time interval increases. Again the probability density P(x) and the cumulative distribution C(x) are shown.

Variable frame sizes

Ignoring the ‘illegal’ sizes, the distribution is biased towards the smaller frame sizes.

Normal distribution

- Popular in the social sciences (e.g. the heights of a group of people); a favourite of sociologists and statisticians.
- Symmetrical about a mean, at which it peaks. Its tails, at very small and very large values, approach zero the further they are from the mean.
- For networking, useful for the distribution of errors (see the standard deviation from the mean).
- Can be useful for modelling packet size in some systems.
- Not particularly useful for modelling inter-arrival times.

Exponential Distribution (e.g. random choice of interarrival time)

Using the distribution in a model:
1. Use the cumulative curve (dark red); this represents the probability distribution p.
2. Pick a random number y where 0 ≤ y ≤ 1 (note y = p(x)).
3. Using y and the distribution p, find x (track across to the curve, then read off x).

Note that for this curve short interarrival times are much more likely than longer times:
- If y were 0, the next packet would follow directly after the previous one.
- If y were 1, the next packet would follow after infinite time.

Random numbers and pseudo-randomness

In order that a model can use a probability distribution, it needs a random number generator to provide a stream of random numbers. For example, assume that the exponential distribution shown is used to determine packet inter-arrival times (in milliseconds) in a network simulation. When the model gets to the point at which it needs to simulate the next packet:
1. The model must determine when to generate the packet.
2. The random number generator produces the next number (between 0 and 1), e.g. 0.4.
3. From the function it can be seen that this corresponds to an inter-arrival time of 1.02 ms.
4. After an interval of 1.02 ms since the last packet, the new packet is inserted into the network.

Similarly, for a random number of 0.8, the inter-arrival time is 3.22 ms.
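Reading x off the cumulative curve can be done algebraically. For an exponential distribution the cumulative curve is C(x) = 1 - e^(-x / mean), so x = -mean × ln(1 - y). The sketch below assumes that formula and a mean of 2.0 ms; the function names and the seed are illustrative only. It also shows why the numbers are called pseudo-random: the same seed always reproduces the same sequence.

```python
import math
import random


def exponential_from_uniform(y, mean):
    """Invert the cumulative curve C(x) = 1 - exp(-x / mean) to find x for a given y."""
    if not 0.0 <= y < 1.0:
        raise ValueError("y must satisfy 0 <= y < 1 (y = 1 maps to infinite time)")
    return -mean * math.log(1.0 - y)


def interarrival_times(n, mean, seed):
    """Draw n interarrival times; the same seed always gives the same sequence."""
    rng = random.Random(seed)  # pseudo-random: deterministic for a given seed
    return [exponential_from_uniform(rng.random(), mean) for _ in range(n)]


if __name__ == "__main__":
    for y in (0.4, 0.8):
        print(f"y = {y}: interarrival time = {exponential_from_uniform(y, 2.0):.2f} ms")
    # Two generators with the same seed produce identical streams
    print(interarrival_times(3, 2.0, seed=42) == interarrival_times(3, 2.0, seed=42))
```

For y = 0.4 and y = 0.8 this reproduces the 1.02 ms and 3.22 ms quoted above.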

Simulation Results

Results of simulations can be presented in a number of ways:
- Animation of the model: this allows you to watch packets as they move around the network, and gives an overall idea of what is happening.
- Numerical results:
  - event counts (e.g. number of packets sent)
  - statistical averages (e.g. average packet end-to-end delay)
- Graphical results: an alternative way to view numerical results.

Identifying the time when the network behaviour ‘steadies’

When a network is simulated from cold, it takes a little while for traffic to build up and reach a ‘steady state’. The initial rise seen in the results is an artefact of the modelling process: it arises because of the use of moving averages. It is NOT a behavioural artefact of the modelled network, so we must ignore it when analysing results (see next slide). When you calculate averages and standard deviations, you should ignore this initial period.

Illustration of Moving Average

The moving average takes a window of the most recent 10 values and averages them. This is a useful measure of average value and is often used in simulation models. Note how the MA10 curve is inaccurate for the first 9 samples: it uses 10 samples to determine the average, so it is inaccurate when fewer than 10 samples are available.

Value x   MA10(x)
60        6
70        13
60        19
80        27
50        32
60        38
70        45
50        50
30        53
40        57
50        56
20        51
60        51
40        47
50        47
60        47
50        45

Ignore the first set of values; in this case, because MA10 is used, ignore the first 9 values.

[Figure: raw data and MA10 curves]
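The MA10 column can be reproduced with a few lines of code. The sketch below is one possible implementation, not necessarily how a given simulator computes it: it assumes the sum of the available values is always divided by the full window size, which matches the table and explains why the first 9 results are too low.

```python
def moving_average(values, window=10):
    """Average the most recent `window` values, dividing by `window` throughout.

    Dividing by the full window even before `window` samples exist is an
    assumption chosen to match the table; it is why the first window - 1
    results understate the true average.
    """
    result = []
    for i in range(len(values)):
        recent = values[max(0, i - window + 1): i + 1]
        result.append(sum(recent) / window)
    return result


if __name__ == "__main__":
    raw = [60, 70, 60, 80, 50, 60, 70, 50, 30, 40, 50, 20, 60, 40, 50, 60, 50]
    for x, ma in zip(raw, moving_average(raw)):
        print(f"{x:3d}  {ma:5.1f}")
```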

Average and variance

- Average over the run (e.g. 50 values per statistic).
- Average over several runs (different seeds).

If you run a simulation over 3600 seconds, you can ask OPNET to calculate and output an average every 60 seconds, say. Thus you get 60 values, from which you can calculate the mean (average) and standard deviation. You can also run the simulation 3-4 times and calculate an ‘average of averages’.
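The bookkeeping described above might look like the sketch below. It does not call OPNET: `toy_run` is a made-up stand-in that produces 60 values with a short ramp-up followed by a noisy steady state, and the warm-up length of 5 samples is an assumption for this example. Only the summarising steps (discard the initial period, then take the mean and standard deviation, then average across seeds) reflect the slides.

```python
import random
import statistics


def summarise_run(samples, warmup):
    """Mean and standard deviation of one run, ignoring the first `warmup` samples."""
    steady = samples[warmup:]
    return statistics.mean(steady), statistics.stdev(steady)


def average_of_averages(runs, warmup):
    """Mean of the per-run means, one run per seed."""
    return statistics.mean(summarise_run(run, warmup)[0] for run in runs)


def toy_run(seed, n=60):
    """Stand-in for simulator output (not OPNET): a ramp-up, then a noisy steady state."""
    rng = random.Random(seed)
    return [min(1.0, (i + 1) / 5) * 50.0 + rng.uniform(-5.0, 5.0) for i in range(n)]


if __name__ == "__main__":
    runs = [toy_run(seed) for seed in (1, 2, 3, 4)]
    for run in runs:
        mean, sd = summarise_run(run, warmup=5)
        print(f"mean = {mean:.2f}  sd = {sd:.2f}")
    print(f"average of averages = {average_of_averages(runs, warmup=5):.2f}")
```

Leaving the warm-up samples in would pull each run's mean down and inflate its standard deviation, which is why they are dropped first.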

Model Validation

Models are approximations of the real thing:
- Lots of guesswork and simplification.
- Need to make the model as realistic as possible.
- Need to validate the model thoroughly. This is important if the results are to be credible and useful.
- Need a sufficient amount of results, i.e. sufficient depth and breadth of investigation, changing various parameters and changing the seed to ensure that a wide range of possible outcomes is investigated; otherwise you cannot draw valid conclusions.

Verification of a model

A model is not the same as the real network. It can never give completely accurate predictions of behaviour, because there are many parameters that need tuning and it is not possible to reflect the true network completely. This is especially true of network traffic, which has to be generated probabilistically in the model.

It is very important that we build models as accurately as possible. An incorrect model will yield incorrect predictions, and this is dangerous if we believe them to be accurate. Thus it is important that we can obtain a level of confidence in the correctness of our model.

Verification

- Alternative calculations (queueing theory): difficult, often impossible
- Comparison against a real network
- Extrapolation from a smaller model which can be verified
- Extrapolation from a smaller real network

Confidence can be achieved in a number of ways:
- Mathematical confirmation of results (very time consuming, and only realistically possible for very simple models). For example, using queuing theory we can predict how a queue will behave if we know the arrival rate of frames into the queue and the service rate of the server.
- Comparison against a real network, if one is available.
- Comparison against an extrapolation of the results for a smaller model that has been verified.
- Comparison against an extrapolation of the behaviour of a smaller real network.

The M/M/1 Queue

A simple queuing situation which can be solved exactly. Given the mean packet interarrival time E(ta) and the mean service time E(ts), we can calculate the mean queue length and the mean delay:

- Utilisation: ρ = E(ts) / E(ta)
  - longer service time → higher ρ
  - longer interarrival time → lower ρ
- Delay: E(tq) = E(ts) / (1 - ρ)
  - 1 - ρ is the proportion of time the resource is free.
  - Delay = service time divided by free time. The more free the resource is, the shorter the time to process jobs.
- Queue length: E(q) = ρ / (1 - ρ)
  - Queue length is the ratio of utilisation to free time.
  - At 50% utilisation the average queue size will be 1.

[Figure: a queue of length E(q) feeding a single server]

The most common (and simplest) queue model is the M/M/1 queue:
- The first M signifies a Poisson arrival-rate function (λ): the number of frames arriving in unit time.
- The second M signifies a Poisson service-rate function (µ): the number of frames serviced in unit time. This may also be stated in terms of the mean time required to service each item (1/µ).
- The 1 signifies that there is 1 server (link, router, switch, server etc.).

Notes:
- The utilisation is the load on the server as a fraction of capacity.
- The delay represents the sum of queueing delay and service time (e.g. transmission, routing, switching etc.).
- The queue length includes the item being serviced, e.g. the packet being transmitted.
- For self-checking, the delay can be estimated as queue length × service time. This value should be in the same ball-park as the value calculated using the formula.
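The three M/M/1 formulas can be wrapped in a small helper, sketched below. The function name `mm1_metrics` and the check that rejects a utilisation of 1 or more (where the formulas no longer apply) are additions for this example.

```python
def mm1_metrics(mean_interarrival, mean_service):
    """Return (utilisation, mean delay, mean queue length) for an M/M/1 queue.

    Both arguments must use the same time unit; the delay is returned in that unit.
    """
    rho = mean_service / mean_interarrival
    if rho >= 1.0:
        raise ValueError("utilisation must be below 1, otherwise the queue grows without bound")
    delay = mean_service / (1.0 - rho)
    queue_length = rho / (1.0 - rho)
    return rho, delay, queue_length


if __name__ == "__main__":
    # 50% utilisation: the average queue size should come out as 1
    rho, delay, queue = mm1_metrics(mean_interarrival=2.0, mean_service=1.0)
    print(f"utilisation = {rho:.2f}, delay = {delay:.2f}, queue length = {queue:.2f}")
```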

An example calculation

- An application generates packets at the average rate of 5 per second (arrival rate λ = 5).
- The average packet size is 1200 bytes (exponential distribution).
- Packets are transmitted along a 100 kbps link.

Mean interarrival time: E(ta) = 1/5 = 0.2 sec
Mean service time: E(ts) = 1200 x 8 / 100 000 = 0.096 sec (in this case the service time is the time needed to transmit onto the link)
Utilisation: ρ = E(ts) / E(ta) = 0.096 / 0.2 = 0.48
Delay: E(tq) = E(ts) / (1 - ρ) = 0.096 / 0.52 = 0.185 sec
Queue length: E(q) = ρ / (1 - ρ) = 0.48 / 0.52 = 0.923

Quick check technique for ρ (= mean total traffic / link bandwidth): (1200 x 8 x 5) / 100 000 = 48 000 / 100 000 = 0.48

E(tq) is the average delay; E(q) is the average queue size.
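One way to gain confidence in such a calculation is to compare it against a small simulation of the same queue. The sketch below is a hand-written single-server FIFO simulation, not OPNET; the function name, seed, packet count and warm-up length are assumptions for this example, while the packet size, arrival rate and link speed are those of the worked example.

```python
import random


def simulate_mm1_delay(mean_interarrival, mean_service, n_packets, seed=1, warmup=1000):
    """Estimate the mean delay (waiting + service) of a single-server FIFO queue."""
    if n_packets <= warmup:
        raise ValueError("n_packets must exceed the warm-up length")
    rng = random.Random(seed)
    wait = 0.0          # time the current packet spends waiting before service
    total_delay = 0.0
    counted = 0
    for i in range(n_packets):
        service = rng.expovariate(1.0 / mean_service)
        if i >= warmup:  # ignore the start-up period, as recommended earlier
            total_delay += wait + service
            counted += 1
        gap = rng.expovariate(1.0 / mean_interarrival)
        # Lindley recursion: the next packet waits for whatever work is left
        wait = max(0.0, wait + service - gap)
    return total_delay / counted


if __name__ == "__main__":
    mean_service = 1200 * 8 / 100_000   # 0.096 s on a 100 kbps link
    formula = mean_service / (1 - mean_service / 0.2)
    simulated = simulate_mm1_delay(0.2, mean_service, n_packets=200_000)
    print(f"formula: {formula:.4f} s   simulated: {simulated:.4f} s")
```

The simulated value depends on the seed and run length, so it should be close to the formula's result rather than identical to it.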

One for you to try

A database server receives client requests at the average rate of one every 2 seconds, so the mean interarrival time is E(ta) = 2 sec (arrival rate λ = 0.5 per second). It can process a transaction in approximately 1.5 seconds, so E(ts) = 1.5 sec. Calculate the utilisation, the average transaction delay and the transaction queue size at the server.

Answers:
- Utilisation = 1.5 / 2 = 0.75
- Queue size = 0.75 / 0.25 = 3
- Delay = 1.5 / 0.25 = 6 sec
