Traffic Modeling.


Approaches to Constructing Traffic Models

Trace-Driven: Collect traces from the network using a sniffing tool and use them directly to drive simulations.
Empirical Distribution: Build an empirical distribution from the collected traces and generate random variates from it to drive the simulations.
Distribution Fitting: Fit the collected traces to a well-known distribution, and use the fitted distribution for both simulation and analysis.

Trace-Driven Traffic Modeling

[Figure: packet arrival behavior over time]

Empirical Distribution Traffic Modeling

[Figure: packet arrival behavior over time, and the resulting cumulative distribution function (CDF)]

Generate Samples of an Empirical Model

Generate a uniform random variable U in [0, 1]. A random sample of the empirical distribution is then obtained by selecting the value x whose CDF equals U, i.e., x = F⁻¹(U) (inverse-transform sampling).

[Figure: two sample draws read off the empirical CDF]
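The selection step above can be sketched as inverse-transform sampling on the empirical CDF. The trace values below are made up for illustration:

```python
import random

random.seed(0)  # reproducible draws

def empirical_sampler(data):
    """Return a function that draws from the empirical distribution of
    `data`: each observed value has probability 1/n, and a uniform draw
    U is mapped through the inverse of the empirical CDF."""
    sorted_data = sorted(data)
    n = len(sorted_data)

    def sample():
        u = random.random()              # uniform in [0, 1)
        i = min(int(u * n), n - 1)       # smallest index whose CDF reaches u
        return sorted_data[i]

    return sample

# Hypothetical interarrival times (seconds) collected from a trace
trace = [0.2, 0.5, 0.5, 1.1, 2.3]
draw = empirical_sampler(trace)
samples = [draw() for _ in range(1000)]
```

Note that only values present in the trace can ever be generated, one limitation of purely empirical models.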

Distribution Fitting Traffic Modeling

[Figure: packet arrival behavior over time]

Advantages and Disadvantages

Trace-Driven
- Advantages: the data is guaranteed to come from the correct sample, so practical real-life results can be anticipated.
- Disadvantages: simulation is limited to the results produced from the collected data, and the data may not be sufficient for long-enough runs.

Empirical Distribution
- Advantages: usable even when there is not enough data to determine the distribution accurately; more flexibility in the data that can be generated; fairly simple to deduce from data.
- Disadvantages: may have irregularities (statistical abnormalities) if the collected sample is not large enough; the number of data points that can be generated may be limited by the original data samples.

Fitted Standard Distribution
- Advantages: the same advantages as the empirical distribution, and irregularities can be smoothed out.
- Disadvantages: can be difficult to deduce if the available data is limited; there is always a chance of abnormalities that were not accounted for.

Characterization of Distributions

Mean and central moments, for a continuous distribution with density f(x) and a discrete distribution with pmf P(X = x):

Expected value (mean):
E[X] = ∫ x f(x) dx          (continuous)
E[X] = Σ x P(X = x)         (discrete)

Variance:
Var(X) = ∫ (x − E[X])² f(x) dx          (continuous)
Var(X) = Σ (x − E[X])² P(X = x)         (discrete)

Skewness (third central moment):
Skew(X) = ∫ (x − E[X])³ f(x) dx          (continuous)
Skew(X) = Σ (x − E[X])³ P(X = x)         (discrete)

Expected Value of Random Variables

The expected value is the long-run average of the generated random samples. If you had to replace the whole distribution with a single value, the mean is the choice with the least mean squared error.

Variance of Random Variables

Characterizes how far the random variable deviates from its mean value.

Skewness of Random Variables

Skewness is a measure of the asymmetry of a probability distribution.
Negative skewness: the left tail is longer; the mass of the distribution is concentrated on the right.
Positive skewness: the right tail is longer; the mass of the distribution is concentrated on the left.
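As a minimal sketch, the three moments above can be estimated from a sample. Skewness is shown here in its usual normalized form, the third central moment divided by σ³, so its sign matches the rule above:

```python
import math

def central_moments(xs):
    """Sample estimates of the mean, variance, and normalized skewness."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n     # second central moment
    sigma = math.sqrt(var)
    # third central moment, normalized by sigma**3 (dimensionless)
    skew = sum((x - mean) ** 3 for x in xs) / n / sigma ** 3
    return mean, var, skew

# A sample with a long right tail should show positive skewness
mean, var, skew = central_moments([1, 1, 1, 2, 2, 3, 10])
```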

Other Parameters for Continuous Distributions

Some additional parameters of continuous distributions can help guide distribution fitting:
Location parameter (shift parameter): specifies an abscissa (x-coordinate) location point of a distribution's range of values, often some kind of midpoint of the distribution.
Scale parameter: determines the scale of measurement, or spread, of a distribution.
Shape parameter: determines the basic form or shape of a distribution within the family of distributions of interest.

Location Parameter Examples

Normal distribution (location μ):
f(x) = (1 / √(2πσ²)) e^(−(x−μ)² / (2σ²))

Pareto distribution, generalized form (location θ):
f(x) = (1/σ) (1 + k(x−θ)/σ)^(−1 − 1/k)

[Figure: normal densities with μ = 0 and μ = 2; Pareto densities with θ = 10 and θ = 20]

Scale Parameter Example

If X is a random variable with scale parameter 1, then Y = βX has a distribution with scale parameter β. The standard deviation of a normal distribution is a scale parameter for it, i.e., β = σ.

[Figure: normal densities for X with σ = 1 and Y = σX with σ = 2]

Shape Parameter Characteristics

The normal and exponential distributions do not have a shape parameter; other distributions, such as the beta distribution, may have two shape parameters. A change in the shape parameter generally alters a distribution's properties more fundamentally than a change in the shift or scale parameters.

Heavy-Tail Distributions

A heavy-tailed distribution has a tail heavier than the exponential distribution's: values far from the "mean" of the distribution occur with non-negligible probability. A related idea is the Pareto principle, also known as the 80-20 rule: 80% of the effects come from 20% of the causes.

Heavy or Light Tailed?

Pareto CDF: F(x) = 1 − (x_m / x)^α,  x ≥ x_m
Pareto survivor function: S(x) = 1 − F(x) = (x_m / x)^α,  x ≥ x_m

Plot the survivor function on a log-log scale: a heavy-tailed survivor function is linear in the log-log domain, while a light-tailed one falls off faster than any straight line.

[Figure: survivor functions on log-log axes, heavy tail vs. light tail]
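The log-log test can be checked numerically. The sketch below generates Pareto variates by inverting the CDF above and fits a least-squares slope to the empirical survivor function; the choices x_m = 1, α = 2, and the sample size are arbitrary:

```python
import math
import random

random.seed(1)  # reproducible sample

def pareto_sample(x_m, alpha, n):
    """Draw Pareto(x_m, alpha) variates by inverting F(x) = 1 - (x_m/x)**alpha."""
    return [x_m / random.random() ** (1.0 / alpha) for _ in range(n)]

def loglog_survivor_slope(xs):
    """Least-squares slope of log S(x) versus log x, where S(x) = P(X > x)
    is estimated at the sorted sample points. For a Pareto tail the points
    lie close to a line with slope -alpha."""
    xs = sorted(xs)
    n = len(xs)
    pts = [(math.log(x), math.log(1.0 - (i + 1) / (n + 1.0)))
           for i, x in enumerate(xs)]
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    num = sum((px - mx) * (py - my) for px, py in pts)
    den = sum((px - mx) ** 2 for px, _ in pts)
    return num / den

slope = loglog_survivor_slope(pareto_sample(x_m=1.0, alpha=2.0, n=5000))
# slope should be close to -alpha = -2 for this heavy-tailed sample
```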

Discrete Probability Distributions

Binomial Distribution
Uniform (Discrete) Distribution
Geometric Distribution
Negative Binomial Distribution
Hypergeometric Distribution

Continuous Probability Distributions

Uniform (Continuous) Distribution
Triangular Distribution
Normal Distribution
Exponential Distribution
Cauchy Distribution

Continuous Probability Distributions (cont'd)

Lognormal Distribution
Weibull Distribution
Gamma Distribution

Estimation of Parameters

Suppose that a distribution shape has been deduced from the data set by one of the methods mentioned earlier. The data set X1, X2, …, Xn that was used to deduce the shape can also be used to estimate the parameters that define the distribution completely. There are many estimation methods; we will use the maximum likelihood estimation (MLE) method.

The method can be explained as follows: suppose we have decided that a certain discrete distribution with pmf p(x; θ) is the closest to the data set, and that the distribution has one unknown parameter θ. The likelihood function L(θ) is defined as:

L(θ) = p(X1; θ) p(X2; θ) ⋯ p(Xn; θ)

This is simply the joint probability mass function, since the data are assumed independent; it gives the probability of obtaining the observed data set as a whole if θ is the value of the unknown parameter.

Maximum Likelihood Estimator

The MLE of the unknown parameter θ, denoted θ*, is defined to be the value of θ that maximizes L(θ). In the continuous case, the probability mass function is replaced by the chosen probability density function.

Example: for an exponential distribution, θ = β and

f(x; β) = (1/β) e^(−x/β),  x ≥ 0

The likelihood function is then

L(β) = ∏ (1/β) e^(−Xi/β) = β^(−n) e^(−(1/β) Σ Xi)

Maximum Likelihood Estimator (cont'd)

Because most theoretical distributions involve exponential functions, it is often easier to maximize the logarithm of the likelihood function instead of L(θ) itself. Define the log-likelihood function as

l(β) = ln L(β) = −n ln β − (1/β) Σ Xi

The problem reduces to maximizing the log-likelihood, since the value of β that maximizes one function also maximizes the other. Setting the derivative to zero,

dl/dβ = −n/β + (1/β²) Σ Xi = 0  ⟹  β* = (1/n) Σ Xi = X̄

which is the sample mean of the data set. This is to be expected for exponential random variables, since they are fully characterized by their mean.
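The result β* = X̄ is easy to verify numerically. The sketch below draws a hypothetical exponential trace (the true β = 3.0 is an arbitrary choice) and checks that the sample mean maximizes the log-likelihood derived above:

```python
import math
import random

random.seed(2)  # reproducible sample

beta_true = 3.0  # hypothetical true mean interarrival time
data = [random.expovariate(1.0 / beta_true) for _ in range(10_000)]

def loglik(beta, xs):
    """Log-likelihood l(beta) = -n ln(beta) - (1/beta) sum(x_i) for Exp(beta)."""
    return -len(xs) * math.log(beta) - sum(xs) / beta

beta_mle = sum(data) / len(data)   # the MLE is the sample mean

# The log-likelihood is no larger at nearby values of beta
assert loglik(beta_mle, data) >= loglik(0.9 * beta_mle, data)
assert loglik(beta_mle, data) >= loglik(1.1 * beta_mle, data)
```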

Maximum Likelihood Estimator (cont'd)

Suppose instead that the chosen distribution is the geometric distribution, a discrete distribution with pmf

p(x; p) = p (1 − p)^x,  x = 0, 1, 2, …

The likelihood function is

L(p) = ∏ p (1 − p)^Xi = p^n (1 − p)^(Σ Xi)

and the log-likelihood function is

l(p) = n ln p + (Σ Xi) ln(1 − p)

Differentiating and equating to zero,

n/p − (Σ Xi)/(1 − p) = 0  ⟹  p* = n / (n + Σ Xi) = 1 / (1 + X̄)