Lecture 6 Data Collection and Parameter Estimation.


1 Lecture 6 Data Collection and Parameter Estimation

2 Input Modeling
- In real-world simulation applications, determining appropriate distributions for input data is a major task in terms of time and resource requirements.
- Faulty models of the inputs lead to outputs whose interpretation may give rise to misleading recommendations.
- Steps to develop a useful model for input data:
  - Collect data from the real system of interest
  - Identify a probability distribution to represent the input process

3 Input Modeling (cont'd)
- Steps to develop a useful model for input data (continued):
  - Choose parameters that determine a specific instance of the distribution family
  - Evaluate the chosen distribution and the associated parameters for goodness-of-fit

4 Data Collection
- Plan your data collection process
- Always look for ways to collect data efficiently and accurately (equipment, barcoding, receipts, personnel, video, etc.)
- Collect only data that is useful for your project

5 Identifying the Distribution
- HISTOGRAMS
  - Divide the range of the data into intervals
  - Label the horizontal axis to conform to the intervals selected
  - Determine the frequency of occurrences within each interval
  - Label the vertical axis so that the total occurrences can be plotted for each interval
  - Plot the frequencies on the vertical axis
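The histogram steps above can be sketched in a few lines of Python. This is a minimal illustration that assumes equal-width intervals; the function name `histogram` and the sample values are hypothetical and are not taken from the lecture.

```python
def histogram(data, num_bins):
    """Count how many observations fall into each of num_bins equal-width intervals."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all observations are identical; cannot form intervals")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # The largest observation would land in bin num_bins, so clamp it into the last bin.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical sample, used only for illustration.
sample = [1, 2, 2, 3, 5, 7, 8, 8, 9, 10]
edges, counts = histogram(sample, 3)

# Text rendering: one row per interval, one '*' per observation.
for i, c in enumerate(counts):
    print(f"[{edges[i]:5.1f}, {edges[i + 1]:5.1f})  {'*' * c}")
```

The number of intervals is a judgment call: too few hides the shape of the data, too many leaves a ragged plot with many near-empty intervals.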

6 Identifying the Distribution (cont'd)
- SELECTING THE FAMILY OF DISTRIBUTIONS
  - Check whether the histogram drawn from your data resembles a known statistical distribution
  - Use the physical basis of the distribution (e.g. its usage, discrete or continuous) as a guide
  - Use software
  - The exponential, normal, and Poisson distributions are frequently encountered and are not difficult to analyze from a computational standpoint

7 Identifying the Distribution (cont'd)
- QUANTILE-QUANTILE PLOTS
  - Evaluate the fit of the chosen distribution(s)
  - Compare the observed values with the values derived from the chosen distribution
  - The closer the plotted points lie to a straight line, the better the fit

Sample data (20 observations):

 99.79   99.56  100.17  100.33
100.26  100.41   99.98   99.83
100.23  100.27  100.02  100.47
 99.55   99.62   99.65   99.82
 99.96   99.90  100.06   99.85

8 Identifying the Distribution (cont'd)

Observed values (sorted) and the corresponding values estimated from the chosen distribution:

 j   Observed   Estimated
 1    99.55      99.43
 2    99.56      99.58
 3    99.62      99.66
 4    99.65      99.73
 5    99.79      99.78
 6    99.82      99.82
 7    99.83      99.86
 8    99.85      99.90
 9    99.90      99.94
10    99.96      99.97
11    99.98     100.01
12   100.02     100.04
13   100.06     100.08
14   100.17     100.12
15   100.23     100.16
16   100.26     100.20
17   100.27     100.25
18   100.33     100.32
19   100.41     100.40
20   100.47     100.55
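The estimated column can be approximately reproduced with the sketch below. It assumes the chosen distribution is normal with the sample mean and sample standard deviation as parameters, and that the j-th estimated value is the inverse CDF evaluated at (j - 0.5)/n. The slides do not state either choice, so treat both as assumptions; the function name `qq_pairs` is illustrative.

```python
from statistics import NormalDist, mean, stdev

# The 20 observations from the slides.
observed = [
    99.79, 99.56, 100.17, 100.33,
    100.26, 100.41, 99.98, 99.83,
    100.23, 100.27, 100.02, 100.47,
    99.55, 99.62, 99.65, 99.82,
    99.96, 99.90, 100.06, 99.85,
]


def qq_pairs(data):
    """Pair each ordered observation with the matching quantile of a fitted normal."""
    ordered = sorted(data)
    n = len(ordered)
    # Assumption: normal family, parameters estimated from the sample.
    fitted = NormalDist(mean(ordered), stdev(ordered))
    return [
        (y, fitted.inv_cdf((j - 0.5) / n))
        for j, y in enumerate(ordered, start=1)
    ]


pairs = qq_pairs(observed)
for j, (obs, est) in enumerate(pairs, start=1):
    print(f"{j:2d}  {obs:7.2f}  {est:7.2f}")
```

Plotting the observed values against the estimated values gives the Q-Q plot. Small differences from the tabulated values in the second decimal place are expected if the slides rounded the fitted parameters before computing quantiles.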

9 Parameter Estimation
- Sample Mean and Sample Variance
  - Calculate the sample mean ($\bar{X}$) and sample variance ($S^2$) from the collected data
  - Based on the chosen distribution, convert the sample mean and variance into the parameter(s) that the distribution uses

Distribution   Parameter(s)        Suggested estimator(s)
Poisson        $\alpha$            $\hat{\alpha} = \bar{X}$
Exponential    $\lambda$           $\hat{\lambda} = 1/\bar{X}$
Normal         $\mu$, $\sigma^2$   $\hat{\mu} = \bar{X}$, $\hat{\sigma}^2 = S^2$
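A minimal sketch of the conversion from sample statistics to distribution parameters follows. It assumes the standard estimators (Poisson mean = sample mean, exponential rate = 1 / sample mean, normal mean and variance = sample mean and sample variance). The function name, the dictionary keys, and the sample values are illustrative choices, not part of the lecture.

```python
from statistics import mean, variance


def estimate_parameters(data, family):
    """Map the sample mean and sample variance to the parameters of the chosen family."""
    xbar = mean(data)
    s2 = variance(data)  # sample variance, divisor n - 1
    if family == "poisson":
        return {"alpha": xbar}
    if family == "exponential":
        return {"lambda": 1 / xbar}
    if family == "normal":
        return {"mu": xbar, "sigma2": s2}
    raise ValueError(f"unsupported family: {family}")


# Hypothetical sample, used only for illustration.
sample = [2, 4, 6]
print(estimate_parameters(sample, "poisson"))
print(estimate_parameters(sample, "exponential"))
print(estimate_parameters(sample, "normal"))
```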

10 Goodness-of-Fit Tests
- Provide helpful quantitative guidance for evaluating the suitability of a potential input model
- Used when the sample size is large
- Use tables of critical values to decide whether to reject the hypothesized distribution

11 Goodness-of-Fit Tests (cont'd)
- Chi-Square Test
  - This test is applied to test the hypothesis that a random sample of size n of the random variable X follows a specific distributional form
  - The test is valid for large sample sizes, for both discrete and continuous distributional assumptions
  - The test statistic is

    $\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

    where $O_i$ is the observed frequency in the $i$th class interval, $E_i$ is the expected frequency in that class interval, and $k$ is the number of class intervals

12 Goodness-of-Fit Tests (cont'd)
- Example 9.13 (Poisson Assumption)
  - $H_0$: the random variable is Poisson distributed
  - $H_1$: the random variable is not Poisson distributed
  - For $\alpha = 3.64$, the probabilities $P(x)$ associated with the various values of $x$, and the expected frequencies $E_i = n p_i$ with $n = 100$, are:

 x    P(x)    E_i
 0    0.026    2.6
 1    0.096    9.6
 2    0.174   17.4
 3    0.211   21.1
 4    0.192   19.2
 5    0.140   14.0
 6    0.085    8.5
 7    0.044    4.4
 8    0.020    2.0
 9    0.008    0.8
10    0.003    0.3
11    0.001    0.1

  - $H_0$ is rejected at the 0.05 level of significance.

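13 Goodness-of-Fit Tests (cont'd)

Classes with small expected frequencies are combined: x = 0 and 1 form one class (O = 22, E = 12.2), and x = 7 through 11 form one class (O = 17, E = 7.6).

x_i      Observed frequency, O_i   Expected frequency, E_i   (O_i - E_i)^2 / E_i
 0       12                         2.6                      7.87 (x = 0, 1 combined)
 1       10                         9.6
 2       19                        17.4                      0.15
 3       17                        21.1                      0.80
 4       10                        19.2                      4.41
 5        8                        14.0                      2.57
 6        7                         8.5                      0.26
 7        5                         4.4                      11.62 (x = 7 to 11 combined)
 8        5                         2.0
 9        3                         0.8
10        3                         0.3
11        1                         0.1
Total    100                       100.0                     27.68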
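The calculation in this table can be sketched as follows. The pooling rule (combine adjacent classes until each expected frequency is at least 5), the degrees of freedom (7 classes minus 1, minus 1 estimated parameter, giving 5), and the tabulated critical value 11.07 are assumptions based on common practice for the chi-square test; the slides do not state them. All function and variable names are illustrative.

```python
import math

ALPHA = 3.64  # Poisson mean used in the example
observed = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]  # O_i for x = 0..11
n = sum(observed)


def poisson_pmf(x, alpha):
    return math.exp(-alpha) * alpha ** x / math.factorial(x)


expected = [n * poisson_pmf(x, ALPHA) for x in range(len(observed))]


def merge_small_classes(obs, exp, minimum=5.0):
    """Greedily pool adjacent classes until each pooled E_i reaches `minimum`."""
    merged = []
    o_acc = e_acc = 0.0
    for o, e in zip(obs, exp):
        o_acc += o
        e_acc += e
        if e_acc >= minimum:
            merged.append((o_acc, e_acc))
            o_acc = e_acc = 0.0
    if e_acc > 0:
        if merged:
            # A leftover tail that is still too small joins the last class.
            last_o, last_e = merged.pop()
            merged.append((last_o + o_acc, last_e + e_acc))
        else:
            merged.append((o_acc, e_acc))
    return merged


classes = merge_small_classes(observed, expected)
chi2 = sum((o - e) ** 2 / e for o, e in classes)

CRITICAL_005_DF5 = 11.07  # assumed: chi-square table value for 0.05 and 5 degrees of freedom
print(f"classes = {len(classes)}, chi-square = {chi2:.2f}, reject H0: {chi2 > CRITICAL_005_DF5}")
```

The statistic comes out close to the 27.68 shown in the table. It is not identical because the table works with probabilities rounded to three decimals.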

14 Goodness-of-Fit Tests (cont'd)
- Example 9.14 (Exponential Assumption)
  - $H_0$: the random variable is exponentially distributed
  - $H_1$: the random variable is not exponentially distributed
  - Let k = 8 class intervals of equal probability; each interval then has probability p = 1/8 = 0.125
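Equal-probability interval endpoints for an exponential distribution follow from inverting its CDF, $F(x) = 1 - e^{-\lambda x}$, which gives $a_i = -\ln(1 - ip)/\lambda$. The sketch below assumes a rate of 0.084; that value is not stated in the transcript and is inferred here from the interval endpoints listed for this example, so treat it as an assumption.

```python
import math


def equal_probability_endpoints(rate, k):
    """Interior endpoints a_1..a_{k-1} that split an exponential into k equally likely intervals."""
    p = 1.0 / k
    return [-math.log(1.0 - i * p) / rate for i in range(1, k)]


LAMBDA_HAT = 0.084  # assumed rate, inferred from the example's interval endpoints
endpoints = equal_probability_endpoints(LAMBDA_HAT, 8)

# The first interval starts at 0 and the last one is unbounded above.
lower = 0.0
for upper in endpoints:
    print(f"[{lower:.3f}, {upper:.3f})")
    lower = upper
print(f"[{lower:.3f}, inf)")
```

Equal-probability intervals make every expected frequency the same (n times p), which avoids classes with very small expected counts.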

15 Goodness-of-Fit Tests (cont'd)

Class interval       O_i   Probability                                  E_i     (O_i - E_i)^2 / E_i
[0, 1.590)           19    P(X ≤ 1.590) - P(X ≤ 0) = 0.125              6.25    26.01
[1.590, 3.425)       10    P(X ≤ 3.425) - P(X ≤ 1.590) = 0.125          6.25     2.25
[3.425, 5.595)        3    P(X ≤ 5.595) - P(X ≤ 3.425) = 0.125          6.25     1.69
[5.595, 8.252)        6    P(X ≤ 8.252) - P(X ≤ 5.595) = 0.125          6.25     0.01
[8.252, 11.677)       1    P(X ≤ 11.677) - P(X ≤ 8.252) = 0.125         6.25     4.41
[11.677, 16.503)      1    P(X ≤ 16.503) - P(X ≤ 11.677) = 0.125        6.25     4.41
[16.503, 24.755)      4    P(X ≤ 24.755) - P(X ≤ 16.503) = 0.125        6.25     0.81
[24.755, ∞)           6    P(X < ∞) - P(X ≤ 24.755) = 0.125             6.25     0.01
Total                50    1.000                                        50      39.6

$H_0$ is rejected at the 0.05 level of significance.
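The statistic for this example can be checked with a short sketch. Because the intervals are equally likely, every expected frequency is n/k. The degrees of freedom (8 classes minus 1, minus 1 estimated parameter, giving 6) and the tabulated critical value 12.59 are assumptions based on common practice; the slides do not state them.

```python
observed = [19, 10, 3, 6, 1, 1, 4, 6]  # O_i for the eight class intervals
n = sum(observed)
k = len(observed)
expected = n / k  # equal-probability intervals: E_i = n * (1/k)

terms = [(o - expected) ** 2 / expected for o in observed]
chi2 = sum(terms)

CRITICAL_005_DF6 = 12.59  # assumed: chi-square table value for 0.05 and 6 degrees of freedom
print(f"E_i = {expected}, chi-square = {chi2:.2f}, reject H0: {chi2 > CRITICAL_005_DF6}")
```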

16 Selecting Input Models without Data
- Engineering data
  - A product or process has performance ratings provided by the manufacturer (for example, a laser printer can produce 4 pages/minute)
- Expert opinion
  - Talk to people who are experienced with the process or similar processes
- Physical or conventional limitations
  - Most real processes have physical limits on performance (for example, computer data entry cannot be faster than a person can type)
- The nature of the process
  - Use it to select the family of distributions

