1 Sampling and Sampling Distributions Dr. Jerrell T. Stracener, SAE Fellow Leadership in Engineering EMIS 7370/5370 STAT 5340 : PROBABILITY AND STATISTICS.

Presentation transcript:

1 Sampling and Sampling Distributions Dr. Jerrell T. Stracener, SAE Fellow Leadership in Engineering EMIS 7370/5370 STAT 5340 : PROBABILITY AND STATISTICS FOR SCIENTISTS AND ENGINEERS Systems Engineering Program Department of Engineering Management, Information and Systems

2 Population: the total of all possible values (measurements, counts, etc.) of a particular characteristic for a specific group of objects. Sample: a part of a population selected according to some rule or plan. Why sample? - The population does not exist. - Sampling and testing are destructive. Population vs. Sample

3 Characteristics that distinguish one type of sample from another: - the manner in which the sample was obtained - the purpose for which the sample was obtained. Sampling

4 Simple Random Sample: The sample $X_1, X_2, \ldots, X_n$ is a random sample if $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables. Remark: Each value in the population has an equal and independent chance of being included in the sample. Stratified Random Sample: The population is first subdivided into sub-populations, or strata, and a simple random sample is drawn from each stratum. Types of Samples

5 Censored Samples. Type I Censoring: Sampling is terminated at a fixed time, $t_0$. The sample consists of k times to failure plus the information that n - k items survived the fixed time of truncation. Type II Censoring: Sampling is terminated upon the kth failure. The sample consists of k times to failure plus the information that n - k items survived the random time of truncation, $t_k$. Progressive Censoring: Sampling is reduced in stages. Types of Samples (continued)
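
The two fixed censoring schemes on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the failure times are hypothetical, and the sketch assumes every item's time to failure is known in advance, which a real test would not.

```python
def type_i_censor(failure_times, t0):
    """Type I: stop at fixed time t0; report failures seen and survivor count."""
    observed = sorted(t for t in failure_times if t <= t0)
    return observed, len(failure_times) - len(observed)


def type_ii_censor(failure_times, k):
    """Type II: stop at the kth failure; the truncation time t_k is random."""
    ordered = sorted(failure_times)
    observed = ordered[:k]
    return observed, len(ordered) - k, observed[-1]


# Hypothetical times to failure (hours) for n = 5 items on test.
times = [5.0, 1.0, 9.0, 3.0, 7.0]
print(type_i_censor(times, t0=6.0))   # ([1.0, 3.0, 5.0], 2)
print(type_ii_censor(times, k=2))     # ([1.0, 3.0], 3, 3.0)
```

In both cases the sample is the list of observed failure times plus the count of survivors; only the stopping rule differs.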

6 Systematic Random Sample: The N items in the population are arranged in some order. Select an item at random from the first K = N/n items, where n is the sample size. Select every Kth item thereafter. Types of Samples (continued)
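
A minimal Python sketch of the systematic sampling rule, under the assumption that K = N/n is rounded down to an integer when N is not a multiple of n. The function name and the population of 100 numbered items are illustrative only.

```python
import random


def systematic_sample(items, n, rng=None):
    """Return a systematic random sample of n items from an ordered list."""
    rng = rng or random.Random()
    k = len(items) // n              # sampling interval K = N/n (integer part)
    if k < 1:
        raise ValueError("sample size n must not exceed population size N")
    start = rng.randrange(k)         # random start among the first K items
    return items[start::k][:n]       # every Kth item thereafter, capped at n


population = list(range(1, 101))     # N = 100 ordered items (hypothetical)
sample = systematic_sample(population, 10, random.Random(1))
print(sample)
```

Only the starting position is random; once it is chosen, the rest of the sample is determined by the ordering of the population.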

7 Sampling - Monte Carlo Simulation

8 For any continuous random variable Y with probability density function f(y), the variable $F(y) = \int_{-\infty}^{y} f(x)\,dx$ is uniformly distributed over (0, 1), or F(y) has the probability density function $g(u) = 1$ for $0 < u < 1$ and $g(u) = 0$ elsewhere. Uniform Probability Integral Transformation

9 Remark: the cumulative probability distribution function for any continuous random variable is uniformly distributed over the interval (0, 1). Uniform Probability Integral Transformation

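10 [Figure: plots of f(y) and F(y) against y, with a uniform value $r_i$ on the F(y) axis mapped to the corresponding value $y_i$] Generating Random Numbers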

11 Generating values of a random variable using the probability integral transformation. To generate a random value y from a given probability density function f(y): 1. Generate a random value $r_U$ from a uniform distribution over (0, 1). 2. Set $r_U = F(y)$. 3. Solve the resulting expression for y. Generating Random Numbers
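
The three steps can be sketched in Python as follows. The density used here, f(y) = 2y on (0, 1), is an assumed example that does not come from the slides; it was chosen because F(y) = y^2 inverts by hand to y = sqrt(r). The helper name `inverse_transform_sample` is likewise hypothetical.

```python
import math
import random


def inverse_transform_sample(inverse_cdf, n, rng=None):
    """Draw n values by solving r = F(y) for y, with r ~ U(0, 1)."""
    rng = rng or random.Random()
    return [inverse_cdf(rng.random()) for _ in range(n)]


# Assumed example: f(y) = 2y on (0, 1), so F(y) = y**2 and y = sqrt(r).
values = inverse_transform_sample(math.sqrt, 10_000, random.Random(42))
mean = sum(values) / len(values)
print(round(mean, 3))   # should be close to the theoretical mean 2/3
```

When F(y) cannot be inverted in closed form, step 3 has to be done numerically, for example by bisection on F(y) - r.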

12 From the Tools menu, look for Data Analysis. Generating Random Numbers with Excel

13 If it is not there, you must install it. Generating Random Numbers with Excel

14 Once you select Data Analysis, the following window will appear. Scroll down to “Random Number Generation” and select it, then press “OK” Generating Random Numbers with Excel

15 Choose which distribution you would like. Use uniform for an exponential or Weibull distribution, or normal for a normal or lognormal distribution. Generating Random Numbers with Excel

16 Uniform Distribution, U(0, 1). Select “Uniform” under the “Distribution” menu. Type in “1” for number of variables and 10 for number of random numbers. Then press OK. 10 random numbers of uniform distribution will now appear on a new chart. Generating Random Numbers with Excel

17 Normal Distribution, N(μ, σ). Select “Normal” under the “Distribution” menu. Type in “1” for number of variables and 10 for number of random numbers. Enter the values for the mean (μ) and standard deviation (σ), then press OK. 10 random numbers from the normal distribution will now appear on a new chart. Generating Random Numbers with Excel

18 First generate n random variables, $r_1, r_2, \ldots, r_n$, from U(0, 1). Select “Uniform” under the “Distribution” menu. Type in “1” for number of variables and 10 for number of random numbers. Then press OK. 10 random numbers of uniform distribution will now appear on a new chart. Generating Random Values from an Exponential Distribution E(θ) with Excel

19 Select a θ that you would like to use; we will use θ = 5. Type in the equation $x_i = -\theta \ln(1 - r_i)$, filling in θ as 5 and $r_i$ as cell A1 ( =-5*LN(1-A1) ). Now, with that cell selected, place the cursor over the bottom right-hand corner of the cell. A cross will appear; drag this cross down to B10. This will transfer that equation to the cells below. Now we have n random values from the exponential distribution with parameter θ = 5 in cells B1 - B10. Generating Random Values from an Exponential Distribution E(θ) with Excel
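
For readers without Excel, the same calculation can be sketched in Python. The function name is hypothetical; the formula is the one in the spreadsheet cell, =-5*LN(1-A1), with θ treated as the mean of the distribution.

```python
import math
import random


def exponential_values(theta, n, rng=None):
    """Apply x = -theta * ln(1 - r) to n uniform values r from U(0, 1)."""
    rng = rng or random.Random()
    return [-theta * math.log(1.0 - rng.random()) for _ in range(n)]


# Ten values with theta = 5, mirroring cells B1 - B10 of the worksheet.
print(exponential_values(5.0, 10, random.Random(0)))
```

Because `random()` returns values in [0, 1), the argument of the logarithm stays in (0, 1] and the formula never takes the log of zero.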

20 First generate n random variables, $r_1, r_2, \ldots, r_n$, from U(0, 1). Select “Uniform” under the “Distribution” menu. Type in “1” for number of variables and 10 for number of random numbers. Then press OK. 10 random numbers of uniform distribution will now appear on a new chart. Generating Random Values from a Weibull Distribution W(β, θ) with Excel

21 Select a β and θ that you would like to use; we will use β = 20, θ = 100. Type in the equation $x_i = \theta\,[-\ln(1 - r_i)]^{1/\beta}$, filling in β as 20, θ as 100, and $r_i$ as cell A1 ( =100*(-LN(1-A1))^(1/20) ). Now transfer that equation to the cells below. Now we have n random values from the Weibull distribution with parameters β = 20 and θ = 100 in cells B1 - B10. Generating Random Values from a Weibull Distribution W(β, θ) with Excel
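
A Python sketch of the same Weibull transformation, assuming β is the shape parameter and θ the scale parameter, as the spreadsheet formula =100*(-LN(1-A1))^(1/20) implies. The function name is hypothetical.

```python
import math
import random


def weibull_values(beta, theta, n, rng=None):
    """Apply x = theta * (-ln(1 - r)) ** (1 / beta) to n uniform values r."""
    rng = rng or random.Random()
    return [theta * (-math.log(1.0 - rng.random())) ** (1.0 / beta)
            for _ in range(n)]


# Ten values with beta = 20 and theta = 100, mirroring cells B1 - B10.
print(weibull_values(20.0, 100.0, 10, random.Random(0)))
```

With a shape parameter as large as 20 the values cluster tightly just below θ, which is a quick sanity check on the output.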

22 First generate n random variables, $r_1, r_2, \ldots, r_n$, from N(0, 1). Select “Normal” under the “Distribution” menu. Type in “1” for number of variables and 10 for number of random numbers. Enter 0 for the mean and 1 for the standard deviation, then press OK. 10 random numbers from the standard normal distribution will now appear on a new chart. Generating Random Values from a Lognormal Distribution LN(μ, σ) with Excel

23 Select a μ and σ that you would like to use; we will use μ = 2, σ = 1. Type in the equation $x_i = \exp(\mu + \sigma r_i)$, filling in μ as 2, σ as 1, and $r_i$ as cell A1 ( =EXP(2+A1*1) ). Now transfer that equation to the cells below. Now we have n random values from the lognormal distribution in cells B1 - B10. Generating Random Values from a Lognormal Distribution LN(μ, σ) with Excel
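
The lognormal step can be sketched in Python in the same way. The function name is hypothetical, and the standard normal draws come from the standard library's `gauss` rather than from Excel's generator.

```python
import math
import random


def lognormal_values(mu, sigma, n, rng=None):
    """Apply x = exp(mu + sigma * z) to n standard normal values z."""
    rng = rng or random.Random()
    return [math.exp(mu + sigma * rng.gauss(0.0, 1.0)) for _ in range(n)]


# Ten values with mu = 2 and sigma = 1, mirroring =EXP(2+A1*1).
print(lognormal_values(2.0, 1.0, 10, random.Random(0)))
```

Here μ and σ are the mean and standard deviation of ln(x), not of x itself, so a useful check is that the logarithms of the generated values average close to μ.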

24 Flow Chart of Monte Carlo Simulation Method. Input 1: Statistical distribution for each component variable. Input 2: Relationship between component variables and system performance. Step 1: Select a random value from each of these distributions. Step 2: Calculate the value of system performance for a system composed of components with the values obtained in the previous step. Repeat n times. Output: Summarize and plot resulting values of system performance. This provides an approximation of the distribution of system performance.
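
The flow chart maps onto a short loop. The Python sketch below is one possible reading of it, with a made-up system: two components whose values are normally distributed (means 100 and 50, standard deviations 2 and 1) and a system performance equal to their sum. None of these numbers or names come from the slides.

```python
import random
import statistics


def monte_carlo(samplers, performance, n, rng):
    """Repeat n times: draw one value per component, compute performance."""
    results = []
    for _ in range(n):
        values = [draw(rng) for draw in samplers]   # Input 1: distributions
        results.append(performance(*values))        # Input 2: relationship
    return results


# Hypothetical inputs: two normal components, performance = their sum.
samplers = [lambda rng: rng.gauss(100.0, 2.0),
            lambda rng: rng.gauss(50.0, 1.0)]
results = monte_carlo(samplers, lambda a, b: a + b, 20_000, random.Random(0))

# Output: summarize the simulated distribution of system performance.
print(statistics.mean(results), statistics.stdev(results))
```

For this particular system the exact answer is known (mean 150, standard deviation sqrt(5)), which makes it a convenient check on the loop before applying it to a relationship that cannot be solved analytically.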

25 Because Monte Carlo simulation involves randomly selected values, the results are subject to statistical fluctuations. Any estimate will not be exact but will have an associated error band. The larger the number of trials in the simulation, the more precise the final results. We can obtain as small an error as is desired by conducting sufficient trials. In practice, the allowable error is generally specified, and this information is used to determine the required number of trials. Sample Size and Error Bands
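
To illustrate how the error band narrows with more trials, the sketch below estimates a proportion by simulation and attaches the usual normal-approximation half-width, z * sqrt(p_hat * (1 - p_hat) / n). The true proportion of 0.2, the z value of 1.96, and the function name are assumptions made for this illustration, not values taken from the slides.

```python
import math
import random


def proportion_with_error_band(p_true, n, rng, z=1.96):
    """Estimate a proportion from n trials; return (estimate, half-width)."""
    hits = sum(rng.random() < p_true for _ in range(n))
    p_hat = hits / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, half_width


rng = random.Random(11)
bands = [proportion_with_error_band(0.2, n, rng)
         for n in (100, 1_000, 10_000, 100_000)]
for (p_hat, half_width), n in zip(bands, (100, 1_000, 10_000, 100_000)):
    print(n, round(p_hat, 4), round(half_width, 4))
```

Each tenfold increase in trials shrinks the half-width by roughly a factor of sqrt(10), which is why a specified allowable error translates directly into a required number of trials.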

26 Example If X ~ B(n, p) and the desired confidence level is 95%, then 1 - α = 0.95 and α = 0.05 and $Z_{1-\alpha/2}$ = 1.96; and if = 0.2. Then an estimate of the required sample size is

27 - There is frequently no way of determining whether any of the variables are dominant or more important than others without making repeated simulations. - If a change is made in one variable, the entire simulation must be redone. - The method may require developing a complex computer program. - If a large number of trials are required, a great deal of computer time may be needed to obtain the necessary results. Drawbacks of the Monte Carlo Simulation

28 If the probability density function of X is Find (a) F(x), (b) the mean, (c) the standard deviation, (d) the value of x for which P(X > x) = 0.05. (e) If 5 values of x are randomly selected, find the probability that at least 2 of them will exceed 0.6. (f) Redo parts (a) through (e) using Monte Carlo simulation. Example

29 First, plot : Example - Solution

30 (a) The (cumulative) probability distribution function of X for is Example - Solution

31 so that Example - Solution

32 (b) The mean of X is Example - Solution

33 The variance of X is Example - Solution

34 The standard deviation is Example - Solution

35 (d) The value of x such that P(X > x) = 0.05 can be determined by a couple of different approaches. x can be obtained by solving the following equation for x, or by solving F(x) = 0.95 for x, Example - Solution

36 Here its roots are is outside of our range, so is our answer. If we check with our plot of the data, this seems reasonable Example - Solution 0.05

37 (e) Let Y = number of values that exceed 0.6, for y = 0,1,2,3,4,5. Now Example - Solution

38 so that Example - Solution

39 (f) Generate a random sample of n, say 1,000, from using Monte Carlo Simulation as follows: Since generate and solve for x i Example - Solution

40 Then estimate F(x), μ, σ and as follows: f(x) Example - Solution

41 Then estimate F(x), μ, σ and as follows: F(x) Example - Solution

42 Compare this to  = Example - Solution

43 where Example - Solution

44 Compare this to  = Example - Solution

45 Compare this to the Example - Solution

46 Compare this to the Example - Solution

47 Remember that we have used 1,000 data points. To access our data, just double-click on the Excel chart to the left. Example - Solution - Our Data

48 Sampling Distributions

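49 If $X_1, X_2, \ldots, X_n$ is a random sample of size n from a normal distribution with mean μ and known standard deviation σ, and if $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, then $\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ and $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$. Sampling Distribution of $\bar{X}$ with known σ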

50 The dollar amount per transaction, X, in the Sporting Goods Department of a store has a normal distribution with mean $75 and standard deviation of $20. What is the probability that a random sample of 9 sales transactions will have an average over $85? Sampling Distributions: Example

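51 If X ~ N(75, 20), then $\bar{X} \sim N\left(75, \frac{20}{\sqrt{9}}\right)$, so $P(\bar{X} > 85) = P\left(Z > \frac{85 - 75}{20/3}\right) = P(Z > 1.5) = 0.0668$. Sampling Distributions: Example - Solution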
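
The probability asked for in the sporting goods example can be checked numerically. This is a small sketch using Python's standard library; the variable names are illustrative, and N(75, 20) is read as mean 75 and standard deviation 20, as the example states.

```python
import math
from statistics import NormalDist

mu, sigma, n = 75.0, 20.0, 9

# The sample mean of n normal observations is normal with sd sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))
p = 1.0 - xbar_dist.cdf(85.0)
print(round(p, 4))   # 0.0668
```

The standardized value is (85 - 75) / (20 / 3) = 1.5, so this is the upper-tail area of the standard normal beyond 1.5.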

52 If $\bar{X}$ is the mean of a random sample of size n, $X_1, X_2, \ldots, X_n$, from a population with mean μ and finite standard deviation σ, then as $n \to \infty$ the limiting distribution of $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is the standard normal distribution. Central Limit Theorem

53 Remark: The Central Limit Theorem provides the basis for approximating the distribution of $\bar{X}$ with a normal distribution with mean μ and standard deviation $\sigma/\sqrt{n}$. The approximation gets better as n gets larger. Central Limit Theorem
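
A quick simulation can make the remark concrete. The sketch below draws repeated samples of size 30 from an exponential population with mean 5 (a deliberately non-normal choice made for this illustration, not one from the slides) and compares the mean and standard deviation of the sample means with μ and σ/sqrt(n).

```python
import math
import random
import statistics


def sample_means(draw, n, reps, rng):
    """Return reps sample means, each computed from n independent draws."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


rng = random.Random(7)
# Exponential population with mean 5 (so sigma is also 5); n = 30 per sample.
means = sample_means(lambda r: r.expovariate(1.0 / 5.0), 30, 5000, rng)

print(statistics.fmean(means))                      # close to mu = 5
print(statistics.stdev(means), 5 / math.sqrt(30))   # close to sigma / sqrt(n)
```

A histogram of `means` would look approximately bell-shaped even though the underlying population is strongly skewed.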

54 A manufacturing process produces parts with a mean diameter of 5 mm. An engineer conjectures that the population mean is 5.0 mm, and an experiment is conducted in which 100 parts are selected randomly and measured. It is known that the population standard deviation is σ = 0.1. The experiment indicates a sample average diameter $\bar{x} = 5.027$ mm. Does this refute the engineer’s conjecture? Solution: Whether the data support or refute the conjecture depends on the probability that data similar to those obtained in this experiment can readily occur when μ = 5.0. In other words, how likely is it that one can obtain $\bar{x} \geq 5.027$ with n = 100 if the population mean is μ = 5.0? Central Limit Theorem - Example

55 The probability that we choose to compute is given by $P[(\bar{X} - 5) \geq 0.027]$. This is the same as asking, if the mean is 5, what is the chance that $\bar{X}$ will deviate by as much as 0.027? Solution

56 $P[(\bar{X} - 5) \geq 0.027] = P\left[Z \geq \frac{0.027}{0.1/\sqrt{100}}\right] = P[Z \geq 2.7] = 0.0035$. Here we are simply standardizing the sample mean according to the Central Limit Theorem. Thus one would experience by chance a sample mean that is 0.027 mm from the population mean in only about 3.5 of 1000 experiments. Therefore the sample data do not support the engineer’s conjecture. Solution (Continued)
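
The "3.5 of 1000" figure can be reproduced with a short calculation. The sketch assumes the one-sided probability that the sample mean exceeds the conjectured mean by at least 0.027 mm, with σ = 0.1 and n = 100 as given in the example; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0, sigma, n = 5.0, 0.1, 100
deviation = 0.027

# Standardize the deviation of the sample mean: z = 0.027 / (0.1 / 10) = 2.7.
z = deviation / (sigma / math.sqrt(n))
p_one_sided = 1.0 - NormalDist().cdf(z)
print(round(p_one_sided, 4))   # 0.0035
```

If deviations in either direction were counted, the probability would double to about 0.007, which leads to the same conclusion.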

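57 Let $X_1, X_2, \ldots, X_n$ be independent random variables that have a normal distribution with mean μ and unknown standard deviation σ. Let $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$. Then the random variable $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t-distribution with ν = n - 1 degrees of freedom. Sampling Distribution of $\bar{X}$ with Unknown σ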

58 If $S^2$ is the variance of a random sample of size n taken from a normal population having variance $\sigma^2$, then the statistic $\chi^2 = \frac{(n-1)S^2}{\sigma^2}$ has a chi-squared distribution with ν = n - 1 degrees of freedom. Sampling Distribution of $S^2$

59 A manufacturer of car batteries guarantees that his product will last, on average, 3 years with a standard deviation of 1 year. If five batteries have lifetimes of 1.9, 2.4, 3.0, 3.5 and 4.2 years, is the manufacturer still convinced that his batteries have a standard deviation of 1 year? Assume that battery lifetime follows a normal distribution. Solution: We first find the sample variance: $\bar{x} = 3.0$ and $s^2 = \frac{1}{4}\sum_{i=1}^{5}(x_i - 3.0)^2 = \frac{3.26}{4} = 0.815$. Example

60 Then $\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(4)(0.815)}{1} = 3.26$ is a value from a chi-squared distribution with 4 degrees of freedom. Since 95% of the $\chi^2$ values with 4 degrees of freedom fall between 0.484 and 11.143, the computed value with $\sigma^2 = 1$ is reasonable, and therefore the manufacturer has no reason to suspect that the standard deviation is other than 1 year. Solution (Continued)
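
The battery example can be checked with the standard library alone. The sketch relies on the fact that, for exactly 4 degrees of freedom, the chi-squared cumulative distribution function has the closed form 1 - exp(-x/2)(1 + x/2); the helper names and the bisection search for the 2.5% and 97.5% points are choices made for this illustration, not part of the slides.

```python
import math
import statistics

lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]
sigma0 = 1.0                                   # claimed standard deviation

s2 = statistics.variance(lifetimes)            # sample variance, n - 1 divisor
chi2 = (len(lifetimes) - 1) * s2 / sigma0 ** 2


def chi2_cdf_4df(x):
    """CDF of the chi-squared distribution with exactly 4 degrees of freedom."""
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)


def chi2_quantile_4df(p, lo=0.0, hi=100.0):
    """Invert the 4-df CDF by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_4df(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


lower, upper = chi2_quantile_4df(0.025), chi2_quantile_4df(0.975)
print(round(s2, 3), round(chi2, 2), round(lower, 3), round(upper, 3))
print(lower < chi2 < upper)    # True: the value is inside the central 95%
```

For other degrees of freedom the closed form no longer applies, and a chi-squared table or a statistics library would be needed instead.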