Last Day: completed Section 5.1. Today: parts of Sections 5.3 and 5.4.
Sampling: In Chapter 1, we concerned ourselves with numerical and graphical summaries of samples ($x_1, x_2, \ldots, x_n$) from some population. We can view each of the $X_i$'s as a random variable. We will be concerned with random samples: the $X_i$'s are independent, and the $X_i$'s have the same probability distribution. Such a sample is often called an iid (independent and identically distributed) sample.
Definitions: A parameter is a numerical feature of a distribution or population. A statistic is a function of the sample data (e.g., sample mean, sample median, …). We will be using statistics to estimate parameters (point estimates).
Suppose you draw a sample and compute the value of a statistic. Suppose you then draw another sample of the same size and compute the value of the same statistic. Would the two values be equal?
We use statistics to estimate parameters. Will a statistic be exactly equal to the parameter? The observed value of a statistic depends on the sample, so there will be variability in the values of the statistic over repeated sampling. The statistic has a distribution of its own.
The probability distribution of a statistic is called the sampling distribution (or distribution of the statistic). It is the distribution of the values of the statistic over all possible samples of the same size from the population, that is, based on repeated random samples of the same size from the population.
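The idea of repeated sampling can be illustrated with a small simulation. This is only a sketch: the population (10,000 draws from an exponential distribution), the sample size of 25, and the helper name `sample_mean` are all assumptions made for illustration, not part of the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 values from a skewed (exponential) distribution.
population = [random.expovariate(1.0) for _ in range(10_000)]

def sample_mean(n):
    """Draw a random sample of size n (without replacement) and return its mean."""
    return statistics.fmean(random.sample(population, n))

# Two samples of the same size generally give different values of the statistic.
first, second = sample_mean(25), sample_mean(25)
print(first, second)

# Repeating the sampling many times approximates the sampling distribution.
means = [sample_mean(25) for _ in range(2_000)]
print(statistics.fmean(means), statistics.stdev(means))
```

A histogram of `means` would give a picture of the sampling distribution of the sample mean for this hypothetical population.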
Example: A large population is described by the probability distribution. If a random sample of size 2 is taken, what is the sampling distribution of the sample mean?
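One way to work such an example is to list every possible sample of size 2, multiply the probabilities (by independence), and collect the resulting sample means. The sketch below does this for a made-up distribution; the values and probabilities in `pmf` are hypothetical and are NOT the distribution from the slide, and the function name is likewise an assumption.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical population distribution (NOT the one from the slide): value -> probability.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

def sampling_distribution_of_mean(pmf, n=2):
    """Exact distribution of the sample mean for an iid sample of size n."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for values in product(pmf, repeat=n):
        prob = Fraction(1)
        for v in values:
            prob *= pmf[v]  # independence: probabilities multiply
        dist[Fraction(sum(values), n)] += prob
    return dict(sorted(dist.items()))

dist = sampling_distribution_of_mean(pmf, n=2)
for mean_value, prob in dist.items():
    print(mean_value, prob)
```

Exact fractions are used so that the probabilities can be checked to sum to 1 without rounding error.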
Sampling Distribution of the Sample Mean: We have a random sample of size $n$. The sample mean is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. What is it estimating?
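Properties of the Sample Mean (for a random sample from a population with mean $\mu$ and standard deviation $\sigma$): Expected value: $E(\bar{X}) = \mu$. Variance: $\mathrm{Var}(\bar{X}) = \sigma^2/n$. Standard deviation: $\mathrm{SD}(\bar{X}) = \sigma/\sqrt{n}$.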
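The standard results $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \sigma^2/n$ can be checked numerically. The sketch below assumes, purely for illustration, that the population is the roll of a fair die (mean 3.5, variance 35/12) and that the sample size is 10; neither choice comes from the lecture.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

n = 10            # sample size (arbitrary choice)
reps = 20_000     # number of repeated samples
mu = 3.5          # mean of a fair die roll
sigma2 = 35 / 12  # variance of a fair die roll

# Sample mean of n die rolls, repeated many times.
means = [sum(random.randint(1, 6) for _ in range(n)) / n for _ in range(reps)]

empirical_mean = statistics.fmean(means)
empirical_var = statistics.pvariance(means)

print("mean of sample means:", empirical_mean, "theory:", mu)
print("variance of sample means:", empirical_var, "theory:", sigma2 / n)
```

The simulated values should land close to the theoretical ones, with the agreement improving as `reps` grows.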
Sampling from a Normal Distribution: Suppose we have a random sample of size $n$ from a $N(\mu, \sigma^2)$ distribution. What is the distribution of the sample mean?
Example: The moisture content per pound of a dehydrated protein concentrate is normally distributed with mean 3.5 and standard deviation 0.6. A random sample of 36 specimens of this concentrate is taken. What is the distribution of the sample mean? What is the probability that the sample mean is less than 3.5?
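A short calculation for this example, using the fact that the mean of a random sample of size $n$ from a $N(\mu, \sigma^2)$ population is itself normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.5, 0.6, 36  # values given in the example

se = sigma / sqrt(n)       # standard deviation of the sample mean
xbar = NormalDist(mu, se)  # distribution of the sample mean

p = xbar.cdf(3.5)          # P(sample mean < 3.5)
print("SD of sample mean:", se)
print("P(sample mean < 3.5):", p)
```

Because 3.5 is the mean of a symmetric distribution, the probability is one half; no table lookup is needed for this particular cutoff.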
Central Limit Theorem: In a random sample (iid sample) from any population with mean $\mu$ and standard deviation $\sigma$, when $n$ is large, the distribution of the sample mean is approximately normal. That is, $\bar{X}$ is approximately $N(\mu, \sigma^2/n)$.
Implications: For random samples, if we have enough data, the sample mean is approximately normally distributed, even if the data are not normally distributed. If we have enough data, we can use the normal distribution to make probability statements about $\bar{X}$.
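A simulation can suggest how well the normal approximation works for non-normal data. The sketch below assumes an Exponential(1) population (mean 1, standard deviation 1) and a sample size of 50; both are illustrative choices, not values from the lecture. It compares the simulated probability that the sample mean falls below a cutoff with the probability given by the normal approximation.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(2)  # fixed seed so the sketch is reproducible

n = 50            # sample size (assumed "large enough" for this illustration)
reps = 10_000     # number of repeated samples
mu = sigma = 1.0  # mean and standard deviation of an Exponential(1) population

# Sample means from a skewed, clearly non-normal population.
means = [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

se = sigma / sqrt(n)  # standard deviation of the sample mean
comparison = {}
for z in (-1.0, 0.0, 1.0):
    cutoff = mu + z * se
    simulated = sum(m < cutoff for m in means) / reps  # simulated P(mean < cutoff)
    normal = NormalDist().cdf(z)                       # normal approximation
    comparison[z] = (simulated, normal)
    print(z, simulated, normal)
```

The two columns should be close but not identical, since the approximation is only approximate at any finite sample size.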
Example: A busy intersection has an average of 2.2 accidents per week with a standard deviation of 1.4 accidents. Suppose you monitor this intersection over a given year, recording the number of accidents per week. The data take integer values (0, 1, 2, ...), so the distribution of the number of accidents is not normal. What is the distribution of the mean number of accidents per week based on a sample of 52 weeks of data?
Example: What is the approximate probability that $\bar{X}$ is less than 2? What is the approximate probability that there are fewer than 100 accidents in a given year?
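A sketch of the calculation for the intersection example (2.2 accidents per week on average, standard deviation 1.4, 52 weeks), assuming the 52 weekly counts can be treated as a random sample so that the Central Limit Theorem applies. Fewer than 100 accidents in the year is the same event as a weekly mean below 100/52. No continuity correction is applied here, which is a simplification.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52  # values given in the example

# Approximate distribution of the mean number of accidents per week.
xbar = NormalDist(mu, sigma / sqrt(n))

p_mean_below_2 = xbar.cdf(2)           # P(sample mean < 2)
p_total_below_100 = xbar.cdf(100 / n)  # P(total < 100) = P(sample mean < 100/52)

print("P(mean < 2):", round(p_mean_below_2, 4))
print("P(total < 100):", round(p_total_below_100, 4))
```

Both answers are approximations, since the weekly counts themselves are not normally distributed.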