Chapter 9 Sampling Distributions AP Statistics St. Francis High School Fr. Chris, 2001
Two Key Ideas • A statistic is a random variable • As such, its mean and standard deviation can be found by combining the basic random variables that make up the statistic
Pick Pennies from a Hat • Recall how we did this • Try it again: – Pick at random – Note the year – Compute the mean and standard deviation of your sample – NEW: Compute what you think the mean and standard deviation of the entire hat are!
Formulas
Statistic vs. Parameter • A statistic describes a sample and is used to estimate a parameter • A parameter describes a population
Which is a statistic, which is a parameter? 42% of today’s 15-year-old girls will get pregnant in their teens. 37% said they would vote for Joan Smith; on election day 41% actually did. The NIH reports that the mean systolic blood pressure for males [...] years of age is 128 and the standard deviation is 15. [...] male stockbrokers in this age group have a mean blood pressure of [...]. Answers: 42%: parameter; 37%: statistic; 41%: parameter; 128 and 15: parameters; the stockbrokers’ mean: statistic
Bias vs. Variability Bias: Is the sampling distribution of your statistic centered on the population’s parameter? Variability: Is the sampling distribution scattered or focused?
Identify the bias and variability of each:
What about your sample? Is it variable? Is it biased? How can you tell?
Confidence Intervals By hand or by computer simulation: use your sample statistics and what you know of the Central Limit Theorem to make an assertion about the population parameter
What about a proportion? The Gallup poll asked a probability sample of 1785 adults whether they attended church or synagogue during the past week. Suppose 40% of all adults did attend. How likely is it that an SRS of 1785 would give a p-hat within 3 percentage points of this actual value?
Two rules of thumb: (1) The population must be at least 10 times as large as your sample to use the formula √(p(1−p)/n) for the standard deviation of p-hat. (2) np > 10 and n(1−p) > 10 in order to use the normal curve to approximate the sampling distribution of p-hat.
Compute the standard deviation Since the population is more than 10 times 1785, σ(p-hat) = √(p(1−p)/n) = √((.4)(.6)/1785) = 0.0116
The Probability that p-hat is between 37%-43% Since (.4)(1785) >10, and (.6)(1785)>10 then we can convert to z-scores and use the normal curve.
Using the Normal Distribution… P(−2.586 < Z < 2.586) = P(Z < 2.586) − P(Z < −2.586) = normalcdf(−2.586, 2.586) = normalcdf(.37, .43, .4, .0116) = .9903!
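The Gallup calculation can be checked off the calculator with a short Python sketch. The helper name normal_cdf is made up for this illustration (it stands in for the calculator's normalcdf) and is built on the standard library's math.erf. It is one way to check the arithmetic, not the only one.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

p, n = 0.40, 1785                      # values from the Gallup example
sd_phat = math.sqrt(p * (1 - p) / n)   # standard deviation of p-hat
z = 0.03 / sd_phat                     # z-score for 3 percentage points

# Probability that p-hat lands between 37% and 43%
prob = normal_cdf(0.43, p, sd_phat) - normal_cdf(0.37, p, sd_phat)

print(round(sd_phat, 4), round(z, 3), round(prob, 4))
```

Because the code does not round the standard deviation to 0.0116 first, its z-score can differ from the slide's in the last digit.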
Okay, what if you flip a coin 20 times and it’s heads 14 times? Is it a fair coin? How can you justify your answer? Did you mention sample variability? Bias? Do the rules of thumb apply to find a sigma? To use the normal distribution? If you suspect that 70% is this coin’s true proportion, how many times should we flip it so we can use the normal curve?
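One way to explore the coin questions is sketched below. The function names min_flips and upper_tail are made up for this illustration. The sketch applies the rule of thumb np > 10 and n(1−p) > 10 as stated in these slides, and it uses an exact binomial count for the 20 flips rather than the normal curve.

```python
import math

def min_flips(p, threshold=10):
    """Smallest n with n*p > threshold and n*(1-p) > threshold."""
    n = 1
    while not (n * p > threshold and n * (1 - p) > threshold):
        n += 1
    return n

def upper_tail(n, k, p):
    """Exact binomial P(X >= k) for n trials with success probability p."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

flips_needed = min_flips(0.7)          # flips needed for a suspected 70% coin
tail_fair = upper_tail(20, 14, 0.5)    # chance of 14 or more heads if the coin is fair

print(flips_needed, round(tail_fair, 4))
```

Whether the resulting tail probability is small enough to call the coin unfair depends on the rejection level you choose.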
Dishonest Cola? DC Cola is suspected of underfilling its cans of cola. They say the cans hold a mean of 12 ounces, with a standard deviation of 0.4 oz. If this is true, how likely is it to get an average of 11.9 oz. or less in a random sample of 50 cans?
Work it out... Z score? z = (11.9 − 12)/(.4/√50) = −1.77. Look up in Table A, or normalcdf(−1E99, −1.77) = .0384. Or skip the rounding: normalcdf(−1E99, 11.9, 12, .4/√50) ≈ .0385
This leads to inference... If the company’s claim is true (the mean really is 12 oz.), there is still about a 4% chance that sample variation alone would lead you to a sample mean of 11.9 oz. or less. At what point do you reject the company’s claim? At 5%? 1%? 0.1%?
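The cola calculation can be sketched in Python as a check on the calculator work. The helper normal_cdf is a made-up stand-in for normalcdf, built on math.erf. The mean, standard deviation and sample size come from the problem statement.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

mu, sigma, n = 12.0, 0.4, 50           # company's claim and sample size
sd_xbar = sigma / math.sqrt(n)         # standard deviation of x-bar
z = (11.9 - mu) / sd_xbar              # z-score of the observed sample mean
p_low = normal_cdf(11.9, mu, sd_xbar)  # P(x-bar <= 11.9) if the claim is true

print(round(z, 2), round(p_low, 4))
```

The unrounded z-score gives a probability a hair above the Table A value, which is read at z rounded to two decimals.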
Inferential Statistics • We choose a level of rejection (alpha) • We assume that our results are no different from the claim, and that any variation is due to chance (the null hypothesis) • If results like ours would be unlikely under that assumption (probability less than our chosen alpha), we reject the null hypothesis • Then we claim our results are SIGNIFICANTLY different
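The decision rule can be sketched in a few lines of Python. The function name decide and its wording are made up for this illustration. The probability .0384 is the cola result from the earlier slide.

```python
def decide(p_value, alpha):
    """Compare the probability of our result (under the null hypothesis) to alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

p_value = 0.0384   # cola example: chance of a sample mean of 11.9 oz. or less if the mean is 12 oz.

for alpha in (0.05, 0.01, 0.001):
    print(alpha, decide(p_value, alpha))
```

The same data lead to different conclusions at different alphas, which is why alpha is chosen before looking at the results.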
Central Limit Theorem Draw an SRS of size n from any population whatsoever with mean µ and a finite standard deviation σ. When n is large, the sampling distribution of the sample mean x-bar is close to the normal distribution N(µ, σ/√n) (page 488).
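A small simulation can make the theorem concrete. This is only a sketch. The skewed population (exponential with mean 1 and standard deviation 1), the sample size, the number of samples and the seed are all arbitrary choices for illustration, not values from the text.

```python
import math
import random
import statistics

random.seed(9)        # fixed seed so the run is repeatable
n, reps = 40, 2000    # sample size and number of samples (arbitrary choices)

# Population: exponential with mean 1 and standard deviation 1 (strongly skewed)
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

center = statistics.fmean(sample_means)   # compare with mu = 1
spread = statistics.stdev(sample_means)   # compare with sigma / sqrt(n)
predicted = 1.0 / math.sqrt(n)

print(round(center, 3), round(spread, 3), round(predicted, 3))
```

A histogram of sample_means should look roughly bell-shaped even though the population itself is far from normal.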
Law of Large Numbers Draw observations at random from any population with finite mean µ. As the number of observations drawn increases, the mean x-bar of the observed values gets closer and closer to µ.
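The law can be watched in action with a running mean. The sketch below rolls a simulated fair die, whose mean is 3.5. The die, the seed and the checkpoints are assumptions chosen for illustration.

```python
import random

random.seed(22)                          # fixed seed so the run is repeatable
checkpoints = (10, 100, 1_000, 10_000, 100_000)

total = 0
running = {}                             # number of rolls -> mean so far
for i in range(1, checkpoints[-1] + 1):
    total += random.randint(1, 6)        # one roll of a fair die
    if i in checkpoints:
        running[i] = total / i

for rolls, mean in running.items():
    print(rolls, round(mean, 3))
```

Early means can wander, but the later ones settle near 3.5.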
Homework (489) Parameter or a statistic? parameter, statistic, parameter, statistic
9.5 (492) Tumbling Toast • Toss a coin 20 times. Compute p-hat. • Do it 10 more times, then make a histogram of your p-hats. Is the center close to .5? • Pool your work. Is the center near .5? Is it normal?
9.9 (500) Dead Guinea Pigs
9.10 (510) • A) Large bias, large variability • B) Small bias, small variability • C) Small bias, large variability • D) Large bias, small variability
9.17 (503) School Vouchers • Assuming the poll’s sample size is less than 780,000 (10% of the population of NJ), the variability would be about the same
9.19 (511) Got Milk? n=1012
9.33(519) Juan’s results =10
9.35 (524) Bad Rug: mean = 1.6, sd = 1.2
9.39 (525) Cheap Cola: µ = 298, σ = 3. P(x < 295)? P(x-bar < 295, n = 6)?
9.41 (526) What a Wreck! • µ = 2.2, σ = 1.4 • Not normal, but the distribution of x-bar is approximately normal!
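The two Cheap Cola probabilities can be sketched in Python. This assumes the two values on the slide are the population mean (298) and standard deviation (3), and that individual cans are normally distributed. The slide states neither of these outright. The helper normal_cdf is a made-up stand-in for normalcdf.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

mu, sigma = 298.0, 3.0   # assumed meaning of the slide's two values
n = 6                    # number of cans in the sample

p_single = normal_cdf(295, mu, sigma)                # one can below 295
p_mean = normal_cdf(295, mu, sigma / math.sqrt(n))   # mean of 6 cans below 295

print(round(p_single, 4), round(p_mean, 4))
```

The mean of six cans is much less likely to fall below 295 than a single can, because x-bar varies less than individual observations.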