Stat 13, Tue 5/8/12. 1. Collect HW 4. 2. Central limit theorem. 3. CLT for 0-1 events. 4. Examples. 5. σ versus σ/√n. 6. Assumptions. Read ch. 5 and 6.

Stat 13, Tue 5/8/12. 1. Collect HW 4. 2. Central limit theorem. 3. CLT for 0-1 events. 4. Examples. 5. σ versus σ/√n. 6. Assumptions. Read ch. 5 and 6. Ignore the normal approximation to the binomial. Hw5 is due Tue, 5/15. Midterm 2 is Thur, 5/17.

2. Central Limit Theorem (CLT). Suppose we're interested in the avg income in the Los Angeles population. We think it's $36000. We sample 100 people and calculate their mean income, x̄. Say x̄ = $35000. That sample mean will always be a bit different from the population mean, μ. There is a sense in which μ could be $36000, and yet x̄ could be $35000, just by chance. The question is: what is the chance of this happening? How different should we expect x̄ and μ to be? To get at this question, we have to think about sampling over and over, repeatedly, each time taking 100 people at random. For each sample, we'll get a different x̄. Imagine looking at the list of all these different x̄s. It turns out that, provided n is large (here n = 100), and provided each sample is a SRS (simple random sample), then: this list of x̄s is NORMALLY DISTRIBUTED with mean μ, and the std deviation of the x̄s is σ/√n, where σ is the std deviation of the population. That's called the CLT (central limit theorem) or the NORMAL approximation. What does this mean? It means that the sample mean is typically about μ, but off by around σ/√n. It means that 68% of the times we sample, the sample mean is within σ/√n of µ. Note that this doesn't depend on what the population looks like. Even if the population is not normally distributed, even if it's all 0s and 1s, we can use the normal table to see what the probability is that the sample mean is in some range.
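The repeated-sampling idea above is easy to check by simulation. A minimal sketch (not from the lecture; the exponential "income" population and the choice of 10,000 repetitions are mine): draw many samples of size n from a decidedly non-normal population and check that the sample means have mean near µ and sd near σ/√n.

```python
import math
import random

random.seed(0)
n = 100        # sample size, as in the income example
reps = 10000   # number of repeated samples

# A skewed, non-normal "population": exponential draws with mu = 1, sigma = 1
# (units are arbitrary; this is just to show the CLT doesn't need normality).
def draw_one():
    return random.expovariate(1.0)

sample_means = []
for _ in range(reps):
    sample = [draw_one() for _ in range(n)]
    sample_means.append(sum(sample) / n)

mean_of_means = sum(sample_means) / reps
sd_of_means = math.sqrt(sum((m - mean_of_means) ** 2 for m in sample_means) / reps)

print(round(mean_of_means, 2))  # close to mu = 1
print(round(sd_of_means, 2))    # close to sigma/sqrt(n) = 1/10 = 0.1
```

A histogram of `sample_means` would also look bell-shaped, even though the population itself is strongly skewed.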

3. CLT for proportions. What if you have a population of 0s and 1s, and you sample from this population and are interested in the total number of 1s in your sample? Your book calls this total X, e.g. the number of people in your sample with type A blood. Usually we're not interested in X: we're interested in the PERCENTAGE, p̂, with A blood in the sample, and we want to use this to estimate the PERCENTAGE, p, with type A blood in the population. The formulas for X in the book are unnecessary and are also potentially misleading. They don't matter once you realize p̂ is x̄ and p is µ for such problems. If we view each value as 1 or 0, then the population percentage (p) is the same as the population mean μ. And the sample percentage (p̂) is the same as the sample mean x̄ of all the 0s and 1s in the sample. So we don't ever need separate formulas for p̂ and p. They are special cases of the formulas for x̄ and µ, where the data happen to be 0s and 1s. Again, the CLT says that p̂ is normally distributed with mean μ and the std deviation of p̂ is σ/√n. This is called the standard error for p̂. Remember though that σ is the std dev of the population. If all the values in the population are 1s and 0s, then σ is the std dev of those numbers. In such situations, σ = √(pq), where p = percentage of ones in population, and q = 1 − p = percentage of zeros. Forget the book's equations for p̂ and especially the equations for X, which might confuse you. They have a crazy formula for the SE for a proportion involving n+4 instead of n. I have no idea where they got that from. Ignore it and use n.
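The claim that σ = √(pq) for a 0-1 population can be verified directly. A sketch with made-up numbers (a population of 40% ones), comparing the sd computed from the definition against the shortcut formula:

```python
import math

# Hypothetical 0-1 population: 40 ones and 60 zeros, so p = 0.4.
population = [1] * 40 + [0] * 60
p = sum(population) / len(population)   # population mean = proportion of ones
q = 1 - p

# Population sd computed straight from the definition...
sd_direct = math.sqrt(sum((x - p) ** 2 for x in population) / len(population))
# ...and from the shortcut formula sqrt(p*q).
sd_formula = math.sqrt(p * q)

print(round(sd_direct, 4), round(sd_formula, 4))  # both are sqrt(0.24), about 0.4899
```

The two agree (up to floating-point rounding), which is why no separate formulas for p̂ are needed: they are the x̄ formulas applied to 0-1 data.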

4. Examples. The average cow produces 20 pounds of wet manure per day, with a sd of 6 pounds. Suppose we take a SRS (simple random sample) of 144 cows. What is the probability that our sample mean will be greater than 21 pounds? This is a SRS, and n is large, so x̄ is N(20, 6/√n) = N(20, 0.5). So we're asking what the chance is that a draw from a normal distribution with mean 20 and std dev 0.5 will be at least 21. You know how to do this (subtract the mean and divide by the sd): this is the chance that a draw from a N(0,1) will be at least (21−20) ÷ 0.5 = 2.00, which from the tables is 1 − 0.9772 = 0.0228, or 2.28%. What is the probability that the amount for one individual cow will be greater than 21? We can't answer that question: not enough info. But suppose we also knew that the population of cow excretions was normally distributed. Then we could do it. The answer would be: the chance that a draw from a N(20, 6) is at least 21, which is the chance that a N(0, 1) is at least (21−20) ÷ 6 = 0.17, which from the tables is 1 − 0.5675 = 0.4325, or 43.25%.
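The two cow calculations can be reproduced without a table. A sketch using Python's error function, since the N(0,1) CDF is Φ(z) = ½(1 + erf(z/√2)):

```python
import math

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 20.0, 6.0, 144

# P(sample mean > 21): z = (21 - 20) / (6 / sqrt(144)) = 2.00
z_mean = (21 - mu) / (sigma / math.sqrt(n))
print(round(1 - phi(z_mean), 4))  # about 0.0228

# P(one cow > 21), IF the population is normal: z = (21 - 20) / 6
z_one = (21 - mu) / sigma
print(round(1 - phi(z_one), 4))   # about 0.4338 (the table, with z rounded to 0.17, gives 0.4325)
```

Note the only difference between the two questions is the denominator of z: σ/√n for a sample mean, σ for a single draw.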

Another example. Suppose that 90% of the cows are dairy cows. In a SRS of 144 cows, what's the probability that at least 80% of the cows in our sample are dairy cows? Answer: this is P(p̂ > 0.80), which is the probability that a draw from a N(0.9, σ/√n) is > 0.80. What's σ/√n? n = 144. σ = √(pq) = √(0.9 × 0.1) = √0.09 = 0.3. So σ/√n = 0.3/12 = 0.025. We want the probability that a draw from a N(0.9, 0.025) is at least 0.80. This is the probability that a N(0, 1) draw is at least (0.80−0.9) ÷ 0.025 = −4.00. −4.00 is off the charts. For the closest number on the chart, the probability of being greater than that number is 0.9998, or 99.98%, so all we can say is the probability is at least 99.98%.
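The same numbers, sketched in code (using the error-function CDF again), show why the table answer "at least 99.98%" is the best a table can do while the exact value is even closer to 1:

```python
import math

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p, n = 0.9, 144
sigma = math.sqrt(p * (1 - p))   # sqrt(0.09) = 0.3
se = sigma / math.sqrt(n)        # 0.3 / 12 = 0.025

z = (0.80 - p) / se              # (0.80 - 0.9) / 0.025 = -4.00
print(round(1 - phi(z), 5))      # about 0.99997, beyond the table's 0.9998
```

So P(p̂ > 0.80) is essentially certain, as the z-score of −4.00 suggests.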

5. σ versus σ/√n. The issue here is whether you are interested in one draw or a sample of draws. σ tells you how much ONE DRAW typically differs from the mean. σ/√n is how much x̄, the mean of SEVERAL DRAWS, typically differs from the mean. 6. Assumptions. The assumptions are important here. The CLT says that if we have a SRS, n is large, and σ is known, then x̄ ~ N(µ, σ/√n). We don't need a SRS if the obs. are iid (independent and identically distributed) with mean µ. We don't need n large if the population is normal. But if n is small and the pop. is not known to be normal, we're stuck: we need more info. Large n? For non-0-1 problems, n ≥ 25 is often large enough. Let's use this rule of thumb. For 0-1 problems, the general rule of thumb is that np̂ and n(1−p̂) must both be ≥ 10. In other words, there must be at least 10 of each type in the sample. If n is large and you don't know σ, you can plug in s, the sd of your sample. If n is small, the pop. is normal, and you use s for σ, then the distribution is t, not N.
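The two rules of thumb above can be packaged in a tiny helper. A sketch (the function name and structure are mine, not from the lecture):

```python
def clt_ok(n, n_ones=None):
    """Rough check that the normal approximation is reasonable.

    For ordinary (non-0-1) data: n >= 25.
    For 0-1 data: at least 10 ones AND 10 zeros in the sample,
    i.e. n*p-hat >= 10 and n*(1 - p-hat) >= 10.
    """
    if n_ones is None:               # non-0-1 problem
        return n >= 25
    return n_ones >= 10 and (n - n_ones) >= 10   # 0-1 problem

print(clt_ok(30))              # True: n = 30 >= 25
print(clt_ok(144, n_ones=8))   # False: only 8 ones in the sample
print(clt_ok(144, n_ones=130)) # True: 130 ones and 14 zeros
```

Note this only checks the "large n" condition; the SRS (or iid) assumption still has to be judged from how the data were collected.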

Next: Confidence Intervals (CIs).