MATH/ECON 108 Visiting Associate Prof Lisa Giddings Sampling and Hypothesis Testing
[Table: 1948 poll predictions for Dewey and Truman from Crossley, Gallup, and Roper, compared with the election result; values not preserved in the transcript]
OOPS!: Truman Beats Dewey Just before the election, Gallup gave the election to Dewey by 49.5%... When the election was over, Truman had beaten Dewey by more than two million votes and carried 28 states (303 electoral votes) to Dewey's 16 (189 electoral votes). In 1948 the group that voted Truman in was less likely to be included in a telephone poll.
Telephone polls and Probability Samples Social scientists use probability samples when they have access to a population. What is a probability sample? What might have been the problem in 1948 when Gallup and others tried to use a probability sample in their telephone survey?
Problem? Not everyone has a phone, particularly in 1948. Tempting present-day example: use postal codes. What's the problem with this?
Problems: illegal immigrants; homeless; young non-working populations. Example: problems of accurate immigrant and ethnic group counts with the census.
Solution? Sampling Frame Building a sampling frame means that you go out and canvass everywhere the group you're trying to study might be, include each place identified in your sampling frame, and then sample randomly from your sampling frame.
Sampling Frame: the source from which samples are chosen. Examples: phone book; voter registration list; people who pass the corner of Snelling and Grand.
Types of Samples Simple Random: each subset of a given size has the same probability of selection. Stratified: do random samples within subcategories. Cluster: randomly pick a point, then choose all those close to it. Quota: set quotas for males vs. females and other categories, then keep or reject a randomly chosen subject depending on whether or not they fit the quota.
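A minimal sketch of the first three sample types, using only the Python standard library. The frame of 1,000 people, the field names (`sex`, `block`), and the helper function names are all hypothetical, made up for illustration. Cluster sampling is approximated here by picking whole blocks at random and keeping everyone in them.

```python
import random

def simple_random_sample(frame, n, rng):
    # Every subset of size n is equally likely to be drawn.
    return rng.sample(frame, n)

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    # Group units by stratum, then draw a simple random sample within each group.
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def cluster_sample(frame, cluster_of, n_clusters, rng):
    # Randomly pick whole clusters, then keep every unit in the chosen clusters.
    clusters = sorted({cluster_of(unit) for unit in frame})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in frame if cluster_of(unit) in chosen]

rng = random.Random(108)  # fixed seed so the run is repeatable

# Hypothetical sampling frame: 1,000 people in 20 blocks of 50.
frame = [{"id": i, "sex": "F" if i % 2 else "M", "block": i // 50} for i in range(1000)]

srs = simple_random_sample(frame, 20, rng)
strat = stratified_sample(frame, lambda u: u["sex"], 10, rng)
clus = cluster_sample(frame, lambda u: u["block"], 2, rng)

print(len(srs), {name: len(units) for name, units in strat.items()}, len(clus))
```

Quota sampling is left out on purpose: the keep-or-reject decision is made by the interviewer, which is hard to represent honestly in a few lines.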
1948 Gallup: What did they do? Quota sampling using the following categories. Sex: 50.5% male; Age: 29.6% 49; Education: 35.3% grade school or less, 17.9% college; Race: 95% white; Vet Status: 13.3% veterans; Union membership: 17.5% members. Problem: great scope for interviewer bias. Why?
Population and Sample Population: an entire set of objects or units of observation of one sort or another. Sample: a subset of a population. Parameter versus statistic:
            Size  Mean   Variance  Proportion
Population  N     µ      σ²        π
Sample      n     x-hat  S²        p
Properties of estimators: sample mean x-hat = (1/n) Σ x_i, the sum of the n sample values divided by n. To make inferences regarding the population mean (µ), we need to know something about the probability distribution of this sample statistic, x-hat. The distribution of a sample statistic is known as a sampling distribution. Two of its characteristics are of particular interest: the mean or expected value, and the variance or standard deviation.
Thought Experiment Sample repeatedly from the given population, each time recording the sample mean, and take the average of those sample means.
Unbiased Estimator If the sampling procedure is unbiased, deviations of x-hat from µ in the upward and downward directions should be equally likely: on average, they should cancel out. E(x-hat) = µ = E(x) The sample mean is then an unbiased estimator of the population mean.
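The thought experiment can be run as a small simulation. This is a sketch under assumed values: a normal population with µ = 100 and σ = 12, samples of size 36, and 5,000 repetitions, all chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed population and sample size (illustrative values).
mu, sigma, n = 100, 12, 36

def sample_mean(rng):
    # Draw one sample of size n and return its mean.
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))

# Sample repeatedly, recording the sample mean each time.
means = [sample_mean(rng) for _ in range(5000)]

avg_of_means = statistics.fmean(means)  # should land close to mu
sd_of_means = statistics.stdev(means)   # spread of the sampling distribution

print(round(avg_of_means, 2), round(sd_of_means, 2))
```

The average of the recorded sample means should sit very close to 100, which is what unbiasedness predicts.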
Efficient Estimators One estimator is more efficient than another if its values are more tightly clustered around its expected value. For example, imagine alternative estimators of the population mean: x-hat versus the average of the largest and smallest values in the sample. The degree of dispersion of an estimator is generally measured by the standard deviation of its probability distribution (sampling distribution). This goes under the name standard error.
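A rough way to compare the two estimators is to simulate both on the same samples and compare their standard errors. The population values (normal, µ = 100, σ = 12), the sample size of 36, and the number of repetitions are assumptions for illustration; the ranking can differ for other population shapes.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumed normal population and sample size (illustrative values).
mu, sigma, n, reps = 100, 12, 36, 4000

means, midranges = [], []
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))              # the sample mean
    midranges.append((min(sample) + max(sample)) / 2)   # average of largest and smallest

# The standard deviation of each estimator across samples is its standard error.
se_mean = statistics.stdev(means)
se_midrange = statistics.stdev(midranges)

print(round(se_mean, 2), round(se_midrange, 2))
```

Under these assumptions the sample mean has the smaller standard error, so it is the more efficient of the two.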
Standard Error of Sample mean σ_x-hat = σ / √n. The more widely dispersed the population values are around their mean (larger σ), the greater the scope for sampling error (i.e. drawing by chance an unrepresentative sample whose mean differs substantially from µ). A larger sample size (greater n) narrows the dispersion of x-hat.
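A short sketch of how the standard error responds to sample size, assuming the usual formula σ / √n for the standard error of the sample mean. The value σ = 12 and the sample sizes are illustrative.

```python
import math

def standard_error(sigma, n):
    # Standard error of the sample mean: sigma divided by the square root of n.
    return sigma / math.sqrt(n)

# Quadrupling the sample size halves the standard error.
for n in (9, 36, 144):
    print(n, standard_error(12, n))
```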
Shape of Sampling Distributions Besides knowing the expected value and standard error, we also need to know the shape of a sampling distribution in order to put it to use. Sample mean: Central Limit Theorem implies a Gaussian (or normal) distribution for large enough samples
Non Gaussian Not all sampling distributions are Gaussian, e.g. sample variance as an estimator of the population variance. The shape below represents a chi-square (χ²) distribution. [Figure: chi-square density curve] Note that even in this case, if the sample size is large, the chi-square distribution will converge to normal.
Confidence Intervals If we know the mean, the standard error, and the shape of the distribution of a given sample statistic, we can then make definite probability statements about the statistic. For example: say µ = 100 and σ = 12 for a certain population, and we draw a sample with n = 36 from that population. The standard error of x-hat is σ / √n = 12 / √36, which is 12/6 = 2, and a sample size of 36 is large enough to justify the assumption of a Normal distribution.
Okay so what the heck does that mean? That means that we know that a range of µ ± 2σ encloses the central 95 percent of a normal distribution. Applied to the sampling distribution of x-hat (mean 100, standard error 2), we can state: P(96 < x-hat < 104) ≈ 0.95. Or, in English: there is a 95 percent probability that the sample mean lies within 4 units (two standard errors) of the population mean of 100.
What if we have no clue about the population mean? If µ is unknown, we can still say P(µ - 4 < x-hat < µ + 4) ≈ 0.95. Or, again, in English: with probability 0.95 the sample mean will be drawn from within 4 units of the unknown population mean. We go ahead and draw the sample, and calculate a sample mean of (say) 97. If there's a probability of 0.95 that our sample mean came from within 4 units of the population mean, we can turn that around: we're entitled to be 95 percent confident that µ lies between 93 and 101. We draw up a 95 percent confidence interval for the population mean as x-hat ± 2 σ_x-hat.
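The interval can be reproduced with a few lines of arithmetic. This sketch uses the slide's numbers (sample mean 97, σ = 12, n = 36) and the rounded multiplier of 2 standard errors; the function name is made up for illustration. The more exact multiplier for 95 percent is about 1.96.

```python
import math

def confidence_interval(sample_mean, sigma, n, z=2.0):
    # z = 2.0 is the rounded two-standard-error rule; about 1.96 is more exact for 95%.
    se = sigma / math.sqrt(n)
    return sample_mean - z * se, sample_mean + z * se

low, high = confidence_interval(97, 12, 36)
print(low, high)  # 93.0 101.0
```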
Go to www.Pollingreport.com: /BushJob1.htm and /obama_job.htm. These web pages list polls that tracked Bush's job approval rating and ongoing polls that are tracking Obama's job approval rating. The latest surveys display a somewhat small range of values, from 63-69% approval ratings and 12-23% approval ratings, over the last month or so. They all claim a margin of error of around +/- 3%. Explain the discrepancies.
Margin of Error This is based on the idea of a confidence interval (but it's not the same thing). The margin of error is a statistic expressing the amount of random sampling error in a survey's results. The larger the margin of error, the less faith one should have that the poll's reported results are close to the true figures; that is, the figures for the whole population.
Obama's Job Approval Rating An opinion polling agency questions a sample of, say, 1200 people to assess the degree of support for a particular president. Sample info: p = 0.56. Our single best guess at the population proportion, π, is then 0.56, but we can quantify our uncertainty. The standard error of p is σ_p = √(π(1 - π)/n). The value of π is unknown, but we can substitute p or, to be conservative, we can put π = 0.5, which maximizes the value of π(1 - π).
Margin of error continued Using that assumption, the estimated standard error is the square root of 0.25/1200, or about 0.0144. The large sample justifies the assumption of a normal sampling distribution; the 95 percent confidence interval is: 0.56 +/- 2 x 0.0144 = 0.56 +/- 0.0289
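The arithmetic for this poll can be checked with a short sketch. It uses the slide's numbers (p = 0.56, n = 1200), the conservative choice π = 0.5, and the rounded multiplier of 2; the function name and its parameters are made up for illustration.

```python
import math

def proportion_interval(p, n, z=2.0, conservative=True):
    # With conservative=True, use pi = 0.5, which maximizes pi * (1 - pi).
    pi = 0.5 if conservative else p
    se = math.sqrt(pi * (1 - pi) / n)
    margin = z * se
    return se, margin, (p - margin, p + margin)

se, margin, (low, high) = proportion_interval(0.56, 1200)
print(round(se, 4), round(margin, 4), round(low, 3), round(high, 3))
# 0.0144 0.0289 0.531 0.589
```

Substituting p = 0.56 for π instead (`conservative=False`) gives a slightly narrower interval, since 0.56 x 0.44 is a little less than 0.25.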
Margin of Error Margin of Error = +/- 3% means that if the polled percentage was 50%, then only 5% of the time would we see this poll result when the true percentage was > 53% or < 47%, i.e., the 95% confidence interval would be [47%, 53%]. For example, remember the Bush/Obama race. What is the chance that either more than 53% of the people prefer Obama or less than 47% prefer Obama? If the polled percentage is anything else, then the 95% confidence interval is less than 3% on each side, but we still use the same Margin of Error.
The top portion of this graph depicts probability densities that show the relative likelihood that the true percentage is in a particular area given a reported percentage of 50%. The bottom portion shows the 95% confidence intervals, the corresponding margins of error, and the sample sizes. Margin of error with sample size = 1000 at 99% confidence is +/- 4%. Margin of error with sample size = 1000 at 95% confidence is +/- 3%.
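The two margins quoted for a sample of 1000 can be approximated with the normal distribution from Python's standard library. This is a sketch assuming a reported percentage of 50% and a normal sampling distribution; the function name is made up for illustration.

```python
import math
from statistics import NormalDist

def margin_of_error(n, confidence, p=0.5):
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)

for confidence in (0.95, 0.99):
    # Print the margin in percentage points, to one decimal place.
    print(confidence, round(100 * margin_of_error(1000, confidence), 1))
```

The results come out at about 3.1 and 4.1 percentage points, which round to the +/- 3% and +/- 4% quoted above.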
The Logic of Hypothesis Testing Think about the set-up of a hypothesis test as a court of law. The defendant on trial in the statistical court is the null hypothesis. This is some definite claim regarding a parameter of interest. Just as the defendant is presumed innocent until proven guilty, the null hypothesis, or H0, is assumed true (at least for the sake of argument) until the evidence goes against it.
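Power 1 - β is the power of a test, where β is the probability of a Type II error (failing to reject a null hypothesis that is false). There is a trade-off between α and β.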
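The trade-off can be illustrated with a one-sided z test. Everything in this sketch is an assumption made for illustration: a null hypothesis of µ = 100, a true mean of 104, σ = 12, n = 36, and the function name.

```python
from statistics import NormalDist

std_normal = NormalDist()

def type_ii_error(alpha, mu0, mu1, sigma, n):
    # One-sided z test of H0: mu = mu0 against the alternative mu > mu0.
    se = sigma / n ** 0.5
    # Reject H0 when the sample mean exceeds this critical value.
    critical_mean = mu0 + std_normal.inv_cdf(1 - alpha) * se
    # beta = P(sample mean stays below the critical value) when the true mean is mu1.
    return NormalDist(mu1, se).cdf(critical_mean)

for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=100, mu1=104, sigma=12, n=36)
    print(alpha, round(beta, 3), round(1 - beta, 3))  # alpha, beta, power
```

As α is made smaller, β grows and the power 1 - β shrinks, holding the sample size fixed.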
Let's say we use a cutoff of 0.01. This means that we'll reject the null hypothesis if the p-value for the test is ≤ 0.01. If the null hypothesis is in fact true, what is the probability of our rejecting it? It is the probability of getting a p-value less than or equal to 0.01, which is (by definition) 0.01. In selecting our cutoff, we selected α, the probability of a Type I error.
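A simulation can make this concrete: generate many samples for which the null hypothesis really is true, test each one, and count how often the p-value falls at or below the cutoff. The null value µ = 100, σ = 12, n = 36, the two-sided z test, and the number of repetitions are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumed setup: H0 says mu = 100, and H0 is in fact true.
mu0, sigma, n, alpha, reps = 100, 12, 36, 0.01, 20000
std_normal = NormalDist()

def two_sided_p_value(sample):
    # z statistic for the sample mean, with known sigma.
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - std_normal.cdf(abs(z)))

rejections = 0
for _ in range(reps):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    if two_sided_p_value(sample) <= alpha:
        rejections += 1  # a Type I error: rejecting a true null

rejection_rate = rejections / reps
print(rejection_rate)
```

The rejection rate should come out close to 0.01, the chosen α.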