2 Sampling Distribution (a.k.a. “Distribution of Sample Outcomes”)
- Based on the laws of probability
- “Outcomes” = proportions, means, etc.
- An infinite number of random samples yields all possible sample outcomes and the probability of obtaining each one
- Allows us to estimate:
  - What is the likelihood of obtaining our particular sample outcome?
  - Or, “There is an X% chance that the true population parameter is within +/- some distance of this sample outcome.”
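The idea of "all possible sample outcomes" can be approximated by simulation. The sketch below is only an illustration, not part of the original slides: the true proportion (0.5), the sample size (100), and the number of repeated samples (2000) are all assumed values, and a finite number of samples stands in for the infinite number the slide describes.

```python
import random
import statistics

def simulate_sample_proportions(p_true, n, num_samples, seed=0):
    """Draw many random samples of size n and record each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outcomes = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p_true)
        outcomes.append(successes / n)
    return outcomes

# Assumed, illustrative values (not from the slides).
P_TRUE, N, NUM_SAMPLES = 0.5, 100, 2000

outcomes = simulate_sample_proportions(P_TRUE, N, NUM_SAMPLES)

# Center and spread of the simulated sampling distribution.
mean_of_outcomes = statistics.mean(outcomes)
simulated_se = statistics.pstdev(outcomes)

# Standard error predicted by probability theory: sqrt(p(1-p)/N).
theoretical_se = (P_TRUE * (1 - P_TRUE) / N) ** 0.5

# Share of sample outcomes within 1.96 standard errors of the true value.
share_within = sum(
    abs(o - P_TRUE) <= 1.96 * theoretical_se for o in outcomes
) / len(outcomes)

print(f"mean of sample proportions: {mean_of_outcomes:.3f}")
print(f"simulated standard error:   {simulated_se:.3f}")
print(f"theoretical standard error: {theoretical_se:.3f}")
print(f"share within +/- 1.96 SE:   {share_within:.3f}")
```

The sample proportions cluster around the true value, and roughly 95% of them land within 1.96 standard errors of it, which is the reasoning behind the "X% chance" statement above.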
3 Estimation
- To estimate (make an “inference” about) population parameters from sample statistics
- Why necessary?
- Most commonly used for polling data
- Point estimate
- Confidence intervals
4 Estimation 1: Pick Confidence Level
- Confidence level: the probability that the unknown population parameter falls within the confidence interval
- Confidence level = 1 - alpha
- Alpha (α) is the probability that the parameter is NOT within the interval
5 Estimation 2: Find Appropriate Z-score
- Divide the probability of error (α) equally into the upper and lower tails of the distribution
- For a 95% confidence level (α = .05), the area under the curve in the two tails combined must equal 5%, or .025 in each tail
- The corresponding Z score for this is +/- 1.96
[Figure: sampling distribution with .025 in each tail, cut off at Z scores of -1.96 and 1.96]
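The table lookup on this slide can also be done in code. A minimal sketch using Python's standard library; the function name `critical_z` is made up for illustration.

```python
from statistics import NormalDist

def critical_z(confidence_level):
    """Return the Z score that leaves alpha/2 in each tail of the normal curve."""
    alpha = 1 - confidence_level           # probability of error
    upper_tail_cutoff = 1 - alpha / 2      # area to the left of the upper Z score
    return NormalDist().inv_cdf(upper_tail_cutoff)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = +/- {critical_z(level):.3f}")
```

For the 90% level this gives about 1.645, which printed Z tables commonly round to 1.65.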
6 Estimation 3: Constructing an Interval Estimate
- What is your point estimate?
- How many “standard errors” do you want to go out (on the sampling distribution) from this point estimate?
- What is a particular standard error “worth” for our sample outcome?
  - Takes into account sample size (N) and dispersion/heterogeneity
  - For proportions, this is the p(1-p) part of the equation
7 Constructing a Confidence Interval for Proportions
- What is your point estimate? The sample proportion
- How many “standard errors” do you want to go out from this point estimate?
  - 1.65 standard errors = alpha of .10 (confidence level of 90%)
  - 1.96 standard errors = alpha of .05 (confidence level of 95%)
  - 2.58 standard errors = alpha of .01 (confidence level of 99%)
- What is a particular standard error “worth” for our sample outcome?
  - Everything after the “z” in formula 7.3 in the Healey book
  - Numerator = (your proportion)(1 - proportion)
  - Generic: a proportion of .50 gives .25 in the numerator
- Separate formulas for sample means & sample proportions
- Today, let’s focus on proportions…
- Formula 7.3 is found on p. 165 of Version 6 and p. 174 of Version 7
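The three questions on this slide combine into the usual large-sample interval for a proportion, c.i. = p +/- Z√(p(1-p)/N). The sketch below assumes that form; whether it matches Healey's formula 7.3 symbol for symbol should be checked against the book. The function name, the `p_for_se` parameter, and the sample values (a proportion of .60 from 400 respondents) are hypothetical.

```python
import math

def proportion_ci(p_sample, n, z=1.96, p_for_se=None):
    """Confidence interval for a proportion.

    p_for_se is the proportion used inside the standard error. Pass 0.5 for
    the "generic" choice on the slide (numerator of .25); leave it as None
    to use the sample proportion itself.
    """
    p = p_sample if p_for_se is None else p_for_se
    standard_error = math.sqrt(p * (1 - p) / n)   # what one standard error is "worth"
    margin = z * standard_error                   # how far out we go
    return p_sample - margin, p_sample + margin

# Hypothetical sample: 60% agree, N = 400, 95% confidence level.
low, high = proportion_ci(0.60, 400)
print(f"using the sample proportion: {low:.3f} to {high:.3f}")

# Same sample with the generic .50 in the standard error (slightly wider).
low_generic, high_generic = proportion_ci(0.60, 400, p_for_se=0.5)
print(f"using the generic .50:       {low_generic:.3f} to {high_generic:.3f}")
```

Using .50 in the standard error gives the widest possible interval for a given N, so it is the cautious choice when the true proportion is unknown.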
8 Estimation of Population Means
EXAMPLE: A researcher has gathered information from a random sample of 178 households in Duluth. An average of 2.3 people reside in each household, and the standard deviation is .35. Construct a confidence interval to estimate the population mean at the 95% level.
9 Constructing a Confidence Interval for Means
- What is your point estimate? The sample mean
- How many “standard errors” do you want to go out from this point estimate?
  - 1.96 standard errors = alpha of .05
  - 2.58 standard errors = alpha of .01
- What is a particular standard error “worth” for our sample outcome? s/√(N-1)
  - We don’t know the population standard deviation (σ), so we substitute our best guess, the sample standard deviation (s)
  - BUT, we subtract one from N to “correct” for bias
10 Application to Example
- What is your point estimate? 2.3 people per household
- How many “standard errors” do you want to go out from this point estimate? 1.96 standard errors (alpha of .05)
- What is a particular standard error “worth” for our sample outcome? .35/√(178-1)
- Formula: c.i. (95%) = mean +/- Z(s/√(N-1)) = 2.3 +/- 1.96(.35/√(178-1)) = 2.3 +/- .0516
- We are 95% sure that, over the long run, the average number of people in Duluth households (THE WHOLE POPULATION) is between 2.25 and 2.35.
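The Duluth calculation can be checked with a few lines of Python. This is a sketch of the s/√(N-1) approach used on these slides; the function name `mean_ci` is made up for illustration.

```python
import math

def mean_ci(sample_mean, s, n, z=1.96):
    """Confidence interval for a mean, using s / sqrt(N - 1) as the standard error."""
    standard_error = s / math.sqrt(n - 1)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Values from the Duluth household example: mean 2.3, s = .35, N = 178.
low, high = mean_ci(sample_mean=2.3, s=0.35, n=178)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```

Passing `z=2.58` instead gives the 99% interval for the same sample.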
11 In groups, construct confidence intervals for the following means
- A random sample of 429 college students was interviewed.
  - They reported they had spent an average of $178 on textbooks during the previous semester. If the standard deviation (s) of these data is $15, construct an estimate of the population mean at the 95% confidence level.
  - They reported they had missed 2.8 days of class per semester because of illness. If the sample standard deviation is 1.0, construct an estimate of the population mean at the 99% confidence level.
- Two individuals are running for mayor of Duluth. You conduct an election survey of 100 adult Duluth residents 1 week before the election and find that 45% of the sample support candidate Long Duck Dong, while 40% plan to vote for candidate Singalingdon.
  - Using a 95% confidence level, based on your findings, can you predict a winner?
12 Review: What influences confidence intervals?
The width of a confidence interval depends on three things:
- Confidence level: can be raised (e.g., to 99%) or lowered (e.g., to 90%); raising it widens the interval
- N: we have more confidence in larger sample sizes, so as N increases, the interval width decreases
- Variation: more variation = more error
  - For proportions, % agree closer to 50%
  - For means, higher standard deviations
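Each of the three influences can be seen by changing one input at a time. The sketch below uses the interval for a mean with the s/√(N-1) standard error from the earlier slides; the baseline values (s = 1.0, N = 101) are assumed purely for illustration.

```python
import math
from statistics import NormalDist

def interval_width(confidence_level, s, n):
    """Full width (upper limit minus lower limit) of a confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * s / math.sqrt(n - 1)

# Assumed baseline: 95% level, s = 1.0, N = 101.
baseline = interval_width(0.95, s=1.0, n=101)

# Change one thing at a time and compare against the baseline.
higher_confidence = interval_width(0.99, s=1.0, n=101)   # wider
larger_sample = interval_width(0.95, s=1.0, n=401)       # narrower
more_variation = interval_width(0.95, s=2.0, n=101)      # wider

print(f"baseline:          {baseline:.3f}")
print(f"99% confidence:    {higher_confidence:.3f}")
print(f"N = 401:           {larger_sample:.3f}")
print(f"s = 2.0:           {more_variation:.3f}")
```

Note that quadrupling N - 1 (from 100 to 400) only halves the width, because the sample size enters through a square root.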
13 Hypothesis Testing (intro)
SPRING ’07, SOC 3155, CLASS #12
REMINDER: SSR HW #2 IS DUE TUESDAY. HW #3 IS DUE A WEEK FROM TODAY.
[ANNOUNCEMENT: BRING BOOKS & CALCULATORS FOR NEXT TIME….]
This material coincides with Chapter 8 of Healey.
[Diagram labels: Estimation, Hypothesis Testing]
14 Hypothesis Testing
- Hypothesis (causal): a prediction about the relationship between 2 variables that asserts that changes in the measure of an independent variable will correspond to changes in the measure of a dependent variable
- Hypothesis testing: is the hypothesis supported by facts (empirical data)?
15 Hypothesis Testing & Statistical Inference
- We almost always test hypotheses using sample data
- Also referred to as “significance testing”
- Draw conclusions about the population based on sample statistics
- As a result, we have to account for sampling error when testing hypotheses
- Is there a “statistically significant” finding?
16 Research vs. Null hypotheses
- Research hypothesis (H1)
  - Typically predicts relationships or “differences”
- Null hypothesis (H0)
  - Predicts “no relationship” or “no difference”
  - Can usually be created by inserting “not” into a correctly worded research hypothesis
- In science, we test the null hypothesis!
17 Directional vs. Nondirectional Hypotheses
- Non-directional research hypothesis
  - “There was an effect”
  - “There is a difference”
- Directional research hypothesis
  - Specifies the direction of the difference (greater or smaller) from the H0
- GROUP WORK
18 Testing a hypothesis 101
1. State the null & research hypotheses
2. Set the criteria for a decision: alpha, critical regions for the particular test statistic
3. Compute a “test statistic”
4. Make a decision: REJECT OR FAIL TO REJECT the null hypothesis
- We cannot “prove” the null hypothesis (there is always some non-zero chance we are incorrect)
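The steps on this slide can be walked through with a small example. The slides have not yet introduced a specific test, so the sketch below assumes a one-sample, two-tailed Z test for a mean, reusing the s/√(N-1) standard error from the estimation slides. The null value of 2.4 people per household is hypothetical; the sample figures are borrowed from the Duluth example only for convenience.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, s, n, mu_null, alpha=0.05):
    """Two-tailed one-sample Z test; returns (z_obtained, z_critical, decision)."""
    # Set the criteria: alpha split across two tails gives the critical Z score.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)

    # Compute the test statistic: distance from the null value in standard errors.
    standard_error = s / math.sqrt(n - 1)
    z_obtained = (sample_mean - mu_null) / standard_error

    # Make a decision: reject only if the statistic falls in a critical region.
    if abs(z_obtained) > z_critical:
        decision = "reject H0"
    else:
        decision = "fail to reject H0"
    return z_obtained, z_critical, decision

# State the hypotheses (hypothetical null value):
#   H0: the population mean is 2.4 people per household
#   H1: the population mean is not 2.4 people per household
z_obtained, z_critical, decision = one_sample_z_test(
    sample_mean=2.3, s=0.35, n=178, mu_null=2.4
)
print(f"Z (obtained) = {z_obtained:.2f}, Z (critical) = +/- {z_critical:.2f}")
print(decision)
```

The function never returns "accept H0": consistent with the last point on the slide, the only outcomes are rejecting or failing to reject the null hypothesis.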