
1 Stat 512 Day 6: Sampling

2 Last Time Get lots of sleep! Characteristics of the distribution of a quantitative variable  Shape, center, spread, outliers (in context) “Formal” analysis for comparing two groups: statistical significance  What is the distribution of the “by chance” results?

3 Statistical Significance Calculate the difference in means  Could a difference this large happen by chance? Can use simulation to mimic the randomization process, assuming no difference between the groups See how often you get a difference at least as large by chance alone (no treatment effect)  p-value, statistical significance Consider study design to decide whether to draw a causal conclusion
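The simulation described on this slide can be sketched in Python as follows. This is a minimal illustration, not the course's own procedure: the two groups of numbers are made up, and the function name `randomization_p_value` is hypothetical. It pools the responses, reshuffles them into two groups many times (assuming no difference between the groups), and counts how often the reshuffled difference in means is at least as large as the observed one.

```python
import random


def randomization_p_value(group_a, group_b, reps=10_000, seed=0):
    """Approximate one-sided p-value for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    at_least_as_large = 0
    for _ in range(reps):
        # Re-randomize: under "no difference", group labels are arbitrary.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            at_least_as_large += 1
    return observed, at_least_as_large / reps


# Hypothetical data, invented only for illustration.
group_a = [15, 18, 21, 19, 22, 17]
group_b = [10, 12, 14, 11, 16, 13]

observed, p_value = randomization_p_value(group_a, group_b)
print("Observed difference in means:", observed)
print("Approximate p-value:", p_value)
```

A small p-value says a difference this large would rarely arise from the random assignment alone. Whether that supports a causal conclusion still depends on the study design.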

4 Statistical Significance

5 Example 2 – Day 5 Actual study Hypothetical data

6 Example 2 – Day 5

7 Statistical Process Compare results Randomized? Getting the observational units in the first place! Explanatory Variable

8 Statistical Process Compare results Randomized?

9 Example 1: Sampling Words Circle 10 representative words Def: A parameter is a numerical characteristic of the population: π, μ, σ (pi, mu, sigma) Def: A statistic is a numerical characteristic of the sample: x̄, p̂, s (x-bar, p-hat, s)

10 Example 1: Sampling Words Does our sampling method generally lead to good estimates of the parameter? Sample results vary from sample to sample! A sampling method is unbiased if the distribution of the sample statistics is centered at the population value.

11 Bias Literary Digest (p. 21) Bad Sampling Frame Voluntary response bias  Those who chose to respond are most likely to feel strongly, usually negatively, on the issue. Nonresponse bias  Those who aren’t home or who don’t have listed numbers or who refuse to participate Convenience sample  Those who are easy to get a hold of, easily remembered

12

13 Example 1: Sampling Words Def: A simple random sample gives every word in the population an equal probability of being selected.  Every sample of n words is as likely as any other sample of n words.

14 Example 1: Sampling Words Selecting a simple random sample MTB> set c1 DATA> 1:268 DATA> end MTB> sample 5 c1 c2 Find the corresponding ID numbers of the sampling frame (from webpage) Determine the average length of the 5 words in your sample
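For readers without Minitab, the same selection can be sketched in Python. This is an illustrative equivalent rather than part of the original handout: the seed is arbitrary, and the five words in `looked_up_words` are a hypothetical stand-in for whatever words the sampled ID numbers point to in the sampling frame.

```python
import random
from statistics import mean

rng = random.Random(512)  # arbitrary seed, only so the draw can be repeated

# Same idea as "set c1 / 1:268 / end" followed by "sample 5 c1 c2":
# draw 5 distinct ID numbers from 1..268, every set of 5 equally likely.
frame_ids = range(1, 269)
sample_ids = sorted(rng.sample(frame_ids, 5))
print("Sampled ID numbers:", sample_ids)


def average_length(words):
    """Mean number of letters per word in the sample."""
    return mean(len(word) for word in words)


# Hypothetical stand-in for the 5 words looked up in the sampling frame.
looked_up_words = ["nation", "we", "dedicate", "the", "field"]
print("Average word length:", average_length(looked_up_words))
```

Because `rng.sample` draws without replacement, no ID number can appear twice, which matches the definition of a simple random sample.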

15 Example 2: Sampling Words (cont.) What is the long-term pattern of these sample means?  Def: A sampling distribution of a statistic is the distribution of the sample statistic for all possible samples (of the same size) from the population.  An empirical sampling distribution gives you an idea of the pattern from a large number of samples of the same size

16 Summary Values of sample statistics vary from sample to sample – sampling variability  Random sampling error Sampling distribution = distribution of sample statistics (from all possible random samples)  Observational units = samples  Variable = sample statistics (e.g., sample means)  Sampling method is unbiased if sampling distribution is centered at parameter of interest Random samples are unbiased and allow us to estimate the size of the random sampling error  Sampling distribution follows a predictable pattern
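The summary can be checked with an empirical sampling distribution. The sketch below rests on assumptions: the population of 268 word lengths is invented (it is not the actual sampling frame), and the "length-biased" method, which picks words with probability proportional to their length and with replacement, is only a rough stand-in for a person's eye being drawn to longer words. Each simulated sample is one observational unit and its sample mean is the variable.

```python
import random
from statistics import mean

rng = random.Random(6)  # arbitrary seed for repeatability

# Hypothetical population: word length -> number of words (268 words total).
counts = {1: 10, 2: 40, 3: 50, 4: 60, 5: 35, 6: 25,
          7: 20, 8: 12, 9: 10, 10: 4, 11: 2}
population = [length for length, k in counts.items() for _ in range(k)]
mu = mean(population)  # the parameter: population mean word length


def srs_mean(n):
    """Sample mean from a simple random sample of n words."""
    return mean(rng.sample(population, n))


def length_biased_mean(n):
    """Sample mean when longer words are more likely to be picked."""
    return mean(rng.choices(population, weights=population, k=n))


reps = 5000
srs_means = [srs_mean(5) for _ in range(reps)]
biased_means = [length_biased_mean(5) for _ in range(reps)]

srs_center = mean(srs_means)
biased_center = mean(biased_means)
print("Population mean:", mu)
print("Center of SRS sample means:", srs_center)
print("Center of length-biased sample means:", biased_center)
```

The simple random samples should center near the population mean, while the length-biased samples center well above it. That gap is what "biased" means for a sampling method.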

17 Statistical Significance This consistent pattern helps us to decide when we might have a surprising value for the sample statistic.  Level of surprise depends on sample size p-value indicates how often a random sample would lead to a value of the sample statistic at least as extreme  Is sample statistic result “significantly” different from population parameter?
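To see why the level of surprise depends on sample size, one can estimate how often a random sample yields a mean at least a fixed distance above the parameter. The sketch below is illustrative only: the population of word lengths is hypothetical, the gap of 1.0 letters is an arbitrary choice, and `proportion_at_least` is a made-up helper name.

```python
import random
from statistics import mean

rng = random.Random(17)  # arbitrary seed for repeatability

# Hypothetical population: word length -> number of words (268 words total).
counts = {1: 10, 2: 40, 3: 50, 4: 60, 5: 35, 6: 25,
          7: 20, 8: 12, 9: 10, 10: 4, 11: 2}
population = [length for length, k in counts.items() for _ in range(k)]
mu = mean(population)

observed_gap = 1.0  # hypothetical: sample mean one letter above mu


def proportion_at_least(n, threshold, reps=5000):
    """Share of simple random samples of size n with mean >= threshold."""
    hits = sum(mean(rng.sample(population, n)) >= threshold
               for _ in range(reps))
    return hits / reps


prop_n5 = proportion_at_least(5, mu + observed_gap)
prop_n20 = proportion_at_least(20, mu + observed_gap)
print("n = 5:  proportion at least as extreme =", prop_n5)
print("n = 20: proportion at least as extreme =", prop_n20)
```

The same gap is far less common with the larger sample, so it counts as more surprising there. Each proportion plays the role of a one-sided p-value for that sample size.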

18 Example 3: Comparison Shopping

19 Example Lost ticket, would you buy another? Lost $20, would you buy another? Lives saved? Lives lost? Prediction: more likely if lost ticket Prediction: Option A more likely when framed in terms of lives saved

20 Nonsampling Errors March 6-8, 2004 Wall Street Journal/NBC poll of 1,018 adults GAY MARRIAGE opinions depend on how the question is asked. To one poll question, a 52%-43% majority opposes a constitutional amendment "making it illegal for gay couples to marry." A 54%-42% majority responds favorably to a second query that omits the word "illegal" and more benignly asks about an amendment "that defined marriage as a union only between a man and a woman."

21 Sources of Nonsampling Errors Sensitive questions  Social acceptability Wording of question  Appearance of interviewer Order of choices Unsure response, change mind, faulty memory

22 For Tuesday Submit your tentative project proposal (see syllabus for additional guidelines) Submit PP 6 in Blackboard Read Sec. 4.1 and 4.2 Complete Example 3 from the Day 6 handout

23 Project Discussion

