Published by Trevor Kennison. Modified about 1 year ago.

1
Sampling

2
Sampling
Probability sampling
  Based on random selection
Non-probability sampling
  Based on convenience

3
Sampling Miscues: Alf Landon for President (1936)
Literary Digest: post cards to voters in 6 states
Had correctly predicted earlier elections
Names selected from telephone directories and automobile registrations
In 1936, they sent out 10 million post cards
Results picked Landon 57% to Roosevelt 43%
Election: Roosevelt won in the largest landslide
  Roosevelt won 61% of the vote and the Electoral College
Why so inaccurate?: Poor sampling frame
  Led to selection of wealthy respondents

4
Sampling Miscues: Thomas E. Dewey for President (1948)
Gallup uses quota sampling to pick winner
Quota sampling: matches sample characteristics to characteristics of population
Gallup quota samples on the basis of income
In 1948, Gallup picked Dewey to defeat Truman
Reasons:
1. Most pollsters quit polling in October
2. Undecided voters went for Truman
3. Unrepresentative samples: WWII changed society since census

5
Non-probability Sampling
Used in situations where a sampling frame for randomization doesn’t exist
Types of non-probability samples:
1. Reliance on available subjects (convenience sampling)
2. Purposive or judgmental sampling
3. Snowball sampling
4. Quota sampling

6
Reliance on Available Subjects
Person on the street, easily accessible
Examples: mall intercepts, college students, person on the street
Frequently used, but usually biased
Notoriously inaccurate
  Especially in making inferences about larger population

7
Purposive or Judgmental Sampling
Dictated by the purpose of the study
Situational judgments about what individuals should be surveyed to make for a useful or representative sample
E.g., Using college students to study third-person effects regarding rap and metal music
  Third-person effect (3PE): belief that others are more affected by exposure than oneself
  Assessing effects on self and others
  Using college students makes for homogeneity of self

8
Snowball Sampling
Used when population of interest is difficult to locate
  E.g., homeless people
Researcher collects data from a few people in the targeted group
Initially surveyed individuals are asked to name other people to contact
Good for exploration
Bad for generalizability

9
Quota Sampling
Begins with a table of relevant characteristics of the population
  Proportions of gender, age, education, ethnicity from census data
Selecting a sample to match those proportions
Problems:
1. Quota frame must be accurate
2. Sample is not random

10
Probability Sampling
Goal: Representativeness
  Sample resembles larger population
Random selection
  Enhancing likelihood of representative sample
  Each unit of the population has an equal chance of being selected into the sample

11
Population Parameters
Parameter: Summary statistic for the population
  E.g., Mean age of the population
Sample is used to make parameter estimates
  E.g., Mean age of the sample
  Used as an estimate of the population parameter

12
Sampling Error
Every time you draw a sample from the population, the parameter estimate will fluctuate slightly
E.g.:
  Sample 1: Mean age = 37.2
  Sample 2: Mean age = 36.4
  Sample 3: Mean age = 38.1
If you draw lots of samples, the estimates form an approximately normal curve of values
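The fluctuation described on this slide can be simulated. The sketch below is only illustrative: the population of 100,000 ages (mean 37, standard deviation 12), the sample size of 400, and the names `population`, `sample_mean`, and `estimates` are all assumptions, not values from the slides.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 100,000 ages (illustrative values only)
population = [rng.gauss(37, 12) for _ in range(100_000)]
population_mean = statistics.mean(population)


def sample_mean(n):
    """Draw a simple random sample of size n and return its mean age."""
    return statistics.mean(rng.sample(population, n))


# Every new sample gives a slightly different estimate of the parameter
estimates = [sample_mean(400) for _ in range(1_000)]

print("population mean:", round(population_mean, 2))
print("first three sample means:", [round(m, 2) for m in estimates[:3]])
print("mean of all sample means:", round(statistics.mean(estimates), 2))
```

Plotting a histogram of `estimates` should give a roughly bell-shaped curve centered near the population mean.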

13
Normal Curve of Sample Estimates
[Figure: normal curve. Axis labels: frequency of estimated means from multiple samples; estimated mean. Marked: likely population parameter.]

14
Standard Error
The typical distance of sample estimates from the population parameter (formally, the standard deviation of the sample estimates)
About 68% of sample estimates will fall within one standard error of the population parameter
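As a rough check of the 68% figure, the sketch below draws many samples from a hypothetical population, treats the standard deviation of the resulting estimates as the standard error, and counts how many estimates land within one standard error of the parameter. The population values, sample size, and variable names are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for repeatability

# Hypothetical population of ages (illustrative values only)
population = [rng.gauss(37, 12) for _ in range(50_000)]
parameter = statistics.mean(population)

# 2,000 sample means, each from a simple random sample of 400 people
estimates = [statistics.mean(rng.sample(population, 400)) for _ in range(2_000)]

# Standard error: the standard deviation of the sample estimates
standard_error = statistics.pstdev(estimates)

# Share of estimates within one standard error of the parameter
within_one_se = sum(
    abs(m - parameter) <= standard_error for m in estimates
) / len(estimates)

print("standard error:", round(standard_error, 3))
print("share within 1 s.e.:", round(within_one_se, 3))
```

The printed share should come out close to two thirds, though the exact value depends on the seed.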

15
Normal Curve of Sample Estimates
[Figure: normal curve. Axis labels: frequency of estimated means from multiple samples; estimated mean. Marked: population parameter; 1 standard error unit.]

16
Normal Curve of Sample Estimates
[Figure: normal curve. Axis labels: frequency of estimated means from multiple samples; estimated mean. Marked: population parameter; 1 standard error unit; 2/3 of samples.]

17
Standard Error Estimates and Sample Size
As the sample size increases:
  The standard error decreases
  In other words, our sample estimate is likely to be closer to the population parameter
As the sample size increases, we get more confident in our parameter estimate
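For a sample mean, the usual textbook formula is standard error = standard deviation / sqrt(n). The slides do not state this formula, so the sketch below is an illustration under that assumption, with a made-up standard deviation of 12 and a hypothetical helper `standard_error_of_mean`.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


sd = 12.0  # assumed population standard deviation (illustrative)

for n in (100, 400, 1600):
    print(f"n = {n:>4}: standard error = {standard_error_of_mean(sd, n):.2f}")
```

Under this formula, the sample size has to be quadrupled to cut the standard error in half.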

18
Confidence Levels
Two thirds of samples will fall within one standard error of the population parameter
Therefore: a single sample has a 68% chance of being within one standard error
Confidence levels:
  68% sure estimate is within 1 s.e. of parameter
  95% sure estimate is within 2 s.e. of parameter
  99.7% sure estimate is within 3 s.e. of parameter
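These percentages come from the normal curve. A small sketch, assuming the sample estimates are normally distributed, computes the share of the curve within z standard errors of its center using the standard library's `math.erf`; the helper name `share_within` is hypothetical.

```python
import math


def share_within(z):
    """Share of a normal distribution within z standard errors of its center."""
    return math.erf(z / math.sqrt(2))


for z in (1, 2, 3):
    print(f"within {z} s.e.: {share_within(z):.1%}")
```

This gives roughly 68.3%, 95.4%, and 99.7% for 1, 2, and 3 standard errors.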

19
Confidence Interval
Interval that we are 95% confident contains the population parameter
For example, we predict that Candidate X will receive 45% of the vote with a confidence interval of plus or minus 3 percentage points
We are 95% sure the parameter will be between 42% and 48%
Confidence interval shrinks as:
  Standard error gets smaller
  Sample size gets larger
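The Candidate X example can be reproduced with the common normal-approximation interval for a proportion. The slide does not give a sample size, so the 1,000 respondents below are an assumption, as are the function name `proportion_ci` and the use of z = 1.96 for the 95% level.

```python
import math


def proportion_ci(p_hat, n, z=1.96):
    """Approximate confidence interval for a proportion (normal approximation)."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Assumed: 45% support observed among 1,000 respondents (n is hypothetical)
low, high = proportion_ci(0.45, 1_000)
print(f"95% confidence interval: {low:.1%} to {high:.1%}")
```

With these assumed inputs the interval runs from about 42% to 48%, which matches the plus or minus 3 points on the slide.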

20

21
Sample Size & Confidence Interval
How precise does the estimate have to be?
  More precise: larger sample size
Larger samples increase precision
  But at a diminishing rate
Each unit you add to your sample contributes to the accuracy of your estimate
  But the amount it adds shrinks with each additional unit added

22
95% Confidence Intervals
[Table: 95% confidence intervals by % split and sample size (N = 100, 200, 300, 400, 500, 700, 1000); cell values not preserved.]
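The table's cell values did not survive, so the sketch below recomputes comparable figures from the standard normal-approximation formula for a proportion. These are computed values, not the slide's original numbers; the three splits (50/50, 60/40, 70/30), the z value of 1.96, and the helper `margin_of_error` are assumptions.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


sample_sizes = [100, 200, 300, 400, 500, 700, 1000]
splits = [0.5, 0.6, 0.7]  # assumed splits: 50/50, 60/40, 70/30

# Header row, then one row per split, in percentage points
print("split  " + "".join(f"N={n:<6}" for n in sample_sizes))
for p in splits:
    label = f"{round(p * 100)}/{round((1 - p) * 100)}"
    row = "".join(f"±{margin_of_error(p, n) * 100:<7.1f}" for n in sample_sizes)
    print(f"{label:<7}{row}")
```

For a 50/50 split the margin falls from about 9.8 points at N = 100 to about 4.9 at N = 400, but only to about 3.1 at N = 1000, which is the diminishing return in precision.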

23
Sampling Frame
List of units from which sample is drawn
Defines your population
  E.g., List of members of organization or community
Ideally you’d like to list all members of your population as your sampling frame
  Randomly select your sample from that list
Often impractical to list entire population

24
Sampling Frames for Surveys
Limitations of the telephone book:
  Misses unlisted numbers
  Class bias:
    Poor people may not have phone
    Less likely to have multiple phone lines
Most studies use a technique such as Random Digit Dialing as a surrogate for a sampling frame

25
Types of Sampling Designs
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Multi-stage Cluster Sampling

26
Simple Random Sampling
Establish a sampling frame
A number is assigned to each element
Numbers are randomly selected into the sample
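A minimal sketch of these steps, assuming the sampling frame is a Python list of hypothetical member IDs. The function name `simple_random_sample` and the frame contents are made up for illustration; `random.Random.sample` draws without replacement, so every element has the same chance of selection.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Select n distinct elements from the sampling frame, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: 1,000 numbered members
frame = [f"member_{i:04d}" for i in range(1, 1001)]

sample = simple_random_sample(frame, 100, seed=1)
print(len(sample), "selected, e.g.", sample[:3])
```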

27
Systematic Sampling
Establish sampling frame
Select every kth element with random start
  E.g., 1000 on the list, choosing every 10th name yields a sample size of 100
Sampling interval: standard distance between units on the sampling frame
  Sampling interval = population size / sample size
Sampling ratio: proportion of population that is selected
  Sampling ratio = sample size / population size
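The slide's example (1,000 names, every 10th selected) might be sketched as follows. The function name `systematic_sample` and the frame contents are hypothetical, and the sketch assumes the sample size is no larger than the frame.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every kth element after a random start within the first interval.

    Assumes 1 <= n <= len(frame).
    """
    k = len(frame) // n                       # sampling interval
    start = random.Random(seed).randrange(k)  # random start in [0, k)
    return frame[start::k][:n]


# Hypothetical sampling frame of 1,000 names
frame = [f"name_{i:04d}" for i in range(1, 1001)]
sample = systematic_sample(frame, 100, seed=5)

print("sampling interval:", len(frame) // len(sample))  # population / sample
print("sampling ratio:", len(sample) / len(frame))      # sample / population
print("first selections:", sample[:3])
```

One design caveat: if the list has a periodic pattern that lines up with the interval, a systematic sample can be biased, so the ordering of the frame matters.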

28
Stratified Sampling
Modification used to reduce potential for sampling error
Researcher ensures that certain groups are represented proportionately in the sample
  E.g., If the population is 60% female, stratified sample selects 60% females into the sample
  E.g., Stratifying by region of the country to make sure that each region is proportionately represented

29
Two Methods of Stratification
1. Sort population into groups
   Randomly select within groups in proportion to relative group size
2. Sort population into groups
   Systematically select within groups using random start
Disproportionate stratification:
  Some stratification groups can be over-sampled for subgroup analysis
  Samples are then weighted to restore population proportions
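The first method (random selection within groups, proportional to group size) and the weighting step for a disproportionate design could look roughly like this. The 60% female population, the function name `stratified_sample`, and the 50/50 over-sample are illustrative assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n, seed=None):
    """Proportionate stratified sample: random selection within each stratum.

    Rounding each stratum's allocation can make the total differ slightly from n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        share = len(members) / len(units)  # stratum's share of the population
        sample.extend(rng.sample(members, round(n * share)))
    return sample


# Hypothetical population: 60% female ("F"), 40% male ("M")
population = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(population, lambda unit: unit[0], 100, seed=7)
print("females in sample:", sum(1 for group, _ in sample if group == "F"))

# Disproportionate design: over-sample the smaller group, then weight back.
population_share = {"F": 0.6, "M": 0.4}
sample_share = {"F": 0.5, "M": 0.5}  # hypothetical 50/50 over-sample
weights = {g: population_share[g] / sample_share[g] for g in population_share}
print("weights:", weights)
```

Each respondent's weight is the group's population share divided by its sample share, so over-sampled groups count for less than one response each in the weighted totals.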

30
Cluster Sampling
Frequently, there is no convenient way of listing the population for sampling purposes
  E.g., Sample of Dane County or Wisconsin
  Hard to get a list of the population members
Cluster sample
  Sample of census blocks
  List of people for selected census block
  Select sub-sample of people living on each block

31
Multi-stage Cluster Sample
Cluster sampling done in a series of stages:
  List, then sample within
Example:
  Stage 1: List zip codes
    Randomly select zip codes
  Stage 2: List census blocks within selected zip codes
    Randomly select census blocks
  Stage 3: List households on selected census blocks
    Randomly select households
  Stage 4: List residents of selected households
    Randomly select person to interview
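The four stages can be sketched as nested random selections. The nested dictionary standing in for zip codes, blocks, households, and residents is invented data, and `multistage_sample` is a hypothetical helper; a real study would build each list only for the units selected at the previous stage.

```python
import random


def multistage_sample(zip_codes, n_zips, n_blocks, n_households, seed=None):
    """List, then sample within, at each stage.

    zip_codes maps zip -> block -> household -> list of residents.
    """
    rng = random.Random(seed)
    interviews = []
    for z in rng.sample(sorted(zip_codes), n_zips):               # stage 1
        blocks = zip_codes[z]
        for b in rng.sample(sorted(blocks), n_blocks):            # stage 2
            households = blocks[b]
            for h in rng.sample(sorted(households), n_households):  # stage 3
                person = rng.choice(households[h])                # stage 4
                interviews.append((z, b, h, person))
    return interviews


# Invented data: 5 zip codes x 8 blocks x 10 households x 3 residents
zip_codes = {
    f"zip{z}": {
        f"block{b}": {
            f"house{h}": [f"person{z}-{b}-{h}-{r}" for r in range(3)]
            for h in range(10)
        }
        for b in range(8)
    }
    for z in range(5)
}

interviews = multistage_sample(zip_codes, 2, 3, 4, seed=3)
print(len(interviews), "interviews, e.g.", interviews[0])
```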

32
Multi-stage Sampling and Sampling Error
Error is introduced at each stage
One solution is to use stratification at each stage to try to reduce sampling error
