Journalism 614: Sampling and Non-Response

Sampling
Probability sampling: based on random selection
Non-probability sampling: based on convenience

Sampling Miscues: Alf Landon for President (1936)
Literary Digest: post cards to voters in 6 states
Had correctly predicted earlier elections
Names selected from telephone directories and automobile registrations
In 1936, they sent out 10 million post cards
Results picked Landon 57% to Roosevelt 43%
Election: Roosevelt won in the largest landslide
  Roosevelt took 61% of the vote and won the Electoral College
Why so inaccurate?
  Poor sampling frame
  Led to selection of wealthy respondents

Sampling Miscues: Thomas E. Dewey for President (1948)
Gallup had picked the winner in 1936
Used quota sampling: matches sample characteristics to the population
Gallup drew quota samples on the basis of income
In 1948, Gallup picked Dewey to defeat Truman
Reasons:
1. Most pollsters quit polling in October
2. Undecided voters went for Truman
3. Unrepresentative samples: WWII had changed society since the census

Non-probability Sampling
Used in situations where a sampling frame for randomization doesn’t exist
Types of non-probability samples:
1. Reliance on available subjects (convenience sampling)
2. Purposive or judgmental sampling
3. Snowball sampling
4. Quota sampling

Reliance on Available Subjects
Person on the street, easily accessible
Examples: mall intercepts, college students, e-polls
Frequently used, but usually biased
Notoriously inaccurate, especially in making inferences about a larger population, even with many respondents

Purposive or Judgmental Sampling
Dictated by the purpose of the study
Situational judgments about which individuals should be surveyed to make for a useful or representative sample
E.g., using college students to study third-person effects (3PE) regarding rap and metal music
  3PE: others are more affected by exposure than self
  Assessing effects on self and others
  Using college students makes for homogeneity of self

Snowball Sampling
Used when the population of interest is difficult to locate
  E.g., homeless people, meth addicts
Researcher collects data from a few people in the targeted group
Initially surveyed individuals are asked to name other people to contact
Good for exploration
Bad for generalizability

Quota Sampling
Begins with a table of relevant characteristics of the population
  Proportions of gender, age, education, ethnicity from census data
Selects a sample to match those proportions
Problems:
1. Quota frame must be accurate
2. Sample is not random; it can match the population on the quota characteristics, but selection within each cell may still be biased
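
As a rough illustration of the quota table, the sketch below turns population percentages into the number of respondents to recruit in each cell. The cell labels and percentages are hypothetical, not census figures, and `quota_targets` is a name invented for this example.

```python
def quota_targets(percent_by_cell, sample_size):
    """Return the number of respondents to recruit in each quota cell."""
    if sum(percent_by_cell.values()) != 100:
        raise ValueError("cell percentages must add up to 100")
    # Whole respondents per cell, rounded down.
    targets = {cell: pct * sample_size // 100 for cell, pct in percent_by_cell.items()}
    # Leftover fraction per cell, used to hand out the seats lost to rounding.
    remainders = {cell: pct * sample_size % 100 for cell, pct in percent_by_cell.items()}
    shortfall = sample_size - sum(targets.values())
    for cell in sorted(remainders, key=remainders.get, reverse=True)[:shortfall]:
        targets[cell] += 1
    return targets

# Hypothetical quota frame (percent of the population in each cell).
quota_frame = {"women 18-44": 24, "women 45+": 27, "men 18-44": 24, "men 45+": 25}
targets = quota_targets(quota_frame, 500)
for cell, count in targets.items():
    print(f"{cell}: recruit {count}")
```

The counts only fix how many people fall in each cell. Who gets picked inside a cell is still left to the interviewer, which is where quota samples can go wrong.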

Probability Sampling
Goal: Representativeness
  Sample resembles the larger population
Random selection
  Enhances the likelihood of a representative sample
  Each unit of the population has an equal chance of being selected into the sample

Population Parameters
Parameter: summary statistic for the population
  E.g., mean age of the population
Sample allows parameter estimates
  E.g., mean age of the sample
  Used as an estimate of the population parameter

Sampling Error
Every time you draw a sample from the population, the parameter estimate will fluctuate slightly
E.g.:
  Sample 1: Mean age = 37.2
  Sample 2: Mean age = 36.4
  Sample 3: Mean age = 38.1
If you draw lots of samples, the estimates form a normal curve of values

Normal Curve of Sample Estimates
[Figure: normal curve showing the frequency of estimated means from multiple samples. Horizontal axis: estimated mean. The peak marks the likely population parameter.]
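
A small simulation can make the curve of sample estimates concrete. The sketch below assumes a made-up population of 100,000 ages (not real data), draws 1,000 samples of 500 people, and checks where the sample means pile up.

```python
import random
import statistics

rng = random.Random(614)  # fixed seed so the run is repeatable

# Hypothetical population of 100,000 ages (illustrative only, not real data).
population = [rng.randint(18, 90) for _ in range(100_000)]
true_mean = statistics.mean(population)

def sample_mean(n):
    """Draw one simple random sample of size n and return its mean age."""
    return statistics.mean(rng.sample(population, n))

# Draw 1,000 independent samples of 500 people each.
estimates = [sample_mean(500) for _ in range(1000)]

center = statistics.mean(estimates)   # where the estimates pile up
spread = statistics.stdev(estimates)  # how far they typically stray

# Share of estimates within two standard deviations of the center;
# for a roughly normal curve this should be near 95%.
within_two = sum(abs(e - center) <= 2 * spread for e in estimates) / len(estimates)

print(f"population mean: {true_mean:.2f}")
print(f"mean of sample estimates: {center:.2f}")
print(f"spread of estimates: {spread:.2f}")
print(f"share within two spreads: {within_two:.1%}")
```

The individual estimates differ from sample to sample, but they cluster around the population mean, which is why the peak of the curve is the likely parameter value.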

Error and Sample Size
As the sample size increases:
  The error decreases
  In other words, a large-sample estimate is likely to be closer to the population parameter
  We become more confident in our parameter estimate
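
To see the error shrink with sample size, the sketch below simulates polls of 100, 400, and 1,600 respondents and measures how much the estimates vary. The 45% population share is an assumed value chosen for illustration.

```python
import random
import statistics

rng = random.Random(614)  # fixed seed so the run is repeatable
TRUE_SHARE = 0.45         # hypothetical population proportion

def estimate(n):
    """Poll n simulated respondents and return the sample proportion."""
    return sum(rng.random() < TRUE_SHARE for _ in range(n)) / n

spread_by_n = {}
for n in (100, 400, 1600):
    # Repeat the poll 500 times at this sample size.
    estimates = [estimate(n) for _ in range(500)]
    spread_by_n[n] = statistics.stdev(estimates)
    print(f"n={n}: spread of estimates = {spread_by_n[n]:.4f}")
```

Each fourfold increase in sample size should roughly halve the spread, since the standard error of a proportion falls with the square root of n.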

Confidence Interval
Interval within which we are 95% confident the population parameter lies
For example, we predict that Candidate X will receive 45% of the vote, plus or minus 3 percentage points
  We are 95% sure the parameter will be between 42% and 48%
  This is the “margin of error” in a poll
Confidence interval shrinks as:
  Sampling error is smaller
  Sample size is larger
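
The Candidate X example can be reproduced with the standard normal-approximation formula for a proportion. This is a sketch that assumes simple random sampling; the sample size of 1,056 is an assumption chosen because it gives roughly a 3-point margin, and it is not stated in the slides.

```python
import math

def confidence_interval(p_hat, n, z=1.96):
    """Return (low, high, margin) for a sample proportion p_hat from n respondents.

    z=1.96 is the usual normal multiplier for 95% confidence.
    """
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

# Hypothetical poll: 45% support among 1,056 respondents (n is an assumption).
low, high, margin = confidence_interval(0.45, 1056)
print(f"45% plus or minus {margin:.1%}: between {low:.1%} and {high:.1%}")
```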

Sample Size & Confidence Interval
How precise does the estimate have to be?
  More precise: larger sample size
Larger samples increase precision, but at a diminishing rate
  Each unit you add to your sample contributes to the accuracy of your estimate
  But the amount it adds shrinks with each additional unit

95% Confidence Intervals
Margin of error in percentage points (plus or minus), by sample size
% split   N=100   N=200   N=300   N=400   N=500   N=700   N=1000   N=1500
50/50      10.0     7.1     5.8     5.0     4.5     3.8      3.2      2.6
70/30       9.2     6.5     5.3     4.6     4.1     3.5      2.9      2.4
90/10       6.0     4.2     3.5     3.0     2.7     2.3      1.9      1.5
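
The entries in the 50/50 and 70/30 rows are consistent with a common rule of thumb, two standard errors of a proportion, 2 * sqrt(p(1-p)/n). The sketch below recomputes the grid under that assumption; the multiplier of 2 (rather than 1.96) is inferred from the numbers, not stated in the slides.

```python
import math

SAMPLE_SIZES = (100, 200, 300, 400, 500, 700, 1000, 1500)
SPLITS = (0.5, 0.7, 0.9)

def margin_of_error(p, n, z=2.0):
    """Margin of error in percentage points, rounded to one decimal."""
    return round(100 * z * math.sqrt(p * (1 - p) / n), 1)

# One row per split, one column per sample size.
table = {p: [margin_of_error(p, n) for n in SAMPLE_SIZES] for p in SPLITS}

print("split  " + "".join(f"{n:>7}" for n in SAMPLE_SIZES))
for p, row in table.items():
    label = f"{round(p * 100)}/{round((1 - p) * 100)}"
    print(f"{label:<7}" + "".join(f"{m:>7.1f}" for m in row))
```

Reading across any row shows the diminishing returns: going from 100 to 400 respondents halves the margin, while going from 1,000 to 1,500 trims it only slightly.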

Describe Sampling Frame
List of units from which the sample is drawn
Defines your population
  E.g., list of members of the population
Ideally you’d list all members of your population as your sampling frame
  Randomly select your sample from that list
Often impractical to list the entire population

Sampling Frames for Surveys
Limitations of the telephone book:
  Misses unlisted numbers and mobile numbers
  SES and age bias:
    Poor people may not have a phone
    Less likely to have multiple phone lines
    Young people have mobile phone numbers
Most studies use a technique such as Random Digit Dialing as a way around this

Types of Sampling Designs
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Multi-stage Cluster Sampling

Simple Random Sampling
Establish a sampling frame
A number is assigned to each element
Elements are randomly selected into the sample
  Use a random number generator to select every case you need for inclusion
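
A minimal sketch of these steps, assuming a made-up frame of 1,000 numbered entries and a sample of 200:

```python
import random

rng = random.Random(614)  # fixed seed so the draw is repeatable

# Hypothetical sampling frame: one numbered entry per population member.
frame = [f"person_{i:04d}" for i in range(1, 1001)]

# Random.sample draws without replacement, so every element has the same
# chance of selection and nobody is picked twice.
sample = rng.sample(frame, 200)

print(sample[:5])
```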

Systematic Sampling
Establish a sampling frame
Select every kth element with a random start
  E.g., 1000 on the list, choosing every 5th name yields a sample size of 200
Sampling interval: standard distance between selected units on the sampling frame
  Sampling interval = population size / sample size
Sampling ratio: proportion of the population selected
  Sampling ratio = sample size / population size
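
The 1,000-name example can be sketched as follows. The frame is hypothetical, `systematic_sample` is a name invented for this example, and the code assumes the population size divides evenly by the sample size.

```python
import random

def systematic_sample(frame, sample_size, rng):
    """Select every kth element of the frame, starting at a random position."""
    interval = len(frame) // sample_size  # sampling interval k
    start = rng.randrange(interval)       # random start within the first interval
    return frame[start::interval][:sample_size]

rng = random.Random(614)
frame = list(range(1, 1001))  # hypothetical list of 1,000 numbered names
sample = systematic_sample(frame, 200, rng)

sampling_interval = len(frame) // len(sample)  # population size / sample size
sampling_ratio = len(sample) / len(frame)      # sample size / population size
print(sampling_interval, sampling_ratio, sample[:5])
```

The random start matters: without it, the first name on the list would always be chosen and the names between intervals never could be.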

Stratified Sampling
Modification used to reduce the potential for sampling error
Researcher ensures that certain groups are represented proportionately in the sample
  E.g., if the population is 60% female, a stratified sample selects 60% females into the sample
  E.g., stratifying by region of the country to make sure that each region is proportionately represented
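
The 60% female example might look like this in code. This is a sketch of proportionate stratification under assumed data: the frame of 600 women and 400 men is invented, and `stratified_sample` is a hypothetical helper.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, sample_size, rng):
    """Draw a random sample within each stratum, sized to its population share."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Proportionate allocation; rounding can shift the total by a unit
        # when the shares do not divide evenly.
        quota = round(sample_size * len(members) / len(frame))
        sample.extend(rng.sample(members, quota))
    return sample

rng = random.Random(614)
# Hypothetical frame: 600 women and 400 men, each tagged with a stratum label.
frame = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(frame, lambda unit: unit[0], 100, rng)

women = sum(1 for label, _ in sample if label == "F")
print(f"women: {women}, men: {len(sample) - women}")
```

Selection is still random inside each stratum, which is what separates this design from quota sampling.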

Cluster Sampling
Frequently, there is no convenient way of listing the population for sampling
  E.g., sample of Dane County or Wisconsin
  Hard to get a list of the population members
Cluster sample:
  Sample of census blocks
  List census blocks, then list people for the selected blocks
  Select a sub-sample of people living on each block

Multi-stage Cluster Sample
Cluster sampling done in a series of stages: list, then sample within
Example:
  Stage 1: List zip codes
    Randomly select zip codes
  Stage 2: List census blocks within selected zip codes
    Randomly select census blocks
  Stage 3: List households on selected census blocks
    Randomly select households
  Stage 4: List residents of selected households
    Randomly select a person to interview
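
The four stages can be sketched as nested random draws. Everything about the frame below is hypothetical (20 zip codes, 10 blocks each, 15 households per block), and in practice the lists at each stage would be built only for the clusters already selected rather than for the whole area.

```python
import random

rng = random.Random(614)  # fixed seed so the draw is repeatable

def build_frame():
    """Build a hypothetical nested frame: zip -> block -> household -> residents."""
    frame = {}
    for z in range(20):
        blocks = {}
        for b in range(10):
            households = {}
            for h in range(15):
                size = rng.randint(1, 4)  # made-up household size
                households[f"household_{z}_{b}_{h}"] = [
                    f"resident_{z}_{b}_{h}_{r}" for r in range(size)
                ]
            blocks[f"block_{z}_{b}"] = households
        frame[f"zip_{z}"] = blocks
    return frame

def multistage_sample(frame, n_zips, n_blocks, n_households):
    """List, then sample within, at each of the four stages."""
    interviews = []
    for zip_code in rng.sample(sorted(frame), n_zips):                    # stage 1
        blocks = frame[zip_code]
        for block in rng.sample(sorted(blocks), n_blocks):                # stage 2
            households = blocks[block]
            for household in rng.sample(sorted(households), n_households):  # stage 3
                interviews.append(rng.choice(households[household]))      # stage 4
    return interviews

frame = build_frame()
interviews = multistage_sample(frame, n_zips=5, n_blocks=3, n_households=4)
print(len(interviews), interviews[:3])
```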

Nonresponse
Declining contact and cooperation rates
  Especially for “gold standard” RDD national telephone surveys
Early research suggests the issues are rather small, with little bias in results
  Examined by comparing “easy to contact” individuals to “hard to contact” individuals
  A more systematic version compares a standard 5-day survey with a “rigorous” survey

Accelerating Problem
Survey firms report increasingly high rates of non-contact and non-cooperation
  Americans leading increasingly busy lives
  More and more unsolicited calls to the home
  Sophisticated technologies to avoid calls
Big drop-offs in recent years
  Call screening (I only take known callers)
  Cell phones (I pay for minutes during survey)

Hard to Gauge the Effect
Initial work conducted in the late 1990s
Curtin et al.: low-effort “restricted call” design versus high-effort “all call” design
  Saw no difference in population estimates
Keeter et al.: two parallel surveys, one standard 5-day and one “rigorous”
  On average, a two percentage point difference
Seems to suggest that a lower response rate does not affect survey quality

Non-response in this Century
A lot has changed in the last decade-plus
  More legislative restrictions
  More mobile technologies
  More VOIP technologies
Re-ran the study and found similar results comparing 5-day and rigorous designs
  5-day: 10 call backs, one refusal conversion
  Rigorous: 21 weeks, advance letters, left messages, additional call backs, etc.
  Little difference in findings

The Problem of Cell Phones
In 2006, 13% of HHs were cell phone only
  Had been increasing 1 percentage point every six months
  Increasing 2 percentage points every six months after 2006
By 2015, 46% of U.S. adults lived in cellphone-only HHs
  64% of Millennials (born ) are cellphone-only
Bias in terms of who is missed is most prominent among young people
  “Serious coverage problem”
  “Particular challenge”

Big differences in wireless-only HHs by Age and SES
Age:
  Only 16% of those 65+
  Nearly 70% among 25-29
SES:
  Just over 40% among “not poor”
  Over 50% for “near poor”
  Nearly 60% for the “poor”
This creates systematic biases

Some substantial differences
Big differences between cell and non-cell respondents on a range of questions
Especially for issues that affect younger people, and behaviors such as voting
  Registered to vote?
  Political knowledge?
  Media usage?

Strive for Higher Response Rate

To Achieve a High Response Rate
Incentives: gifts or drawings for completion
  Prize drawing vs. smaller incentives to all
  Donating to a charity as an inducement
  Run experiments to see which incentive works
Online: pre-invitation and landing page
  Test different versions to see what encourages respondents to click on the survey link
Reminders and follow-ups to boost response
  In general, you only want to send one or two reminders
Make the online survey friendly for all devices and browsers
  Make it usable on mobile or tablet
  See if there is a high bounce rate from particular devices
