Statistics for Economists, Ch. 14: Sample Surveys
1. Population and Sample
2. Problems of Surveys
3. Sample Selection Methods
4. The Gallup Poll
5. Telephone Surveys
6. Probability Error


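1. Population and Sample
- Population: the whole group under study. A numerical fact about the population is a parameter.
- Sample: the part of the population that is examined, obtained by sampling. A number computed from the sample is a statistic.
- Inference: using the sample statistic to draw conclusions about the population parameter.
- Estimator (test statistic), prior meaning: a random variable, considered before the sample is drawn.
- Estimate (value of the test statistic), posterior meaning: the realized value, computed after the sample is drawn.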
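The estimator/estimate distinction can be sketched in a few lines of Python. The population below (6,000 ones and 4,000 zeros) and the sample size are invented for illustration. The function is the estimator (a rule), and the number it returns for one particular sample is the estimate (a realized value).

```python
import random


def sample_proportion(sample):
    """Estimator: a rule that maps any sample to a number."""
    return sum(sample) / len(sample)


rng = random.Random(0)

# Hypothetical population: 1 = has the characteristic, 0 = does not.
population = [1] * 6_000 + [0] * 4_000
parameter = sum(population) / len(population)  # 0.6, unknown in a real survey

sample = rng.sample(population, 100)   # sampling without replacement
estimate = sample_proportion(sample)   # realized value for this one sample

print(f"parameter = {parameter}, estimate = {estimate}")
```

Rerunning with a different seed gives a different estimate from the same estimator, while the parameter stays fixed.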


2. Problems of Surveys
Prediction for the election of 1936: The Digest vs. Gallup

Survey                                            Roosevelt's percentage (%)   Predicted winner
The election result                               62                           Roosevelt wins
The Digest's prediction of the election result    43                           Landon wins
Gallup's prediction of the Digest's prediction    44
Gallup's prediction of the election result        56                           Roosevelt wins

Gallup's overwhelming victory
- There were serious problems in the Digest's sample selection methods.
- Gallup even predicted, in advance, the wrong prediction the Digest would make.
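The size of each miss follows directly from the percentages on this slide. A small sketch of that arithmetic (the variable names are only illustrative):

```python
# Roosevelt's percentage, taken from the slide.
actual = 62
digest_prediction = 43
gallup_prediction = 56
gallup_forecast_of_digest = 44

# Error in predicting the election, in percentage points.
digest_error = actual - digest_prediction      # 19
gallup_error = actual - gallup_prediction      # 6

# How closely Gallup anticipated the Digest's own prediction.
forecast_of_digest_error = gallup_forecast_of_digest - digest_prediction  # 1

print(digest_error, gallup_error, forecast_of_digest_error)
```

The Digest missed by 19 points and Gallup by 6, while Gallup's forecast of the Digest's figure was off by only 1 point.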

Problem (1): sample selection bias
Sample selection methods of the Digest:
- The survey used addresses from telephone directories and club membership lists.
- In those days, telephones and club memberships were marks of the rich.
- The poor were excluded from the sample.
- The Digest got results unfavorable to Roosevelt, who was popular among the poor.
- A systematic tendency that comes from how the sample is selected is called sample selection bias.
- The sample was not selected in a fair way.
- The sample does not represent the population.
If there is sample selection bias, the sample does not represent the population. The problem is not solved just by increasing the size of the sample.
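A small simulation can illustrate why a bigger sample does not cure selection bias. All numbers below are invented: a population in which 62% support the candidate overall, but only 40% among the people a telephone-based list can reach. This is a sketch under those assumptions, not a reconstruction of the Digest's data.

```python
import random

rng = random.Random(1936)

# Hypothetical population (invented numbers). 1 = supports the candidate.
phone_owners = [1] * 10_000 + [0] * 15_000   # 40% support
others = [1] * 52_000 + [0] * 23_000         # about 69% support
population = phone_owners + others
true_pct = 100 * sum(population) / len(population)  # 62.0


def pct_of_ones(sample):
    return 100 * sum(sample) / len(sample)


# Biased frame: only phone owners can be drawn, at growing sample sizes.
biased = {n: pct_of_ones(rng.sample(phone_owners, n)) for n in (100, 1_000, 10_000)}

# Fair frame: a simple random sample from the whole population.
fair = pct_of_ones(rng.sample(population, 1_000))

for n, pct in biased.items():
    print(f"biased frame, n={n}: {pct:.1f}%")
print(f"fair frame, n=1000: {fair:.1f}%  (true value {true_pct:.1f}%)")
```

As n grows, the biased estimates settle near 40%, not 62%: a larger sample only measures the wrong group more precisely.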

Problem (2): non-response bias
Sample selection procedure of the Digest:
- Only 2.4 million people bothered to reply, out of the 10 million who got the questionnaire.
- Lower-income and upper-income people tend not to respond to questionnaires, so the middle class was over-represented among respondents.
- The respondents do not represent the sample.
Non-respondents can be very different from respondents (non-response bias). If the percentage of non-respondents is high, you should consider the possibility of non-response bias.
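The response rate comes straight from the figures above. The second half of this sketch uses invented group sizes and invented response rates to show how unequal response rates distort who ends up among the respondents.

```python
from fractions import Fraction

# Figures from the slide.
mailed = 10_000_000
replied = 2_400_000
response_rate = replied / mailed  # 0.24

# Hypothetical sample of 1,000 people and hypothetical response rates.
sample_counts = {"lower": 300, "middle": 400, "upper": 300}
response_rates = {
    "lower": Fraction(1, 10),
    "middle": Fraction(1, 2),
    "upper": Fraction(1, 10),
}

respondents = {g: sample_counts[g] * response_rates[g] for g in sample_counts}
total_respondents = sum(respondents.values())

middle_share_in_sample = Fraction(sample_counts["middle"], sum(sample_counts.values()))
middle_share_in_respondents = respondents["middle"] / total_respondents

print(f"response rate: {response_rate:.0%}")
print(f"middle class: {float(middle_share_in_sample):.0%} of sample, "
      f"{float(middle_share_in_respondents):.0%} of respondents")
```

Under these assumed rates the middle group is 40% of the sample but about 77% of the respondents.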


3. Sample Selection Methods
Quota sampling and simple random sampling
- Quota sampling: the sample is hand-picked to resemble the population with respect to some key characteristics. But there may be unintentional bias on the part of the interviewers.
- Simple random sampling: drawing at random without replacement. But in practice it is difficult to do from the whole population, so multistage cluster sampling is used.

Multistage cluster sampling
[Figure: map divided into the regions W, MW, NE, S]
- 1st stage: towns in the Northeast
- 2nd stage: wards
- 3rd stage: an electoral district
- 4th stage: households
Sample selection method of Gallup:
- Allocate the sample size according to the region.
- At each stage, select the sample by probability methods.
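The four stages can be sketched as nested random draws. The frame below (4 towns, 3 wards each, 3 districts each, 5 households each) and the number drawn at each stage are invented. Real designs use unequal cluster sizes and adjusted probabilities, which this sketch leaves out.

```python
import random

rng = random.Random(7)

# Hypothetical frame: town -> ward -> electoral district -> households.
frame = {
    f"town{t}": {
        f"ward{w}": {
            f"district{d}": [f"t{t}-w{w}-d{d}-h{h}" for h in range(5)]
            for d in range(3)
        }
        for w in range(3)
    }
    for t in range(4)
}


def multistage_sample(frame, rng, n_towns=2, n_wards=2, n_districts=1, n_households=2):
    """Draw at random, without replacement, at every stage."""
    chosen = []
    for town in rng.sample(sorted(frame), n_towns):
        wards = frame[town]
        for ward in rng.sample(sorted(wards), n_wards):
            districts = wards[ward]
            for district in rng.sample(sorted(districts), n_districts):
                chosen.extend(rng.sample(districts[district], n_households))
    return chosen


households = multistage_sample(frame, rng)
print(households)
```

With the default settings the sketch returns 2 x 2 x 1 x 2 = 8 households, and the interviewer has no say in which ones.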

Sample selection by a probability method
- Examples: simple random sampling, multistage cluster sampling
Features:
- The interviewers have no discretion at all as to whom they interview.
- There is a definite procedure for selecting the sample, and it involves the planned use of probability.
- With a probability method it is possible to compute the probability that any particular individual in the population will get into the sample.
- Because this lets other people reproduce the research, it gives grounds for judging the significance of the results.
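For simple random sampling of n units from N, the chance that a particular individual gets into the sample is n/N. The sketch below checks this by simulation for a toy case (N = 10, n = 3, both chosen only for illustration).

```python
import random

rng = random.Random(42)

N, n = 10, 3                    # toy population and sample sizes
population = list(range(N))
exact = n / N                   # inclusion probability under simple random sampling

# Repeat the draw many times and count how often individual 0 is included.
trials = 20_000
hits = sum(0 in rng.sample(population, n) for _ in range(trials))
observed = hits / trials

print(f"exact = {exact}, observed = {observed:.3f}")
```

The observed frequency should land close to 0.3. No such calculation is possible when interviewers choose subjects by judgment.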

Multistage cluster sampling: deciding the sampling probability
It is necessary to adjust the sampling probabilities according to the size of the sampling units.
[Figure: zone A in Seoul, zone B in Pusan]
- When 3 zones are selected from the 30 zones of Seoul and Pusan, the probability of selecting any one zone = 3/30 = 1/10 = the probability that zone A is selected = the probability that zone B is selected.
- For this to hold, the share of selections from Seoul = 2/3 and the share of selections from Pusan = 1/3.
It is desirable to make the probability of sampling each individual the same.
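The arithmetic can be checked with exact fractions. The split of 20 zones in Seoul and 10 in Pusan is an assumption; the slide gives only the total implied by 3/30 and the shares 2/3 and 1/3. The sketch compares choosing the city in proportion to its size with choosing the two cities with equal probability.

```python
from fractions import Fraction

# Assumed zone counts, consistent with the shares 2/3 and 1/3 on the slide.
zones = {"Seoul": 20, "Pusan": 10}
total = sum(zones.values())          # 30
n_selected = 3

# Chance that any given zone is among the 3 selected out of 30.
p_zone = Fraction(n_selected, total)                 # 1/10

# Two-stage view of one draw: pick a city, then a zone within it at random.
p_city = {c: Fraction(k, total) for c, k in zones.items()}           # 2/3, 1/3
p_zone_per_draw = {c: p_city[c] * Fraction(1, zones[c]) for c in zones}

# If both cities were equally likely instead, zones would not be treated alike.
p_zone_equal_city = {c: Fraction(1, 2) * Fraction(1, zones[c]) for c in zones}

print(p_zone, p_city, p_zone_per_draw, p_zone_equal_city)
```

With size-proportional city probabilities every zone has the same 1/30 chance per draw. With equal city probabilities a Pusan zone (1/20) would be twice as likely as a Seoul zone (1/40).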

Multistage cluster sampling: probability adjustment of Gallup
Stages: towns, wards, precincts
- At each stage, set the sampling probabilities in proportion to the number of households.
- The probability of selecting a household is then the same regardless of the town, ward, or precinct to which the household belongs.
Household bias: if the sampling probability is based on the number of households, not the number of household members, some bias is generated. A member of a large household is less likely to be selected than a member of a small household.
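Household bias can be made concrete with exact fractions. The sketch assumes that every household has the same selection probability (the value 1/1000 is invented) and that one member of a selected household is then interviewed at random. The second assumption is not stated on the slide.

```python
from fractions import Fraction

p_household = Fraction(1, 1000)      # hypothetical, equal for every household
household_sizes = [1, 2, 4]

# Chance that a given person is interviewed, assuming one member is picked
# at random from each selected household.
p_person = {size: p_household / size for size in household_sizes}

# Weighting each response by household size evens out the chances.
weighted = {size: p_person[size] * size for size in household_sizes}

for size in household_sizes:
    print(f"size {size}: p_person = {p_person[size]}, weighted = {weighted[size]}")
```

A person in a four-member household has a quarter of the chance of a person living alone. Weighting by household size is one common correction, shown here only as an illustration.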


4. The Gallup Poll
The Gallup Poll predicts very accurately from a small sample by using probability methods.
- To minimize bias, an impartial and objective probability method should be used to choose the sample.
- Judgment and choice usually show bias, while probability is impartial.
Even with a probability method, you cannot exclude all bias:
- Eligible voters who do not vote
- Eligible voters who have not made a decision
- Response bias, non-response bias, etc.
To deal with these problems, Gallup uses many survey questions.

[Figure: The Gallup Poll questionnaire for the 1984 election, USA]

[Figure: The Gallup Poll questionnaire for the 1997 election, Korea]

Non-voters
When predicting an election result, it is good to screen non-voters out of both the population of interest and the sample. People tend to say they will vote even when they will not.
Gallup questionnaire examples:
- Where do you go to vote?
- Did you vote in the previous election?
If you screen out the people who are not likely to vote, or give them less weight, you can predict the election result more accurately.

The questionnaire of Gallup
[Figure: The Gallup Poll ballot, 1984. DEMOCRATIC: Mondale & Ferraro. REPUBLICAN: Reagan & Bush]
Ex. In politics, as of TODAY, do you consider yourself a Republican, Democrat, or Independent?

Response bias & non-response bias
- Response bias: the answers given by respondents are influenced to some extent by the phrasing of the questions, and even by the tone or attitude of the interviewer. Remedy: all interviewers use the same questionnaire, and the interview procedure is standardized as far as possible.
- Non-response bias: many subjects are missed, and the subjects who are missed tend to differ from those who are interviewed. Remedy: give more weight to the subjects who were available but hard to get.

Checking data & putting weights
The Gallup sample usually includes proportionately too many people with higher education.
- Sample: below high school 20%, above high school 80%
- Population: below high school 25%, above high school 75%
Less weight is put on the responses of more educated subjects: less educated subjects (below high school) = weight 1, more educated subjects (above high school) = weight 3/4.
This weighting is an objective arithmetic technique, applied to the sample after it is chosen, to compensate for various small biases in the sampling procedure. Quota sampling, by contrast, is a method for choosing the sample, and it has a large subjective component.
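The weights 1 and 3/4 can be checked with exact fractions. The sketch reads the sample as 20% / 80% and the population as 25% / 75%, which is the reading under which the stated weights work out. The group labels are only illustrative.

```python
from fractions import Fraction

population_share = {"below_hs": Fraction(25, 100), "above_hs": Fraction(75, 100)}
sample_share = {"below_hs": Fraction(20, 100), "above_hs": Fraction(80, 100)}

# Raw weight = population share / sample share, then rescaled so that the
# below-high-school group has weight 1.
raw = {g: population_share[g] / sample_share[g] for g in sample_share}
weights = {g: raw[g] / raw["below_hs"] for g in raw}

# Apply the weights to the sample and renormalize.
weighted = {g: sample_share[g] * weights[g] for g in sample_share}
total = sum(weighted.values())
weighted_share = {g: weighted[g] / total for g in weighted}

print(weights)          # below_hs: 1, above_hs: 3/4
print(weighted_share)   # below_hs: 1/4, above_hs: 3/4
```

After weighting, the sample has the same 25% / 75% education split as the population.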

Others
Other factors that can affect political tendency are included in the questionnaire:
- Common items: education, political party, age, sex, job, etc.
- Gallup, USA: race and religion
- Gallup, Korea: hometown of grandfather
Some redundancy is built into the questionnaire, so the answers can be checked for consistency. Inconsistencies suggest the interviewer may not be doing the job properly. A small percentage of the subjects are reinterviewed by administrative staff, as a further check on the quality of the work.


5. Telephone Surveys
- About 1/3 of residential telephones are unlisted. Rich people and poor people are more likely to have unlisted numbers, so the telephone book tilts toward the middle class.
- Gallup: there are 4 time zones in the U.S. The Gallup Poll divided each zone into 3 types of areas, according to population density. That gives 12 strata. Within each stratum, the Gallup Poll drew a simple random sample of telephone numbers, using random digit dialing.
- Non-respondents create problems, as usual. Gallup does most of its interviewing on evenings and weekends, when people are likely to be at home.
- The cost is about 1/3 as much as interviewing in person, and the predictions are fairly well on target.
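The stratified design can be sketched as 4 x 3 = 12 strata with a random draw of numbers inside each. The density labels, the "555-xxx" prefixes, and the 5 numbers per stratum are all invented. An actual random digit dialing design would start from real area codes and exchanges.

```python
import itertools
import random

rng = random.Random(5)

time_zones = ["Eastern", "Central", "Mountain", "Pacific"]
density_types = ["high", "medium", "low"]          # labels are assumptions
strata = list(itertools.product(time_zones, density_types))


def random_digit_numbers(prefix, k, rng):
    """Draw k distinct numbers by appending 4 random digits to a prefix."""
    suffixes = rng.sample(range(10_000), k)
    return [f"{prefix}-{s:04d}" for s in suffixes]


# One hypothetical prefix per stratum, 5 numbers drawn within each.
phone_sample = {
    stratum: random_digit_numbers(f"555-{i:03d}", 5, rng)
    for i, stratum in enumerate(strata)
}

print(len(strata))
print(phone_sample[("Eastern", "high")])
```

Because the last digits are generated at random instead of being read from a directory, unlisted numbers can be reached as well.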


6. Probability Error
Even when there is no bias at all, the estimate differs from the true value. The estimate deviates from it by the probability error (also called chance error).
EX) Estimating the percentage of 1's in a box
- Population: a box with a large number of tickets, some marked 1 and the others marked 0.
- Sample: tickets drawn at random without replacement.
Even when there is no problem with response, probability error exists.
Percentage of 1's in sample = percentage of 1's in box + probability error
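The box model can be simulated directly. The box contents (60% ones out of 100,000 tickets), the sample size of 1,000, and the 200 repetitions are invented for illustration. Each repetition draws without replacement and records the probability error for that sample.

```python
import random

rng = random.Random(14)

# Hypothetical box: 100,000 tickets, 60% marked 1 and 40% marked 0.
box = [1] * 60_000 + [0] * 40_000
box_pct = 100 * sum(box) / len(box)          # 60.0


def sample_pct(n):
    """Percentage of 1's among n tickets drawn at random without replacement."""
    draws = rng.sample(box, n)
    return 100 * sum(draws) / n


# probability error = percentage of 1's in sample - percentage of 1's in box
errors = [sample_pct(1_000) - box_pct for _ in range(200)]

mean_error = sum(errors) / len(errors)
largest_error = max(abs(e) for e in errors)

print(f"mean error = {mean_error:.2f} points, largest = {largest_error:.2f} points")
```

There is no bias in this procedure, so the errors average out close to zero, yet an individual sample is still off by a point or two.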

