# Chapter 12 Sample Surveys

Richard Dong & Stanley Chen

Vocabulary

- Population: the entire group of individuals or instances about whom we hope to learn.
- Sample: a (representative) subset of a population, examined in the hope of learning about the population.
- Sample survey: a study that asks questions of a sample drawn from some population in the hope of learning about the entire population. Polls taken to assess voter preferences are common examples.
- Bias: any systematic failure of a sampling method to represent its population. Biased sampling methods tend to over- or underestimate parameters. It is almost impossible to recover from bias, so efforts to avoid it are well spent.
- Randomization: the best defense against bias, in which each individual is given a fair, random chance of selection.
- Sample size: the number of individuals in a sample. The sample size, not the fraction of the population sampled, determines how well the sample represents the population.
- Census: a sample that consists of the entire population.
- Population parameter: a numerically valued attribute of a model for a population.

Vocabulary

- Statistic: a value calculated from sampled data.
- Representative: a sample is representative if the statistics computed from it accurately reflect the corresponding population parameters.
- Simple random sample (SRS): a sample of size n in which each set of n elements in the population has an equal chance of selection.
- Sampling frame: a list of individuals from which the sample is drawn.
- Sampling variability: the natural tendency of randomly drawn samples to differ from one another.
- Stratified random sampling: the population is divided into subpopulations, or strata, and random samples are then drawn from each stratum. It works best when each stratum is homogeneous but the strata differ from each other.
- Cluster sampling: entire groups, or clusters, are chosen at random, usually as a matter of convenience, practicality, or cost. Each cluster should be representative of the population, so clusters should be heterogeneous within and similar to each other.
- Multistage sampling: a sampling scheme that combines several different sampling methods.
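The three random designs defined above can be compared side by side in a few lines of Python. This is only a sketch under stated assumptions: the frame of 40 students, the `homerooms` grouping, and the function names are hypothetical, and the stratified draw takes the same number from every stratum, which is one common choice rather than a requirement. The same homerooms stand in for both strata and clusters purely to keep the example small; in practice the choice depends on whether the groups are homogeneous (strata) or each resemble the whole population (clusters).

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS: every set of n individuals in the frame is equally likely."""
    return rng.sample(frame, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate SRS of size n_per_stratum from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(12)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 40 students in 4 homerooms of 10 each.
frame = [f"student_{i:02d}" for i in range(40)]
homerooms = {f"room_{r}": frame[r * 10:(r + 1) * 10] for r in range(4)}

srs = simple_random_sample(frame, 8, rng)    # 8 students from anywhere
strat = stratified_sample(homerooms, 2, rng)  # 2 students from every room
clust = cluster_sample(homerooms, 1, rng)     # everyone in 1 random room

print(srs)
print(strat)
print(clust)
```

Note how the stratified sample is guaranteed to include every homeroom, while the SRS and the cluster sample are not.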

Vocabulary

- Systematic sample: individuals are selected systematically from a sampling frame, for example every 10th person. The starting point must be chosen at random.
- Pilot: a small trial run of a survey.
- Voluntary response bias: bias that arises when individuals can choose on their own whether or not to participate in the sample.
- Convenience sample: a sample taken from individuals who are conveniently available.
- Undercoverage: part of the population is given less representation in the sample than it has in the population.
- Nonresponse bias: a large fraction of those sampled fail to respond, and those who do respond are not likely to represent the whole population; for example, a telephone survey.
- Response bias: the wording of a question influences the respondent's answer; for example, "How do you feel about the cost cuts to local zoos that are making animals starve to death?"
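The "every 10th person, random start" rule for a systematic sample can be sketched as follows. The frame of 100 numbered individuals and the function name `systematic_sample` are assumptions made for illustration.

```python
import random

def systematic_sample(frame, k, rng):
    """Take every k-th individual, starting from a random position
    among the first k entries of the frame."""
    start = rng.randrange(k)  # random start: index 0 through k - 1
    return frame[start::k]

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 100 people numbered 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, rng)
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is determined, which is why the order of the frame must not be related to the responses being measured.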

Formulas

There are no formulas for this chapter because this unit covers surveys, and thus math is not required.

Concepts

- Representative samples can offer us important insights about a population. What matters is the size of the sample, not the fraction of the population sampled.
- A simple random sample (SRS) is the standard. Every individual, and every possible sample of a given size, has an equal chance of being selected.
- Stratified samples reduce sampling variability by identifying homogeneous subgroups and then sampling randomly within each one.
- Cluster samples randomly select among heterogeneous subgroups that each resemble the population, only smaller.
- Systematic samples can be less expensive than other methods and can work in some situations, but the starting point should still be chosen at random.
- Multistage samples combine several of these methods.
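The claim that sample size matters, and not the fraction of the population sampled, can be checked with a small simulation. This is a rough sketch, not part of the original slides: the population sizes, the 60% "yes" rate, and the function name are all made-up assumptions. It draws many simple random samples and measures how much the sample proportions vary from one sample to the next.

```python
import random
import statistics

def spread_of_sample_proportions(pop_size, n, rng, reps=300, p=0.6):
    """Draw `reps` simple random samples of size n from a population in
    which a fraction p answers yes (1) and the rest answer no (0), and
    return the standard deviation of the sample proportions."""
    n_yes = round(pop_size * p)
    population = [1] * n_yes + [0] * (pop_size - n_yes)
    props = [sum(rng.sample(population, n)) / n for _ in range(reps)]
    return statistics.pstdev(props)

rng = random.Random(12)  # fixed seed so the sketch is repeatable

# Same sample size (100), very different sampled fractions (1% vs 0.01%).
small_town = spread_of_sample_proportions(10_000, 100, rng)
big_city = spread_of_sample_proportions(1_000_000, 100, rng)

# Same big population, but a sample ten times larger.
big_sample = spread_of_sample_proportions(1_000_000, 1_000, rng)

print(f"n=100  from 10,000:     spread {small_town:.3f}")
print(f"n=100  from 1,000,000:  spread {big_city:.3f}")
print(f"n=1000 from 1,000,000:  spread {big_sample:.3f}")
```

The first two spreads should come out about the same even though one sample covers a hundred times larger a fraction of its population, while the larger sample gives a clearly smaller spread.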

Concepts

Bias can destroy our insights through poor sampling methods.

- Nonresponse bias arises when sampled individuals do not or cannot respond.
- Response bias arises when respondents' answers are influenced by question wording or interviewer behavior.
- Voluntary response samples are almost always biased and should be avoided.
- Convenience samples are likely to be flawed for similar reasons.
- Even with a reasonable design, sampling frames may not be representative. Undercoverage occurs when part of the population is left out of the frame or is sampled less often than it should be.

Always report your sampling method and any known sources of bias when performing a survey so that others can evaluate your results for fairness and accuracy.
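A short simulation can illustrate why bias is so hard to recover from: sampling more people from a frame that undercovers the population does not help. Everything in this sketch is a made-up assumption for illustration, including the 20,000-person population, the 40% who can be reached by a daytime call, and the different support rates in the two groups.

```python
import random

rng = random.Random(21)  # fixed seed so the sketch is repeatable

# Hypothetical population of (reachable_by_daytime_call, supports_measure).
# Assumption: people who are home during the day hold different opinions.
population = []
for _ in range(20_000):
    home = rng.random() < 0.4
    supports = rng.random() < (0.7 if home else 0.4)
    population.append((home, supports))

def share_supporting(people):
    """Proportion of the given people who support the measure."""
    return sum(supports for _, supports in people) / len(people)

truth = share_supporting(population)

# Undercoverage: the daytime-call frame contains only people who are home.
reachable = [person for person in population if person[0]]

srs_estimate = share_supporting(rng.sample(population, 500))
biased_small = share_supporting(rng.sample(reachable, 500))
biased_large = share_supporting(rng.sample(reachable, 5_000))

print(f"true proportion:          {truth:.3f}")
print(f"SRS of 500:               {srs_estimate:.3f}")
print(f"daytime calls, n = 500:   {biased_small:.3f}")
print(f"daytime calls, n = 5000:  {biased_large:.3f}")
```

The simple random sample of 500 should land near the true proportion, while both daytime-call samples overshoot it by a similar amount; the tenfold larger sample only estimates the wrong number more precisely.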

Problem Example #21

Question: Examine each of the following questions for possible bias. If you think the question is biased, indicate how and propose a better question.

a) Should companies that pollute the environment be compelled to pay the costs of cleanup?
b) Given that 18-year-olds are old enough to vote and to serve in the military, is it fair to set the drinking age at 21?

Answers

a) There’s a bias towards yes because of the word ‘pollute’. A better question would be ‘Should companies be responsible for any costs of environmental clean-up?’
b) There’s a bias towards no because of ‘old enough to serve in the military’. ‘Do you think the drinking age should be lowered from 21?’ is a better survey question.

Problem Example #23

Question: Anytime we conduct a survey we must take care to avoid undercoverage. Suppose we plan to select 500 names from the city phone book, call their homes between noon and 4 p.m., and interview whoever answers, anticipating contacts with at least 200 people.

a) Why is it difficult to use a simple random sample here?
b) Describe a more convenient, but still random, sampling strategy.
c) What kinds of households are likely to be included in the eventual sample of opinion? Who will be excluded?
d) Suppose, instead, that we continue calling each number, perhaps in the morning or evening, until an adult is contacted and interviewed. How does this improve the sampling design?
e) Random digit dialing machines can generate the phone calls for us. How would this improve our design? Is anyone still excluded?

Answers

a) People with unlisted numbers, people without phones, and those at work cannot be reached. As a result, not everyone has an equal chance of being selected.
b) We can generate random numbers and call at random times to give everyone a more equal chance.
c) Households that have someone at home during the afternoon are more likely to be included under the original plan.
d) Many more people could be included under the second plan. However, people without phones are still excluded.
e) This design randomizes the phone numbers, but time of day is still an issue, and people without phones are still excluded.

Thank you.
