Sample Surveys AP Statistics Chapter 12. Learning Goals 1.Population versus Sample 2.Sample Surveys 3.Generalizing Results 4.Sampling Frame & Sampling.

Sample Surveys AP Statistics Chapter 12

Learning Goals 1.Population versus Sample 2.Sample Surveys 3.Generalizing Results 4.Sampling Frame & Sampling Design 5.Simple Random Sample (SRS) 6.Convenience Samples 7.Other Random Sampling Designs 8.Types of Bias 9.The Valid Survey

POPULATION VERSUS SAMPLE Learning Goal #1

Learning Goal #1: Background We have learned ways to display, describe, and summarize data, but have been limited to examining the particular batch of data we have. To make decisions, we need to go beyond the data at hand and to the world at large. Let’s investigate three major ideas that will allow us to make this stretch… 1.Examine a part of the whole. 2.Randomize 3.It’s the sample size

Learning Goal #1: Idea 1: Examine a Part of the Whole The first idea is to draw a sample. – We’d like to know about an entire population of individuals, but examining all of them is usually impractical, if not impossible. – We settle for examining a smaller group of individuals—a sample—selected from the population. – Sampling is a natural thing to do. Think about sampling something you are cooking—you taste (examine) a small part of what you’re cooking to get an idea about the dish as a whole.

Learning Goal #1: Population and Sample Population: The collection of all individuals or items under consideration in a statistical study. – The population is determined by what we want to know. Sample: That part of the population from which information is obtained. – The sample is determined by what is practical and should be representative of the population.

Population Population: all the subjects of interest – We use statistics to learn about the population, the entire group of interest Sample: subset of the population – Data is collected for the sample because we cannot typically measure all subjects in the population Learning Goal #1: Population and Sample Sample

Learning Goal #1: Example: Population vs. Sample If we have data on all the individuals who have climbed Mt. Everest, then we have population data. On the other hand, if our data come from some of the climbers, we have sample data.

Learning Goal #1: Data Collection The quality of the results obtained from any statistical method is only as good as the data used. – The reliability and accuracy of the data affect the validity of the results of a statistical analysis. – The reliability and accuracy of the data depend on the method of collection Conclusion: “GARBAGE IN, MEANS GARBAGE OUT” Sampling methods that, by their nature, tend to over- or underemphasize some characterstics of the population are said to be biased.

Learning Goal #1: Idea 2: Randomize Randomization can protect you against factors that you know are in the data. – It can also help protect against factors you are not even aware of. Randomizing protects us from the influences of all the features of our population, even ones that we may not have thought about. – Randomizing makes sure that on the average the sample looks like the rest of the population.

Learning Goal #1: Randomizing Not only does randomizing protect us from bias, it actually makes it possible for us to draw inferences about the population when we see only a sample. Such inferences are among the most powerful things we can do with Statistics. But remember, it’s all made possible because we deliberately choose things randomly.

Learning Goal #1: Idea 3: It’s the Sample Size How large a random sample do we need for the sample to be reasonably representative of the population? It’s the size of the sample, not the size of the population, that makes the difference in sampling. – Exception: If the population is small enough and the sample is more than 10% of the whole population, the population size can matter. The fraction of the population that you’ve sampled doesn’t matter. It’s the sample size itself that’s important.

Learning Goal #1: Sample Size Sample Size – Is the number of individuals selected from our population. – The size of the population does not dictate the size of the sample. – A sample of size 100 may work equally well for a population of 1000 or 10,000 as long as it is a random sample of the population of interest. Example: A ladle of soup gives us the same information regarding the seasoning of the soup regardless of the size of the pot it is taken from as long as the pot is well stirred (random samples). – The general rule is that the sample size should be no more than 10% of the population size.

SAMPLE SURVEYS Learning Goal #2

Learning Goal #2: Sample Surveys Opinion polls are examples of sample surveys, designed to ask questions of a small group of people in the hope of learning something about the entire population. – Professional pollsters work quite hard to ensure that the sample they take is representative of the population. – If not, the sample can give misleading information about the population.

Learning Goal #2: Sample Surveys Sample Surveys solicit information from individuals. Types of sample surveys; opinion polls by – Personal interview – Telephone interview – Questionnaire by mail or internet

Learning Goal #2: Sample Survey A sample survey selects a sample of people from a population and interviews them to collect data. A census is a survey that attempts to count the number of people in the population and to measure certain characteristics about them

Learning Goal #2: Census Why bother determining the right sample size? Wouldn’t it be better to just include everyone and “sample” the entire population? – Such a special sample is called a census. – Often includes a collection of related demographic information (age, race, gender, occupation, income, etc.). Definition: A sample that consists of the entire population (tries to count every individual). Example: US census – an official, periodic (every 10 years) inventory of the entire population of the US.

Learning Goal #2: Census Problems with taking a census: – It can be difficult to complete a census—there always seem to be some individuals who are hard (or expensive) to locate or hard to measure; or it may be impractical - food. – Populations rarely stand still. Even if you could take a census, the population changes while you work, so it’s never possible to get a perfect measure. – Taking a census may be more complex than sampling.

Learning Goal #2: Populations and Parameters Models use mathematics to represent reality. – Parameters are the key numbers in those models. A parameter that is part of a model for a population is called a population parameter. Rarely know the true value of a population parameter; we estimate it from sampled data. We use data to estimate population parameters. – Any summary found from the data (sample) is a statistic. – The statistics that estimate population parameters are called sample statistics.

Learning Goal #2: Population versus Sample Sample: The part of the population we actually examine and for which we do have data How well the sample represents the population depends on the sample design. A statistic is a number describing a characteristic of a sample.  Population: The entire group of individuals in which we are interested but can’t usually assess directly Example: All humans, all working- age people in California, all crickets  A parameter is a number describing a characteristic of the population. Population Sample

Learning Goal #2: Populations and Parameters

Learning Goal #2: Notation We typically use Greek letters to denote parameters and Latin letters to denote statistics.

GENERALIZING RESULTS Learning Goal #3

Learning Goal #3: Generalizing Results Recall that the goal of sampling ask questions of a small group of people, the sample, in the hope of learning something about the entire population. However, care should be taken to generalize the results of a sample only to the population that is represented by the sample.

SAMPLING FRAME & SAMPLING DESIGN Learning Goal #4

Learning Goal #4: Sampling Frame – The sampling frame is a list of individuals from which the sample is drawn. – If the sampling frame is not equal to the population of interest and is different from the population in some way that may affect the response variable, the sample will be biased. – Example: If we are interested in obtaining information about H.S. students in Florida but obtain our sample of students from a list of private schools, then our sampling frame is not reflective of the population of interest nor is our sample.

Learning Goal #4: Sampling Frame & Sampling Design The sampling frame is the list of subjects in the population from which the sample is taken, ideally it lists the entire population of interest. The sampling design determines how the sample is selected. Ideally, it should give each subject an equal chance of being selected to be in the sample

Learning Goal #4: Sampling Design The sampling design is the method used to chose the sample. All statistical sampling designs incorporate the idea that chance (randomness), rather than choice, is used to select the sample. The value of deliberately introducing randomness is one of the great insights of Statistics. Randomizing protects us from the influences of all the features of our population, even ones that we may not have thought about. It does that by making sure that on the average the sample looks like the rest of the population.

SIMPLE RANDOM SAMPLE (SRS) Learning Goal #5:

Learning Goal #5: Simple Random Sample (SRS) We draw samples because we can’t work with the entire population. – We need to be sure that the statistics we compute from the sample reflect the corresponding parameters accurately. – A sample that does this is said to be representative.

Learning Goal #5: Simple Random Sample (SRS) We will insist that every possible sample of the size we plan to draw has an equal chance to be selected. And such samples must also guarantee that each individual has an equal chance of being selected. A sample drawn in this way is called a Simple Random Sample (SRS). An SRS is the standard against which we measure other sampling methods, and the sampling method on which the theory of working with sampled data is based.

Learning Goal #5: Simple Random Sample (SRS) Requirements for Simple Random Sample (SRS): 1)Every sample of size n from the population has an equal chance of being selected and 2)Every member of the population has an equal chance of being included in the sample. The preferred method – probability is the highest that the sample is representative of the population than for any other sampling method. Least chance of sample bias.

Learning Goal #5: Simple Random Sample (SRS) Random Sampling is the best way of obtaining a sample that is representative of the population.

Learning Goal #5: SRS Example Two club officers are to be chosen for a New Orleans trip There are 5 officers: President, Vice-President, Secretary, Treasurer and Activity Coordinator The 10 possible samples are: (P,V) (P,S) (P,T) (P,A) (V,S) (V,T) (V,A) (S,T) (S,A) (T,A) For a SRS, each of the ten possible samples has an equal chance of being selected. Thus, each sample has a 1 in 10 chance of being selected and each officer has a 1 in 4 chance of being selected.

Learning Goal #5: Methods of SRS 1)Place names (population) in a hat and draw out a handful (sample). 2)Computer/TI-84 software. 3)Table of random digits – A long string of the digits 0,1,2,3,4,5,6,7,8,9 with these two properties 1.Each entry in the table is equally likely to be any of the ten digits 0 through 9. 2.The entries are independent of each other, that is, knowledge of one part of the table gives no information about any other part.

Leaning Goal #5: Using Random Numbers to select a SR S To select a simple random sample – Number the subjects in the sampling frame using numbers of the same length (number of digits). – Select numbers of that length from a table of random numbers or using a random number generator. – Include in the sample those subjects having numbers equal to the random numbers selected.

We need to select a random sample of 5 from a class of 20 students. 1)List and number all members of the population, which is the class of 20. 2)The number 20 is two-digits long. 3)Parse the list of random digits into numbers that are two digits long. Here we chose to start with line 103, for no particular reason. Learning Goal #5: Example – Choosing a SRS 22 36 84 65 73 25 59 58 53 93 30 99 58 91 98 27 98 25 34 02

1 Alison 2 Amy 3 Brigitte 4 Darwin 5 Emily 6 Fernando 7 George 8 Harry 9 Henry 10 John 11 Kate 12 Max 13 Moe 14 Nancy 15 Ned 16 Paul 17 Ramon 18 Rupert 19 Tom 20 Victoria Remember that 1 is 01, 2 is 02, etc. If you were to hit 09 again before getting five people, don’t sample Ramon twice—you just keep going. 4)Choose a random sample of size 5 by reading through the list of two-digit random numbers, starting with line 2 and on. 5)The first five random numbers matching numbers assigned to people make the SRS. 22 36 84 65 73 25 59 58 53 93 30 99 58 91 98 27 98 25 34 02 The first individual selected is Amy, number 02. That’s it from line 2. Move to line 3 Then Moe (13), Darwin, (04), Henry (09), and Net (15) 24 13 04 83 60 22 52 79 72 65 76 39 36 48 09 15 17 92 48 30

Learning Goal #5: SRS Example Use a random digit table to pick a random sample of 30 cars from a population of 500 cars. 1)Label - Assign each car a different number from 001 to 500 (3 digit group). 2)Table – Enter Table B on line 108 (can begin anywhere) and regroup the digits in groups of 3 (because our labels have 3 digits). Then select the sample.

Learning Goal #5: SRS Example 10860940 72024 17868 24943 61790 90656 87964 18883 10936009 19365 15412 39638 85453 46816 83485 41979 – 609407202417868249436179 – 090656879641888336009193 – 651 541239638854534681683 Select the first 30 digit groups that are within the range of your labels to make up the SRS. SRS – 407, 202, 417, 249, 436, 179, 090, 336, 009, 193, 239, etc.

Learning Goal #5: Your Turn Suppose 80 students are taking an AP Statistics course and the teacher wants to randomly pick out a sample of 10 students to try out a practice exam. Select a SRS of 10 students. Solve – use the following Random Digit Table beginning at line 108. 10860940 72024 17868 24943 61790 90656 87964 10918883 36009 19365 15412 39638 85453 46816

Learning Goal #5: TI-84 Random Digits Use RANDINT function ( MATH/PRB/5:RANDINT ) RANDINT (lower limit, upper limit, number of digits) – RANDINT(0,9,5) – generates 5 random integers between 0-9. – RANDINT(1,6) – generates a random integers between 1- 6, simulates rolling die. – RANDINT(0,99) – generates two digit number from 00-99 each time the return key is pressed. 124 RAND, sets TI-84 to the same random digits.

Learning Goal #5: Simple Random Samples Samples drawn at random generally differ from one another. – Each draw of random numbers selects different people for our sample. – These differences lead to different values for the variables we measure. – We call these sample-to-sample differences sampling variability.

Learning Goal #5: Sampling Variability Sampling Variability – Is the natural tendency of randomly drawn samples to differ, one from another. – Sampling variability is not an error, just the natural result of random sampling. – Statistics attempts to minimize, control, and understand variability so that informed decisions can drawn from the data despite their variation. – Although samples vary, when we use chance to select them, they do not vary haphazardly but rather according to the laws of probability.

Learning Goal #5: Example: Sample Variability Each of four major news organizations surveys likely voters and separately reports that the percentage favoring the incumbent candidate is 53.5%, 54.1%, 52%, and 54.2%, respectively. What is the correct percentage? Did three or more of the news organizations make a mistake?

Learning Goal #5: Solution There is no way of knowing the correct population percentage from the information given. The four surveys led to four statistics, each an estimate of the population parameter. No one made a mistake unless there was a bad survey. Sampling variation is natural.

CONVENIENCE SAMPLES Learning Goal #6

Learning Goal #6: Convenience Samples: Poor Ways to Sample Convenience Sample: a type of survey sample that is easy to obtain, exactly as its name suggests, by sampling individuals who are conveniently available. – Unlikely to be representative of the population – Often severe biases result from such a sample – Results apply ONLY to the observed subjects – The classic example of a convenience sample is standing at a shopping mall and selecting shoppers as they walk by to fill out a survey.

Learning Goal #6: Convenience Samples: Poor Ways to Sample The classic example of a convenience sample is standing at a shopping mall and selecting shoppers as they walk by to fill out a survey. Internet convenience samples are worthless, they have no well-defined sampling frame and thus report no useful information.

Just ask whoever is around. – Example: “Man on the street” survey (cheap, convenient, often quite opinionated or emotional → now very popular with TV “journalism”) Which men, and on which street? – Ask about gun control or legalizing marijuana “on the street” in Berkeley, CA and in some small town in Idaho and you would probably get totally different answers. – Even within an area, answers would probably differ if you did the survey outside a high school or a country-western bar. Bias: Opinions limited to individuals present Learning Goal #6: Convenience Samples: Poor Ways to Sample

Learning Goal #6: Voluntary Response Sample Voluntary Response Sample: most common form of a convenience sample. – Subjects volunteer for the sample, a large group of individuals are invited to respond and all who do respond are counted. – The sample is not representative, volunteers do not tend to be representative of the entire population. – Results in voluntary response bias (which will be discussed later) which invalidates the survey.

Individuals choose to be involved. These samples are very susceptible to being biased because different people are motivated to respond or not. They are often called “public opinion polls” and are not considered valid or scientific. Bias: Sample design systematically favors a particular outcome. Ann Landers summarizing responses of readers: Seventy percent of (10,000) parents wrote in to say that having kids was not worth it—if they had to do it over again, they wouldn’t. Bias: Most letters to newspapers are written by disgruntled people. A random sample showed that 91% of parents WOULD have kids again. Learning Goal #6: Voluntary Response Sample

OTHER RANDOM SAMPLING DESIGNS Learning Goal #7

Learning Goal #7: Other Random Sampling Designs 1)Stratified Random Sample 2)Cluster Random Sample 3)Multistage Sampling 4)Systematic Sampling

Learning Goal #7: Other Random Sampling Designs It is not always possible to conduct a SRS so it is necessary to have well designed, informative sampling methods that are not a SRS, e.g., sample methods that use randomization. Simple random sampling is not the only fair way to sample. More complicated designs may save time or money or help avoid sampling problems. All statistical sampling designs have in common the idea that chance (randomization), rather than human choice, is used to select the sample.

Learning Goal #7: Stratified Random Sample The population is first sliced into homogeneous groups, called strata, before the sample is selected. These strata are made up of individuals similar in some way that may affect the response variable. Then simple random sampling is used within each stratum before the results are combined. This common sampling design is called stratified random sampling.

Learning Goal #7: Stratified Random Sample With this procedure we can acquire information about – the whole population – each stratum – the relationships among strata. Examples of strata Sex Male Female Age under 20 20-30 31-40 41-50 Occupation professional clerical blue-collar

Learning Goal #7: Stratified Random Sample Stratified Random Sample Steps – Divide the population into separate homogeneous groups, called strata. – Select a simple random sample from each strata. – Combine the samples from all strata to form complete sample.

Learning Goal #7: Stratified Random Sampling Procedure

Learning Goal #7: Stratified Random Sample

Learning Objective 1: Stratified Random Sample - Example Suppose a university has the following student demographics: Undergraduate Graduate First Professional Special 55% 20% 5% 20% In order to insure proper coverage of each demographic, a stratified random sample of 100 students could be chosen as follows: select a SRS of 55 undergraduates, a SRS of 20 graduates, a SRS of 5 first professional students, and a SRS of 20 special students; combine these 100 students.

Learning Goal #7: Stratified Random Sample The most important benefit is Stratifying can reduce the variability of our results. – When we restrict by strata, additional samples are more like one another, so statistics calculated for the sampled values will vary less from one sample to another. Stratified random sampling can reduce bias. Stratified sampling can also help us notice important differences among groups.

Learning Goal #7: Stratified Random Sample Stratified Random Sample Advantage is that you can include in your sample enough subjects in each stratum you want to evaluate. Disadvantage is that you must have a sampling frame and know the stratum into which each subject belongs.

Learning Goal #7: Cluster Random Sample Sometimes stratifying isn’t practical and simple random sampling is difficult. Splitting the population into similar parts or clusters can make sampling more practical. – Then we could select one or a few clusters at random and perform a census within each of them. – This sampling design is called cluster random sampling. – If each cluster fairly represents the full population, cluster sampling will give us an unbiased sample.

Learning Goal #7: Cluster Random Sample Cluster Random Sample Steps – Divide the population into a large number of clusters, such as city blocks – Select a simple random sample of the clusters – Use the subjects in those clusters as the sample

Learning Goal #7: Cluster Random Sample Summary – Cluster Sampling – Divide the population into heterogeneous groups called clusters. – Take an SRS of some of the clusters. – Every member of the cluster is included in the sample. Usually used to reduce the cost of obtaining a sample. Extensively used by government agencies and certain private research organizations.

Learning Goal #7: Cluster Random Sample Example: In conducting a survey of school children in a large city, we could first randomly select 5 schools and then include all the children from each selected school. Although cluster sampling can save time and money, it does have disadvantages. – Ideally, each cluster should mirror the entire population. However, that is often not the case, as members of a cluster are frequently more homogeneous than the members of the population as a whole.

Learning Goal #7: Cluster Random Sample - Procedure

Learning Goal #7: Cluster Random Sample

Cluster sampling is not the same as stratified sampling. – We stratify to ensure that our sample represents different groups in the population, and sample randomly within each stratum. Strata are internally homogeneous, but differ from one another. – Clusters are more or less alike, are internally heterogeneous and each resembling the overall population. We select clusters to make sampling more practical or affordable.

Learning Goal #7: Cluster Random Sample Cluster Random Sample Preferable when – A reliable sampling frame is unavailable. – The cost of selecting a SRS is excessive. Disadvantage – Usually need a larger sample size than with a SRS in order to achieve a similar results.

Learning Goal #7: Comparing Random Sampling Methods

Learning Goal #7: Multistage Sample Sometimes we use a variety of sampling methods together. Sampling schemes that combine several methods are called multistage samples. Most surveys conducted by professional polling organizations use some combination of stratified and cluster sampling as well as simple random sampling.

Learning Goal #7: Multistage Sample Example: A national polling service may stratify the country by geographical regions, select a random sample of cities from each region, and then interview a cluster of residents in each city. Multistage samples may also use multiple stages of stratification. They are often used by the government to obtain information about the U.S. population. Example: Sampling both urban and rural areas, people in different ethnic and income groups within the urban and rural areas, and then individuals of different ethnicities within those strata. Data are obtained by taking an SRS for each substrata.

Learning Goal #7: Systematic Sample Sometimes we draw a sample by selecting individuals systematically. – For example, you might survey every 10th person on an alphabetical list of students. To make it random, you must still start the systematic selection from a randomly selected individual. When there is no reason to believe that the order of the list could be associated in any way with the responses sought, a systematic sample can give a representative sample.

Learning Goal #7: Systematic Sample Method of sampling in which the sample is selected in some predetermined way. For example, we may obtain a list of our population of interest and from that list choose every fifth individual to be part of the sample. Although each individual has an equal chance of being chosen, this method is not a SRS because each possible sample of size n individuals does not have an equal chance of being chosen.

Learning Goal #7: Systematic Sample Example: If we are choosing a sample of 30 students from the 300 students in the senior class by selecting every 10 th student from the alphabetical directory, the first 30 students on the list will never all be chosen as the sample group. Easier to execute than SRS. Usually provides results comparable to SRS.

Learning Goal #7: Systematic Sample - Procedure

Learning Goal #7: Systematic Sample

Systematic sampling can be much less expensive than true random sampling. When you use a systematic sample, you need to justify the assumption that the systematic method is not associated with any of the measured variables.

TYPES OF BIAS Learning Goal #8

Learning Goal #8: Types of Bias 1)Undercoverage Bias 2)Voluntary Response Bias 3)Nonresponse Bias 4)Response Bias

Learning Goal #8: Bias Bias: Tendency to systematically favor certain parts of the population over others. Any systematic failure of a sample to represent its population. Sampling methods that, by their nature, tend to over- or underemphasize some characteristics of the population are said to be biased. – Bias is the bane of sampling—the one thing above all to avoid. – There is usually no way to fix a biased sample and no way to salvage useful information from it. The best way to avoid bias is to select individuals for the sample at random. – The value of deliberately introducing randomness is one of the great insights of Statistics. A Large Sample Does Not Guarantee An Unbiased Sample!

Learning Goal #8: Bias

Learning Goal #8: Undercoverage A sampling scheme that fails to sample part of the population or that gives a part of the population less representation than it has in the population suffers from undercoverage. – A classic example of undercoverage is the Literary Digest voter survey, which predicted that Alfred Landon would beat Franklin Roosevelt in the 1936 presidential election. The survey sample suffered from undercoverage of low- income voters, who tended to be Democrats. Undercoverage is often a problem with convenience samples.

Learning Goal #8: Undercoverage Example: Literary Digest Poll 1936 presidential election Literary Digest magazine poll. The survey team asked a sample of the voting population whether they would vote for Franklin D. Roosevelt, the democratic candidate or Alfred Landon, the republican candidate. Based on the results, the magazine predicted an easy win for Landon.

Learning Goal #8: Undercoverage Example – Result When the actual results were in, Roosevelt won by a landslide. What happened? – The sample was obtained from among people who owned a car or had a telephone. In 1936, that group included mostly rich people and they historically voted republican. – The response rate was low, less than 25% of those polled responded. A disproportionate number of those responding were Landon supporters. Whatever the reason for the poll’s failure, the sample was not representative of the population.

Undercoverage occurs when parts of the population are left out in the process of choosing the sample. Because the U.S. Census goes “house to house,” homeless people are not represented. Illegal immigrants also avoid being counted. Geographical districts with a lot of undercoverage tend to be poor ones. Representatives from richer areas typically strongly oppose statistical adjustment of the census. Historically, clinical trials have avoided including women in their studies because of their periods and the chance of pregnancy. This means that medical treatments were not appropriately tested for women. This problem is slowly being recognized and addressed. Learning Goal #8: Undercoverage

Learning Goal #8: Voluntary Response Bias When choice rather than randomization is used to obtain a sample, the sample suffers from voluntary response bias. Voluntary response bias occurs when sample members are self-selected volunteers (voluntary response sample). – An example would be call-in radio shows that solicit audience participation in surveys on controversial topics (abortion, affirmative action, gun control, etc.). – The resulting sample tends to over represent individuals who have strong opinions, either for or against the topic.

Learning Goal #8: Nonresponse Bias Occurs in a sample design when individuals selected for the sample fail to respond, cannot be contacted, or decline to participate. No survey succeeds in getting responses from everyone. The problem is that those who don’t respond may differ from those who do. – A common problem with mail surveys. Response rate is often low (5% - 30%), making mail surveys vulnerable to nonresponse bias.

Learning Goal #8: Response Bias Anything in a survey that influences responses falls under the heading of response bias. Examples are biased wording of survey questions, lack of privacy while being surveyed, and appearance of the interviewer. Both Question Bias and Interviewer Bias are examples of response bias.

Learning Goal #8: Response Bias - Question Bias Wording of the questions or the questions themselves lead to bias. People often don’t want to be perceived as having unpopular or unsavory views and so may not respond truthfully. – Example: Given that the threat of nuclear war is higher now than it has ever been in human history, and the fact that a nuclear war poses a threat to the very existence of the human race, would you favor an all-out nuclear test ban? – Question is biased in favor of a nuclear test ban.

Learning Goal #8: Response Bias - Question Bias

Learning Goal #8: Response Bias - Interviewer Bias The sex, age, race, dress, attitude, or actions of the interviewer and how the interviewer asks the questions have an influence on the way a subject responds. – Example: A male interviewer asking sex related questions to women. To prevent this, interviewers must be trained to remain neutral throughout the interview. They must also pay close attention to the way they ask each question. If an interviewer changes the way a question is worded, it may impact the respondent's answer.

THE VALID SURVEY Learning Goal #9

Learning Goal #9: Key Parts of a Sample Survey Identify the population of all subjects of interest Construct a sampling frame which attempts to list all subjects in the population Use a random sampling design to select n subjects from the sampling frame Be cautious of sampling bias due to nonrandom samples We can make inferences about the population of interest when sample surveys that use random sampling are employed.

Learning Goal #9: The Valid Survey It isn’t sufficient to just draw a sample and start asking questions. A valid survey yields the information we are seeking about the population we are interested in. Before you set out to survey, ask yourself: – What do I want to know? – Am I asking the right respondents? – Am I asking the right questions? – What would I do with the answers if I had them; would they address the things I want to know?

These questions may sound obvious, but there are a number of pitfalls to avoid. Know what you want to know. – Understand what you hope to learn and from whom you hope to learn it. Use the right frame. – Be sure you have a suitable sampling frame. Tune your instrument. – The survey instrument itself can be the source of errors - too long yields less responses. Learning Goal #9: The Valid Survey

Ask specific rather than general questions. Ask for quantitative results when possible. Be careful in phrasing questions. – A respondent may not understand the question or may understand the question differently than the way the researcher intended it. Even subtle differences in phrasing can make a difference. Learning Goal #9: The Valid Survey

Be careful in phrasing answers. – It’s often a better idea to offer choices rather than inviting a free response. The best way to protect a survey from unanticipated measurement errors is to perform a pilot survey. – A pilot is a trial run of a survey you eventually plan to give to a larger group.

Learning Goal #9: Valid Survey – How to Think About Bias Look for bias in any survey you encounter before you collect the data—there’s no way to recover from a biased sample of a survey that asks biased questions. Spend your time and resources reducing bias. If you possibly can, pilot-test your survey. Always report your sampling methods in detail.

What have we learned? A representative sample can offer us important insights about populations. – It’s the size of the sample, not its fraction of the larger population, that determines the precision of the statistics it yields. There are several ways to draw samples, all based on the power of randomness to make them representative of the population of interest: – Simple Random Sample, Stratified Sample, Cluster Sample, Systematic Sample, Multistage Sample

What have we learned? Bias can destroy our ability to gain insights from our sample: – Nonresponse bias can arise when sampled individuals will not or cannot respond. – Response bias arises when respondents’ answers might be affected by external influences, such as question wording or interviewer behavior.

What have we learned? Bias can also arise from poor sampling methods: – Voluntary response samples are almost always biased and should be avoided and distrusted. – Convenience samples are likely to be flawed for similar reasons. – Even with a reasonable design, sample frames may not be representative. Undercoverage occurs when individuals from a subgroup of the population are selected less often than they should be.

What have we learned? Finally, we must look for biases in any survey we find and be sure to report our methods whenever we perform a survey so that others can evaluate the fairness and accuracy of our results.

Sample Surveys AP Statistics Chapter 12. Learning Goals 1.Population versus Sample 2.Sample Surveys 3.Generalizing Results 4.Sampling Frame & Sampling.

Similar presentations

Presentation on theme: "Sample Surveys AP Statistics Chapter 12. Learning Goals 1.Population versus Sample 2.Sample Surveys 3.Generalizing Results 4.Sampling Frame & Sampling."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Sample Surveys AP Statistics Chapter 12. Learning Goals 1.Population versus Sample 2.Sample Surveys 3.Generalizing Results 4.Sampling Frame & Sampling.

Similar presentations

Presentation on theme: "Sample Surveys AP Statistics Chapter 12. Learning Goals 1.Population versus Sample 2.Sample Surveys 3.Generalizing Results 4.Sampling Frame & Sampling."— Presentation transcript:

Similar presentations

About project

Feedback